A UNITED STATES DEPARTMENT OF COMMERCE PUBLICATION

Geographic Base File System - Uses, Maintenance, Problem Solving

Computerized Geographic Coding Series GE 60 No. 3

Conference Proceedings
November 16-17, 1971
Arlington, Tex.

Issued May 1972

U.S. DEPARTMENT OF COMMERCE
Peter G. Peterson, Secretary
James T. Lynn, Under Secretary
Harold C. Passer, Assistant Secretary for Economic Affairs and Administrator, Social and Economic Statistics Administration

BUREAU OF THE CENSUS
George Hay Brown, Director
Robert L. Hagan, Acting Deputy Director
Paul R. Squires, Associate Director for Data Collection and Processing
Morton A. Meyer, Chief, Geography Division

This report was prepared in the Geography Division under the supervision of Gerald J. Post, Assistant Chief for Planning and Procedures. Jacob Silver was instrumental in organizing the conference and in compiling and editing these proceedings. The Census Bureau is extremely grateful to the North Central Texas Council of Governments for hosting this conference, and to Robert L.
Wegner, Director of Regional Planning, Barbara Kaplan, Coordinator of Housing Programs, and Joel Wooldridge, Regional Planning Associate, for their fine cooperation.

Library of Congress No. 74-176277

SUGGESTED CITATION
U.S. Bureau of the Census, Geographic Base File System - Uses, Maintenance, Problem Solving, Report GE 60 No. 3, Washington, D.C. 1972.

For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402. Price $1.25. Stock Number 0000-0000.

PREFACE

This report presents the proceedings of the third in a series of conferences devoted to local government use of the Census Bureau's Geographic Base Files in computerized geographic coding. The conference was held in Arlington, Texas, on November 16 and 17, 1971, and the papers presented and the resulting discussions centered on the theme of the Census "Geographic Base File System - Uses, Maintenance, Problem Solving."

The major purpose of this series of conferences is to improve communication among users and to provide a vehicle for the mutual exchange of information between users, and between users and the Census Bureau and other Federal agencies. As expressed by the Geography Division, "One of the major purposes of today's meeting... is to 'spread the word:' to help those who have the need, to learn from those who have already taken the plunge into the techniques and the problems of DIME file management and use ..."

The proceedings of the conference are presented in the pages that follow. With the exception of the paper given by Mr. Richard M. Levy and the slide presentation by Dr. Robert T. Aangeenbrug, the formal papers are presented in toto. The question and answer sessions have been edited and only the more immediately pertinent questions and their answers are presented.

Copies of the first two conference proceedings:
1. U.S. Bureau of the Census, Use of Address Coding Guides in Geographic Coding--Case Studies, Report GE 60 No. 1, Washington, D.C. 1971.
2. U.S.
Bureau of the Census, Geographic Base Files - Plans, Progress, Prospects, Report GE 60 No. 2, Washington, D.C. 1971.
can be purchased for $.75 and $1.00, respectively, from the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402, or from any Department of Commerce field office.

LIST OF PARTICIPANTS

Dr. Robert T. Aangeenbrug, Director, Institute for Social and Environmental Studies, University of Kansas, Lawrence, Kansas
Robert Amsterdam, Director, Geographic Information System, Office of Administration, Office of the Mayor, New York City, New York
Walter Bieda, Program Analyst, Program Planning and Technology Staff, Department of Housing and Urban Development, Fort Worth, Texas
Jim Bohnsack, Director, Management Planning and Systems Department, City of Tulsa, Tulsa, Oklahoma
Larry Carbaugh, Chief, Technical Developments, Data Access and Use Laboratory, Data User Services Office, Bureau of the Census, Washington, D.C.
Jack Carter, Desk Officer, Office of Planning and Management Grants, Department of Housing and Urban Development, Washington, D.C.
Donald F. Cooke, Chairman, Special Interest Group on Geographic Base File Developments, Urban and Regional Information System Association (URISA), Urban Data Processing, Inc., Cambridge, Massachusetts
Ronald Crellin, Computer Systems Specialist, SCRIS Census Use Study, Data User Services, Bureau of the Census, Los Angeles, California
Harold Crutsinger, Systems Analyst, Office of Information Services, Office of the Governor, Austin, Texas
Kenneth W. Downey, Planning Engineer, Planning and Research Division, Federal Highway Administration, Fort Worth, Texas
Michael Duke, Senior Programmer, Office of Information Services, Office of the Governor, Austin, Texas
Albert E.
Dunagan, Health Information Specialist, Health Department, City of Dallas, Dallas, Texas
Gloria Eyres, Research Associate, Institute of Urban Studies, University of Texas at Arlington, Arlington, Texas
Robert Fairbanks, Senior Systems Analyst, Ohio-Kentucky-Indiana Regional Planning Authority, Cincinnati, Ohio
Glenn Flinchum, Statistician, Technical Assistance Branch, Office of State Services, National Center for Health Statistics, Research Triangle Park, North Carolina
Wade G. Fox, Assistant Director for Data Services, Southwestern Pennsylvania Regional Planning Commission, Pittsburgh, Pennsylvania
Larry Granberry, Systems Analyst, Texas State Department of Health, Austin, Texas
Henry Halff, Systems Analyst, Department of Planning and Urban Development, City of Dallas, Dallas, Texas
John Halterman, Programming Specialist, Geography Division, Bureau of the Census, Washington, D.C.
Eric S. Hanssen, Consultant (BASYS, Inc.), Integrated Municipal Information System, Wichita Falls, Texas
Robert Hill, Director of Planning, Texas Criminal Justice Council, Austin, Texas
Robert Hollis, Highway Engineer, Federal Highway Administration, Austin, Texas
Gary Johnson, Senior Systems Analyst, Denver Regional Council of Governments, Denver, Colorado
Lee P. Johnston, Consultant (University of Tennessee), Office of Civil Defense, Washington, D.C.
Barbara Kaplan, Coordinator of Housing Programs, North Central Texas Council of Governments, Arlington, Texas
Michael Kennedy, Statistician, Department of Planning and Urban Development, City of Dallas, Dallas, Texas
Richard M. Levy, Consultant (Information Systems Development, Inc.), Kansas City Health Department, Kansas City, Missouri
Peter A. Lombard, Urban Transportation Planning Engineer, Planning and Research Division, Federal Highway Administration, Fort Worth, Texas
Dr. Endfred J. Lundberg, Director, Institute for Urban Information Systems, University of Cincinnati, Cincinnati, Ohio
Robert W.
Marx, Geographic Planning Specialist, Geography Division, Bureau of the Census, Washington, D.C.
Gene Hixson, Traffic Survey Supervisor, Planning Survey Division, Texas State Highway Department, Austin, Texas
Stanley Matchett, Chief, Geography Branch, Data Preparation Division, Bureau of the Census, Jeffersonville, Indiana
Howard McCann, Planning Engineer, Federal Highway Administration, Austin, Texas
Morton A. Meyer, Chief, Geography Division, Bureau of the Census, Washington, D.C.
Percy R. Millard, Director, Data Collection Center, Bureau of the Census, Dallas, Texas
Dr. Billy Moore, Bio-Statistician, Health Department, City of Dallas, Dallas, Texas
William Parker, Transportation Systems Analyst, North Central Texas Council of Governments, Arlington, Texas
Albert I. Pierce, Chief, Data Management and Research, Middle Rio Grande Council of Governments, Albuquerque, New Mexico
James Prince, Urban Planner, Department of Planning and Urban Development, City of Dallas, Dallas, Texas
Jacob Silver, Chief, Program Analysis Branch, Geography Division, Bureau of the Census, Washington, D.C.
Wayne Snyder, Assistant Planning Director, City Planning Department, Fort Worth, Texas
Dr. James W. Stevens, Assistant Director, Institute of Urban Studies, University of Texas at Arlington, Arlington, Texas
G. Paul Sylvestre, Statistics Division, Law Enforcement Assistance Administration, Washington, D.C.
Ronald F. Treichel, Program Systems Analyst, Office of Civil Defense, Washington, D.C.
Gary L. Turnock, Study Coordinator, Dallas-Fort Worth Regional Transportation Study, Texas State Highway Department, Irving, Texas
Mark Wassenich, Urban Analyst, Office of the City Manager, Dallas, Texas
Richard W. Renshaw, Associate Planner, Santa Clara County Planning Department, San Jose, California
Pat Shannon, Systems Analyst, Houston-Galveston Area Council of Governments, Houston, Texas
Charles Schupp, Senior Programmer, Office of Information Services, Office of the Governor, Austin, Texas
Forrest Weddle, Consultant (Information Systems Development, Inc.),
Kansas City Health Department, Kansas City, Missouri
Robert L. Wegner, Director, Regional Planning, North Central Texas Council of Governments, Arlington, Texas
Joel Wooldridge, Regional Planning Associate, North Central Texas Council of Governments, Arlington, Texas

CONTENTS

November 16, 1971

MORNING SESSION

Chairman's Introduction, Dr. James W. Stevens
Opening Remarks, Robert Wegner
Update and Maintenance of Census Bureau Geographic Base (DIME) Files, Morton A. Meyer
    Question Period
Uses, Maintenance, Problem Solving--The Dallas/Fort Worth Area
    The Fort Worth-Tarrant County Geographic Base File, Wayne Snyder
    The Dallas Geographic Base File, James Prince
    Use of Dallas and Fort Worth Geographic Base Files in the Regional Transportation Study, William Parker
    Question Period
Specific Applications of Geographic Base Files:
    Use of a Geographic Base File in Relating Housing Information, Dr. Endfred J. Lundberg
    Question Period

AFTERNOON SESSION

    Use of a Geographic Base File in a Health Information System, Richard M. Levy
    Question Period
    Use of a Geographic Base File in the Civil Defense Community Shelter Program, Lee P. Johnston
    Question Period
Uses, Maintenance, Problem Solving--The Pittsburgh Area, Wade G. Fox
    Question Period
Uses, Maintenance, Problem Solving in an Integrated Municipal Information System--The Wichita Falls Area
    DIME File Developments at the Wichita Falls IMIS Project, Eric S. Hanssen
    Question Period
    Editing a Digitized DIME File: Wichita Falls, Texas, Dr. Robert T. Aangeenbrug
    Question Period

November 17, 1971

MORNING SESSION

Uses, Maintenance, Problem Solving--New York City, Robert Amsterdam
    Question Period
Uses, Maintenance, Problem Solving--Santa Clara County, Richard W. Renshaw
    Question Period
General Discussion
Summary of Proceedings, Donald F. Cooke

APPENDIX:
I. A General Summary of Errors in the DIME Files
II.
An Introduction to GIST--New York City's Geographic Information System

CHAIRMAN'S INTRODUCTION
Dr. James W. Stevens

Good morning and welcome to Arlington, Texas. This morning we are very privileged to have such a blue ribbon group assembled for this conference on Geographic Base Files. I have a few general remarks I would like to make before we get into the introduction to place this particular conference in perspective.

The subject of geographic base file development and maintenance has recently become a matter of national concern. To facilitate the exchange of information on uses, maintenance, and problem solving, the United States Bureau of the Census has sponsored several conferences on the subject for general discussion and evaluation of plans for establishing an updating and maintenance program. Previous conferences have been held in Wichita, Kansas (November 19-20, 1970) and Jacksonville, Florida (April 1-2, 1971). The conference proceedings in each case have been published by the Bureau and are available through the U.S. Government Printing Office.¹ This is the third conference in this series and is being co-sponsored by the North Central Texas Council of Governments.

The major purpose of this meeting, as in the previous two, is to improve communications among users of address coding guides (ACG's) and geographic base files (GBF's), and between these users and the Bureau. Further, we hope to gain a better understanding of the plans being made for the future use of geographic base files and the technical and administrative difficulties encountered locally in this phase.
In regard to future activities, we hope to use this meeting to secure comments concerning proposals for a Bureau-sponsored maintenance and updating program in the coming years.

¹ Bureau of the Census, Use of Address Coding Guides in Geographic Coding: Case Studies, Conference Proceedings, Wichita, Kansas, November 19-20, 1970; and Bureau of the Census, Geographic Base Files - Plans, Progress, and Prospects, Conference Proceedings, Jacksonville, Florida, April 1-2, 1971.

The conference program includes general presentations on geographic base file development and specific presentations on developments in Dallas, Fort Worth, and Wichita Falls, Texas; Pittsburgh, Pennsylvania; New York City, New York; and Santa Clara County, California. The program is designed to provide a hearing for a wide range of user experiences in geographic base file development and problems anticipated in future plans. The Census Bureau expects to promote interest in how problems are being overcome, the technical aspects of software development, and future plans for expansion and maintenance of files at the local level.

Geographic Base File Uses

In attempting to meet these conference objectives, it is important that participants encourage discussion of uses of local geographic base files in other areas and raise questions concerning current activities at both the local and Federal levels. Some of these uses have already been documented in prior publications. These include the following:

Atlanta--geocoding of residential construction and demolitions as part of the population estimate program.
Orange County, Wichita, and Atlanta--geocoding of addresses obtained from state employment security files to be used with other information in determining employment distribution within the area; geocoding of tax assessor's files.
Jacksonville--planning and manpower allocation in law enforcement.
Dallas Community Action Agency--locating survey respondents in poverty areas for correlation with census data.
Minneapolis--newspaper route location and structuring; survey respondent location; evaluation of newspaper circulation.
Dallas--building permit studies.
Fort Worth--land use surveys.
San Diego--cross references of geographic areas including indexes for census blocks, traffic zones, and census tracts.
Columbus--matching of depositors in financial institutions to determine location of sources of funds.

This list, of course, is very incomplete, and it is hoped that the third conference will bring to light many more uses, both actual and planned or potential.

Topics and Questions

When dealing with a complex subject of this nature, it is often as important to pose questions as to provide answers. For purposes of stimulating some thought concerning the major topics, the following list of general questions has been prepared for consideration. Other questions should arise during the conference.

Organization

What type of agency at the local level should be responsible for geographic base file development, maintenance, and use?
Should different organizations be designated for different tasks required in this, or should one organization be totally responsible?
Which agency (or agencies) should be designated?
Should the regional councils of governments and regional planning commissions have any role in the development and maintenance of GBF's?
Should the GBF be an integral part of the organizational structure for municipal, county, or regional planning and zoning?

User Agencies and Uses

Is the GBF essential for data analysis in the following areas?
    Planning and zoning
    Evaluation of programs
    Monitoring of programs
    Management of agency operations
Which agencies could make the most use of the GBF?
Are all potential user agencies--municipal, regional, county, State, and Federal--utilizing the GBF as extensively as possible?
Will user agencies provide continuing funds for update and use, or will they support it on a project-by-project basis?
What uses can be made of a GBF in different projects or on a continuing basis by different functional agencies (criminal justice agencies, health agencies, planning departments, water departments, etc.)?
Is there a need to match particular files against other files?
Is computer mapping essential for local use, or will statistical correlation and description suffice?
What problems have been encountered in the use of ADMATCH, and what problems in local files have resulted?
Have user agencies been educated to the potential of GBF use?

Mapping

How accurate are the Census Bureau metropolitan maps?
How accurate are local maps?
Whose maps should be used in constructing and updating GBF's?
How should the maps be updated, and what agencies should be responsible for this task?
What are the costs to local agencies of mapping and map updating?
Are building permits accurate, and can they be utilized in map updating programs?
Should "paper streets" be represented on updated maps? Should they be included in computerized GBF's?

GBF Coverage

What level of geographic unit should be used for geocoding--parcels, census blocks, census tracts, other local areas?
What can be done about including rural areas in the GBF?
Is a GBF essential for rural areas or for regional planning?
What local geographic codes should be included? Who should decide on these?

Address Assignment

How effective is the present addressing system used within the metropolitan area?
How can the rural areas be incorporated into a standardized, uniform addressing system to enable more effective GBF development?
What city-county and city-city problems exist in address assignment?
Can street names and address formats be standardized for use by all agencies?

Costs

What are the costs associated with GBF development? GBF update?
What are the costs associated with agency uses of the system?
How large must a data file be to utilize a GBF and related programs effectively and economically?

Some more specific questions, which concern future programs in North Central Texas, have been posed by Mrs. Barbara Kaplan of the North Central Texas Council of Governments (NCTCOG). These include:

Will the Dallas and Fort Worth City Planning Departments be able to assume the responsibility of GBF correction, maintenance, and update for their respective cities?
Which agency will undertake these same operations for those portions of Dallas and Tarrant Counties that are exclusive of the cities of Dallas and Fort Worth?
What role will NCTCOG assume in the GBF procedures in view of the fact that the GBF may be a necessary ingredient in the NCTCOG Regional Transportation Study already in progress?
How much staff time (both clerical and supervisory) will be required to correct existing errors and update the GBF initially, and how will this be financed?
What will be done in the Dallas-Fort Worth area in terms of GBF correction and update in this interim period before Census Bureau procedures have been formalized and distributed?

Some of these questions will be answered in part in the papers that follow. Other similar questions have been broached at other conferences, and answers to those should help guide our efforts.

Previous Conferences

Many problems and needs can be discerned from the Wichita and Jacksonville conferences. A summary of proceedings at the Wichita Conference was provided by Dr. Robert T. Aangeenbrug. He presented the following conclusions and suggestions:²

We ought to demand from our agencies and from ourselves a statement of goals and objectives and relevance, if you please, of the goals and objectives.
More information is needed about the Federal agency involvement and interest in GBF development, maintenance, and use.
USAC should be brought into the conferences and the planning process.
(USAC was represented at the Jacksonville Conference, and they are represented at this conference as well.)
We need more evaluation of ADMATCH.
We ought to have a clearer statement of what I refer to as the developing opportunity of geographies.
We ought to have some information about cost estimates.
For the Address Coding Guide itself, we need a quality control and reliability study.
We need discussion or research and developmental work in specific applications.
We need to know more about the Metropolitan Map Series.
An expanded, detailed, quality-controlled Metropolitan Map Series should be possible.
The extension of city-type addresses is obviously a topic that each of us will need to discuss.
We seriously need to come back to the local users. I would encourage local groups in using the one interface for local data files that I think we are all very much committed to using, and that is the Geographic Base File.

² These comments are presented in abridged and outline form to allow easy evaluation. Interested readers should refer to the full text presented in the conference proceedings.

At the Jacksonville Conference Mr. Edward F. R. Hearle provided some similar comments and suggestions.³ These included:

GBF's are becoming the principal framework for urban information systems, both public and private.
Information is being presented in a form for citizen use in public decision making.
Nationally compatible GBF systems are necessary to permit interregional comparisons and data exchange and to permit transfer of computer programs from one city to another.
The field of computer graphics is literally exploding, and this technology is enormously dependent on GBF's.
Estimating factors, or proxies, are being developed to allow local data to be organized and presented to illuminate on a continuing basis the phenomena that the Census Bureau gathers data about every 10 years.
Cost benefits and results must be pointed out to operational agencies more than they are now.
There has to be one principal agency in each metropolitan area that has responsibility for maintenance of the GBF. That agency has to be deeply involved with the users; it cannot simply be a technical custodian for the file.
State governments will likely become more involved with the program in the next few years.
It is extremely important to synchronize the updating of the maps and the files themselves.
Funding for GBF creation and maintenance must come from both local and Federal sources.
Use of other computer systems should be sought when local equipment will not handle Census Bureau or other available computer programs.
Address assignment in rural areas and standardization in urban areas should be encouraged to provide a basis for GBF expansion.
The Census Bureau's lead in establishing a GBF maintenance system is desirable to provide uniform procedures.
More thought should be given to who in each local area should maintain the GBF.
Procedures should be established to facilitate the exchange of experience among various cities in the use of the GBF, particularly in computer graphics and various software systems.

³ These comments are excerpts from the summary of the 1971 conference proceedings.

In conclusion, I think we can appreciate the fact that much work has been invested in the development of the GBF's and that much remains to be done. Our efforts up to this stage of development have been devoted to building and structuring. Much of the technology essential for accomplishing our objectives has been available and can now be utilized, and the local systems for initially putting together the basic materials have sometimes been well organized and supported. However, much remains to be done to provide a system for updating, to develop applications and to provide a local organizational basis for continuing utilization.
All of these subjects, while of great importance to Federal and State officials, are of extreme interest to local and regional organizations, both public and private. Our work in the development of the organizational base for GBF's and our efforts to make maximum use in servicing constituent communities should pave the way for many years to come.

OPENING REMARKS
Robert Wegner

Ladies and gentlemen, we are very pleased indeed to have you with us this morning. It is my pleasure to offer you a formal welcome on behalf of the Chairman of our Executive Board, Mayor Raymond Noah of Richardson, Texas, who sends his regrets that he is unable to be here to welcome you himself. Among the 150 to 200 councils of governments in this country today, not many have a president experienced in information systems. Mayor Noah has this distinction, among others, so his not being here is all the more disappointing to us. Let me give you just a few words in his stead.

This is one of the problems we face, as you do in your work. We describe it in a way that may be different from the way you do; we say that our regional leaders customarily wear four or five hats. It is the putting on and taking off of these hats that poses one of the key problems we face in regional affairs. The first hat the regional leader wears is probably that of a family man, father, husband and so forth. The second, he wears as a professional or employee somewhere, the hat that he wears on his job. The next hat that he wears may be that of the locally elected official, or it may be that of a civic leader of some kind, or it may be in an avocational activity. In any event, his "regional hat" normally has about a fourth or fifth priority. Thus when we look at some of the achievements that we have had in the North Central Texas area over the past five years or so, we must take our hats off to the regional leadership for the support that they have given to regional activities.
Regional leadership is the key, not only to the success of regional councils, but I think that it is a key to the effective operation of information systems; perhaps not as to GBF, as such, but certainly in the overall concept of information systems.

Two other challenges facing regional councils today also relate to this conference. First of all, are we going to be able to develop compatible systems within a region of several hundred governmental entities of which only about 130 are regional council members in our 11 counties? Second is the challenge of representation, and how we allocate resources and give out the rights and privileges to make decisions on these matters.

This matter of representation has not confronted us in this region as much as it has in some others, such as Northeastern Ohio, where one regional council was denied recognition by some of the Federal agencies because its major city withdrew its membership. This is the voluntary nature of regional councils. It may be both a strength and a weakness, which also characterizes cooperative efforts with regard to information systems. Fortunately, the activities regarding geographic base files normally have not involved major policy decisions by elected officials, but have operated at the staff level. So if the staff people can find means to cooperate and coordinate their activities, it should be possible to establish some kind of compatible system on an areawide basis.

Now the final issue facing regional councils, which I think also faces those of us interested in GBF and information systems in general, is the matter of adequate funding. We say here in North Central Texas that "our priorities are where the money is." That means, very simply, that we will reflect what those organizations who give us the money say we should be reflecting.
Since 70 percent of our money comes from the Federal government, we, in effect, reflect national priority or priorities set by Congress as interpreted by the Federal agencies. About 10 percent of our budget comes from the State of Texas, so about 10 percent of priorities, roughly, are set by the State. The balance, around 20 percent, is set by our local member governments who provide the dues and support over and above dues. Because so many of our priorities are really tailored to Federal requirements, we find that we are responding more to the functional special interests in the Federal government than to our local major priorities. This means we have to devise clever means to tap resources from several functional programs in order to establish any kind of a workable continuing information system including GBF. This is about the only way we know to proceed because information systems as such have no direct source of funds; they are adjuncts to functional programs. Our major challenge is going to be to establish on a continuing basis--as a part of normal operating systems and procedures--the sort of material that you will be talking about here today and tomorrow.

Your interest in looking ahead at how we maintain geographic base files is coming at a very interesting time. This week in Pennsylvania the Highway Research Board is holding its fourth national invitational conference about the transportation planning process. In the development of information systems in this country, the transportation planning process has been the key source of impetus to, and support for, information systems.
Experts in the country are gathering in Pennsylvania to explore, "How to Organize for the Continuing Transportation Planning Process," hoping, thereby, to make the transportation planning process more effective, to link it with other functions so it can be a part of a comprehensive planning process, to assure that it is indeed a continuing process, and also, to link it into the concerns we have now for environmental and social impact.

This conference on geographic base files comes at the same time that the Department of Transportation is looking at its role in multi-modal planning; that the Department of Housing and Urban Development is looking at ways to make planning more properly, more actively and more effectively a part of the management process; and at a time of heightened criticism and re-evaluation of traditional concepts as to what planning is all about and more particularly, what comprehensive planning is all about.

Within the context of this kind of re-examination of what planning is, what planning should be and should do, it is most fitting to confer about the "Uses and Maintenance of Geographic Base Files." Let me wish you the best of luck. I hope that you will find this to be a worthwhile experience, and we hope that your stay here in Arlington, Texas, will be a pleasant one.

UPDATE AND MAINTENANCE OF CENSUS BUREAU GEOGRAPHIC BASE (DIME) FILES
Morton A. Meyer

Perhaps the best way for me to begin this brief talk is by announcing that the Census Bureau has now finalized the procedures which it will follow in the update and maintenance of its Geographic Base File System. Just so there is no misunderstanding of terminology, the phrase "Geographic Base File System" is newly arrived at, and is intended by the Bureau to be used as the generic description covering all of the Bureau's computerized geographic files.
As you know, the most important of these files, as well as those with the widest current usage, are the DIME or Dual Independent Map Encoding files.

The DIME files are now in the process of being digitized, and all 195 SMSA areas in the DIME program are expected to be completed, including all necessary computer processing, by March 1972. Again, so there is no confusion in terminology, digitizing means simply that the map node points are being assigned the following earth surface coordinates: Latitude and Longitude, State Plane Coordinates, and Map Set Miles. (Map Set Miles are x-y coordinates measured from an arbitrary point at the southwest corner of the Metropolitan Map Series coverage for each SMSA.) The first of the digitized DIME files are now becoming available, with some 47 areas having been completed to date.

The Geographic Base File System (which is the fruit of the Address Coding Guide concept developed for the 1970 Censuses of Population and Housing) is, essentially, a management tool. It provides the "framework" upon which a comprehensive information system can be built, and it has its greatest potential at the local governmental level. Although much has been written on the potentials of the DIME file as a management tool, it is just now beginning to be used for this purpose--mostly for summarizing and analyzing the large volume of local, public and private data which contain a street address as part of the record. Stated simply, data which in the past have been too voluminous or geographically complex to work with can now be "geographically" organized; and by so doing, a major step can be taken to make local data more usable and more understandable to those in decision-making positions--whether these decision-makers are the mayor, the councilman, the director of the department of health, the president of the bank or the head of the local department store.
However, the potential of the Geographic Base File System (that is, the DIME file) as a tool for organizing data can only be realized if the file is used. Unlike fine wine, the file does not improve with age. If left on the shelf to gather dust, the DIME files and the associated map series which delineate the area soon become out-of-date, and perhaps unusable. Urban areas are dynamic, not static, in character.

One of the major purposes of today's meeting, therefore, is to "spread the word": to help those who have the need to learn from those who have already taken the plunge into the techniques and the problems of DIME file management and use. To aid in this endeavor, as with past conferences, the proceedings of this conference will be published so that the information gained at this meeting can be transferred to other communities and organizations.

The organization of address-coded information into meaningful, usable, geographic units is required by Federal, State, and local programs. The home-interview surveys of urban transportation planning programs, housing condition surveys, the location of school children by school district, and the allocation of police personnel for maximum patrol effectiveness are examples of some of them. Unfortunately (and the Census Bureau's delayed recognition of the need may be one of the precipitating factors), we find that even though many agencies and organizations are using the DIME files (and will be tying their local programs to the Census Bureau's program), no standardized approach to establishing geographic files has, as yet, been available.

In the coming year, therefore, we at the Census Bureau will expend a great deal of effort in promulgating a standardized approach to geographic coding, including standardized techniques for the update and maintenance of the DIME files. The major question, of course, is "How will we do this?" Last spring the Bureau thought it knew some of the answers to this question.
Today we, perhaps, know a little more, and we have some proposals to offer based on the comments made to us at the URISA and AIP conferences, and the continuing discussions and correspondence with various local agencies.

Let me list the general framework of our proposal, after which I will discuss each point in some further detail. It includes, as a first step, the establishment of technical and data standards that will result in a reliable and usable product for both the Census Bureau and the local community. Equally, if not more important, it makes provision for furnishing to the local agency the computer edit, update and maintenance programs which they will need to correct and maintain their own files. Regretfully, we cannot implement this offer immediately, but we expect to be able to offer these programs sometime in April of 1972.

The Bureau also has available the edit packages it uses to detect and correct errors in the files. Unfortunately, these programs can only function on the Census Bureau's UNIVAC equipment, and the Bureau does not have available at the present time the resources to reprogram in other languages. However, the documentation and logic for some of these programs are immediately available should some local agency having access to large-scale computer facilities wish to undertake the major programming job necessary in order to have these Bureau edits available for their own use.

It is, perhaps, important to note at this time that the question has been raised as to whether or not the Bureau is working with the appropriate agency in any particular area, and if not, can the Bureau locate the local agencies which can make the most "effective use" of the file. In the past, while we have generally worked with metropolitan or county planning agencies, we have also worked with county engineers, or, as we did here in Texas for 15 of the 22 urbanized areas, the State highway department.
At that time, our major concern was to establish the Address Coding Guides that were needed for the 1970 census. But we are now in a recognizably different situation. Speaking frankly, the process of establishing a continuing update program demands the participation of a local agency which not only has the capability of carrying out the technical requirements of the program, but which also has the resources, financial as well as informational, for updating and maintaining the files for the local area.

Our current experience indicates that the agencies most actively interested in utilizing the potential of the DIME files are, in most cases, the metropolitan or county planning commission, or the local council of governments. You can expect, therefore, that we will again contact the local planning agencies, metropolitan planning commissions and councils of governments that cooperated with us in our previous programs and ask them to reaffirm their willingness to participate in a continuing update program. In those urbanized areas where the original Address Coding Guide preparation was handled by a State agency, we intend now to contact the local metropolitan or county planning agency, or council of governments, to request participation in the program. We may also refer these "new" agencies to some of you in attendance here today for further explanations of both the potential and results of working with the Geographic Base File System.

In this regard, it should be pointed out that while we have a strong interest in the maintenance of local Geographic Base File Systems (so that we may use the address coding portion of these records in the continuing activities of the Bureau), the proposed update procedures do not envision a rigid, inflexible system, identical in format, and in use, in every area throughout the United States. Rather, we think of a local DIME file as being constructed in two parts.
One part will contain certain standard elements that will apply to all urbanized areas. The second part will contain local information and geographic elements which will vary from area to area, depending upon the local use of the file and local requirements. Standardization will be required only for those fields whose content must have a fixed definition (such as street name, direction and type; potential address range; block and tract boundaries and numbers; minor civil division and place codes; and ZIP codes); otherwise the Bureau would be unable to utilize the local files in its maintenance program. Also, there are various programs which the Data User Services Office of the Census Bureau is developing for which standardization of selected geographic items in the record is a requirement. Some of these are:

UNIMATCH (Universal Matcher Program), a generalized record linkage system which compiles, assembles and executes a file matching system tailored to the user's specific tasks.

CRAM (Computer Resource Allocation Model), which is based on the DIME file.

DACS (DIME Areas-Centroid System), which calculates areas and locates centroids of blocks, census tracts, or other areas defined in a DIME file.

The initial phase of the Bureau program is based on a multi-level operation geared to the specific situation existing at the local agency. Factors that will determine the maintenance system an agency can initially undertake include:

1. The technical capabilities available to the agency,
2. The extent to which local use is made of the file, particularly in regard to the frequency of the update operations, and
3. The geographic coding program or programs in which the area originally participated.
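The two-part construction described above (a standardized part plus a locally defined part) can be illustrated with a minimal sketch. This is not the Bureau's record layout: the class names, field names, and the `export_for_bureau` helper are all hypothetical, chosen only to mirror the standardized items listed in the text.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class StandardPart:
    """Hypothetical standardized fields; names are illustrative only."""
    street_name: str
    street_direction: str
    street_type: str
    left_address_range: Tuple[int, int]   # (low, high) potential range
    right_address_range: Tuple[int, int]
    left_block: str
    right_block: str
    left_tract: str
    right_tract: str
    mcd_code: str      # minor civil division code
    place_code: str
    zip_code: str


@dataclass
class LocalSegmentRecord:
    """One segment record: fixed standard part plus free-form local part."""
    standard: StandardPart
    local: Dict[str, Any] = field(default_factory=dict)


def export_for_bureau(record: LocalSegmentRecord) -> Dict[str, Any]:
    """Return only the standardized fields; local items are left behind."""
    return asdict(record.standard)


example = LocalSegmentRecord(
    standard=StandardPart(
        street_name="MAIN", street_direction="N", street_type="ST",
        left_address_range=(101, 199), right_address_range=(100, 198),
        left_block="101", right_block="102",
        left_tract="0016.00", right_tract="0016.00",
        mcd_code="010", place_code="035", zip_code="76010",
    ),
    # Local items vary by area; these two keys are invented examples.
    local={"police_beat": "B12", "school_district": "AISD"},
)
exported = export_for_bureau(example)
```

Keeping the local items in a separate mapping means an agency can add or drop its own fields without disturbing the part that must stay identical from area to area.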
To those local agencies who are able to participate with the Census Bureau in the update effort, we plan to provide, along with a copy of the digitized DIME file, the following two listings, both of which are compiled from the DIME file: the "Segment Name Consistency Listing" and the "Coding Limit Line/Unmatched Segment Listing," together with necessary instructions for their use. The purpose of these listings is to identify certain types of residual errors in the file.

The Segment Name Consistency Listing is a complete list of all unique names, for both street and non-street features, which appear in the file. The names are listed in alphabetic sequence with numbered streets appearing first. This listing can be used to identify and correct inconsistencies in feature names; for example, a GREEN ST which should be GREENE ST, etc. For each name listed, the corresponding computer record identification numbers are provided, plus a list of the census tracts in which the named feature is located. These identifications permit localizing the error to a specific portion of the map and determining, thereby, what the correct name should be.

The Coding Limit Line/Unmatched Segment Listing is (a) a list of the one-sided segments which define the outer limit of the DIME file coded area and (b) for those areas that participated in the ACG Improvement Program, the nonboundary segments for which matching of the right and left block sides could not be accomplished and which, therefore, need correcting.

For your further information, the Bureau is currently developing a COBOL program to enable local agencies to correct the errors uncovered by these listings. This program, which will be called FIXDIME, is planned to be available early next year. Agencies making corrections to their DIME files locally will have the option of providing their correction inputs to the Bureau on computer tape if it is more convenient for them to do so.
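As a rough illustration of how a listing like the Segment Name Consistency Listing could be compiled, the sketch below groups segment records by feature name, orders numbered streets ahead of the alphabetic names, and attaches the record identification numbers and census tracts for each name. The input layout (record id, name, tract), the function name, and the upper-casing of names are assumptions made for the example, not the Bureau's program.

```python
from collections import defaultdict


def name_consistency_listing(segments):
    """Build a name listing from (record_id, feature_name, tract) tuples.

    Returns a list of (name, record_ids, tracts) with numbered streets
    first (in numeric order), followed by other names alphabetically.
    """
    by_name = defaultdict(lambda: {"records": [], "tracts": set()})
    for record_id, name, tract in segments:
        entry = by_name[name.strip().upper()]
        entry["records"].append(record_id)
        entry["tracts"].add(tract)

    def sort_key(name):
        words = name.split()
        first = words[0] if words else ""
        if first[:1].isdigit():
            # Numbered street such as "3RD ST": sort on its numeric part.
            number = int("".join(ch for ch in first if ch.isdigit()))
            return (0, number, name)
        return (1, 0, name)

    return [
        (name, sorted(by_name[name]["records"]), sorted(by_name[name]["tracts"]))
        for name in sorted(by_name, key=sort_key)
    ]


sample = [
    (101, "Greene St", "0016.00"),
    (102, "Green St", "0016.00"),
    (103, "10th St", "0021.01"),
    (104, "3rd St", "0021.01"),
    (105, "Greene St", "0021.01"),
]
listing = name_consistency_listing(sample)
```

In this sample, GREEN ST and GREENE ST appear on adjacent lines with overlapping tracts, which is the kind of pairing a reviewer would check against the map.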
After review and correction of the listings by the local agency, we are asking that copies of the corrections be returned to the Census Bureau for incorporation into the Bureau's files. Corrected DIME file tapes will then be returned to the local agency for use in the first update cycle.

The Bureau's intention, limited at first to areas actually planning to update the files, is to then provide the local area with an Address Range Edit listing. The Address Range Edit listing is based on the address and ZIP code relationships of all the segments appearing in the file for each unique street name. To permit checking these geographic relationships for correctness, all segments for each street name are arranged in sequence by "node chain"; that is, each street is constructed (by the computer) following the same order in which the nodes for that street are delineated on the map sheet. Streets which contain segments with one or more discrepancies are flagged for review. The types of discrepancies which will be flagged include gaps in the address number sequence between adjoining street segments, overlapping address ranges, odd-even address mixtures in one segment side, inconsistencies in ZIP codes, etc. It is anticipated that these address and ZIP code errors (residual, actually) will be corrected as part of the update operation.

The correction of the files and the updating operations can only be carried out by the local agency, as only the local agency has the necessary sources of information to determine both the correction required and the file modifications needed to incorporate changes in the physical features of the area (e.g., new streets, etc.). To assist the local areas in this effort, the Bureau is preparing a FORTRAN IV program called UPDIME (which we anticipate will be available in a production test version sometime next April) specifically designed to enable the local area to carry out these activities expeditiously.
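The discrepancy types named above for the Address Range Edit listing can be sketched in a few lines. The rules below are assumptions adopted for illustration, not the Bureau's edit logic: one side of a street is checked at a time, segments arrive already in node-chain order, address numbers increase along the chain, adjoining ranges on one side are expected to differ by exactly 2, and any ZIP code change between adjoining segments is simply flagged for review. The `SegmentSide` class and `address_range_edit` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SegmentSide:
    """One side of one street segment (hypothetical layout)."""
    record_id: int
    low: int
    high: int
    zip_code: str


def address_range_edit(chain: List[SegmentSide]) -> List[Tuple[int, str]]:
    """Flag discrepancies along one street side, in node-chain order."""
    flags = []
    for i, seg in enumerate(chain):
        # A single side should be all odd or all even.
        if seg.low % 2 != seg.high % 2:
            flags.append((seg.record_id, "odd-even mixture"))
        if i == 0:
            continue
        prev = chain[i - 1]
        if seg.low <= prev.high:
            flags.append((seg.record_id, "overlaps previous range"))
        elif seg.low > prev.high + 2:
            flags.append((seg.record_id, "gap after previous range"))
        if seg.zip_code != prev.zip_code:
            flags.append((seg.record_id, "ZIP differs from previous segment"))
    return flags


chain = [
    SegmentSide(1, 100, 198, "76010"),
    SegmentSide(2, 200, 298, "76010"),
    SegmentSide(3, 290, 398, "76010"),   # overlaps segment 2
    SegmentSide(4, 500, 598, "76011"),   # gap after segment 3, ZIP change
    SegmentSide(5, 600, 699, "76011"),   # mixes even and odd numbers
]
flags = address_range_edit(chain)
```

A flag here means only "review this street": a ZIP code change or a gap in numbering may be legitimate, which is why the text leaves the decision to the local agency.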
A description of this program is beyond the scope of this paper. In general, however, the program will make possible corrections and additions to the file, including the performance of all necessary topologic edits. I have included a brief description of the types and extent of the errors in the file and their causes as an appendix to this paper (Appendix 1). You will note that, by and large, the files are in pretty good shape.

Concomitant with correcting and updating the computer tape file is the correction and updating of the map sheets. Each cooperating agency will receive a reproducible copy of the node dotted and numbered map sheets which they will be asked to review and update; that is, add new street development, delete paper streets still appearing in the file, correct street names, add or delete node dots and numbers, etc.

When the file corrections or updated information are returned to the Census Bureau, the reproducibles containing the map changes will also be returned. The Bureau will in turn make a copy of the reproducibles for its own reference purposes and then return the original to the agency for use in the next update cycle. These maps will shuttle back and forth between the cooperating agency and the Census Bureau after each cycle.

To test the update and maintenance system described above, the Bureau has selected some nine areas classified into two groups as follows:

System "A"--System "A" areas will be provided with update and maintenance procedures designed for agencies which will regularly utilize computerized techniques for their geographic base file operations. Such agencies will be supplied with all available Bureau edit and correction programs and outputs and will, in turn, supply the Census Bureau with tape copies, in standardized format, of all corrections and additions they have made to their Geographic Base (DIME) File. The System "A" areas are Albuquerque, Dallas, Fort Worth, Tulsa, and Columbus, Ohio.
System "B"--System "B" areas will utilize update and maintenance procedures which have been designed to accommodate agencies whose currently planned use of the file does not envisage an immediate computerized update operation. For these agencies, clerical correction and addition techniques are being developed which will permit the local agency to record the changes taking place for subsequent submission to the Census Bureau as the need arises for the computer files to be updated. A part of the System "B" test includes the updating of the Bureau files as rapidly as possible after receipt of the corrections and the return to the local agency of a copy of this updated file. The System "B" areas are Evansville, Memphis, Cincinnati, and Hamilton-Middletown.

At least one important question remains to be asked and answered: "Where is the financing which will enable local areas to carry out the DIME file update and maintenance programs?" I am sure that each of you has asked yourself the same question. Whether or not desired Federal funding for local agency update and maintenance operations will ever be realized is probably a moot question. But if direct Federal support is ever to be achieved, it will probably be tied to the establishment of a standardized nationwide system of geographic base files.

The Census Bureau does not have funds available to defray the cost of local agency update and maintenance operations. That it would be desirable if the Census Bureau were able to supply some seed money for this purpose appears obvious, if only for the purely pragmatic reasons of self-interest. The Census Bureau has a large investment in the Geographic Base File System and one of its objectives is the utilization of these files as input to future census operations.
If, on the other hand, this were to result in a file maintained solely for the use of the Census Bureau, in a large sense one of the major purposes of the Geographic Base File concept would have been defeated; and that is its use at the local area level as, to quote Mr. Edward F. R. Hearle, the "principal framework for urban information systems, both public systems and private systems."

This implies, of course, that the geographic base file concept can and must be considered in terms of the cost benefits which can accrue to municipal agencies utilizing the files. Simply stated, where the files are being used and found to be useful, the funds required for update and maintenance will be forthcoming. I realize that this may sound somewhat naive, and perhaps overly optimistic as well, but I would point out in justification that the local agencies which are worrying most about funding for the update program are the same agencies which are pressuring the Census Bureau to provide them with the DIME files at the earliest possible date.

In conclusion, I would like to read an excerpt from a letter recently received by the Geography Division which I believe provides evidence of the soundness of the philosophy that I have just expressed. It reads as follows: "The (State) Department of Transportation has taken the position that the maintenance of the geographic base file is a direct function of the local planning agency. Where the local planning agency is presently using, or anticipates using, the GBF as a base for a regional information system, we may participate financially in the maintenance of the geographic base file through our Operations Plan for the continuing phase of the transportation study. In those local planning areas, however, where there has been little or no desire to use the geographic base file as a planning tool, we will not urge the planning officials, nor will we offer financial support to maintain the geographic base file up-to-date."
Question Period

Mr. Amsterdam--Are there specifications available on the various programs that you mentioned? Are there any reports on what they can do or what they will do?

Mr. Meyer--We brought with us specifications, including logic flow charts, for some of the programs for distribution to the conference participants.

Dr. Aangeenbrug--I would like to pursue the question that Mr. Amsterdam raised on software development. As far as I know, several of the USAC cities and at least several of the metropolitan areas in the Midwest have developed some software, including point-in-polygon retrieval of records to test out their location within specific areas. I would like to urge us to address this question tomorrow during the open discussion session because there is a lot of duplication in this development. Some of the software is written in special purpose language, some is written in high level FORTRAN. One of the decisions, in part foisted on some cities, is to stick with COBOL, some to go with FORTRAN. I would hope that Mr. Crellin and others who are tied to software development would also address themselves to how we might get together fairly soon before we all reinvent various size wheels. We are willing to share our knowledge, and I know others have shared theirs, or we would never be where we are. But a sequel of this kind of conference might be some software exchange of programs already developed.

Mr. Meyer--I would like to reiterate that the Bureau will do what it can to assist in the further development of the Geographic Base File System. Unfortunately, there is no one perfect programming system. However, programs are required as soon as possible, both to correct the errors that exist in the files and to update their geographic and topologic content. The programs I spoke about, FIXDIME and UPDIME, will be written in COBOL and FORTRAN IV, respectively.
The important consideration is to get working programs into the hands of users as quickly as possible. Hopefully, we will have a complete package available to local agencies for production test sometime next April.

Mr. Crellin--The update and maintenance program, UPDIME, is written in FORTRAN IV. As for the status of the programs mentioned, such as DACS, the documentation and listings will be distributed within the next few weeks. These will be included in a publication entitled SCRIS RESEARCH NOTES, which will include documentation on approximately 10 programs. A description of UPDIME, which is being written by the Census Use Study, is included in SCRIS Report No. 8. This publication should be received very shortly by many of the people present.

UNIMATCH is also being written by the Census Use Study. Unfortunately, at this time no adequate write-up of this system is available. However, we expect to have a description by early spring since the programmer is presenting a paper on this subject at the Joint Sperry Computer Conference.

Mr. Bieda--To follow up on what Dr. Aangeenbrug was stating, to what extent does Census, or any other agency or level, have a compendium for sharing geolocation information? In other words, if I might ask the question of the participants at this meeting: to what extent have you gathered a compendium of what is being done out there on geolocation studies?

Mr. Meyer--I do not know the answer to that question. Mr. Silver, do you know? Does such an information exchange exist?

Mr. Silver--The URISA Special Interest Group on Geographic Base File Development was planning to put together this type of information and provide it to all of its members. However, at the moment, I do not know the status of this proposal.
USES, MAINTENANCE, PROBLEM SOLVING--THE DALLAS/FORT WORTH AREA

Wayne Snyder
James Prince
William Parker

Wayne Snyder (1)--THE FORT WORTH-TARRANT COUNTY GEOGRAPHIC BASE FILE

(1) In addition to Mr. Wayne Snyder, Messrs. William B. Schlansker, Kahn M. Husain, and Glenn Stewart assisted in preparing the report.

Like other cities undertaking the DIME project, Fort Worth was faced with the initial problems of recruiting and training a staff, obtaining office space and equipment, and supervising construction of the file. We were greatly aided in these tasks by the excellent support of the Census Bureau.

Using the suggested test for screening of applicants, over 100 candidates were examined and from this group four very highly qualified coders were hired. We were very fortunate in that these four coders stayed with the project through its completion so that retraining and other disruptions were avoided.

We found, generally, that Census Bureau training methods, operational procedures and written manuals were well thought-out and easy to follow. As specific problems arose during the project, Census Bureau support was timely and helpful. In summation, there were very few mechanical problems in preparation of the DIME system.

The most serious problems which affected our program came under the general headings of maps, research materials and intergovernmental cooperation.

Obviously, the key to successful coding of the DIME system lies in accurate maps. Our biggest problem was maps that were either inaccurate or not updated to reflect current street and non-street features. It was discovered, for example, that one large section of town had been mapped so that streets ran about 30 degrees off the actual physical layout. About one week of coding was lost while these maps were corrected, and subsequent coding of the area was very difficult.
There were, in addition, other instances of incorrectly drawn streets or streets appearing which had been abandoned or which never existed.

The other major problem with the maps was that they were out of date. In a rapidly growing area like Tarrant County, it is impossible for the Census Bureau to keep up with the hundreds of block faces which are added every month. It was equally impossible for the coders, except from their own personal experience, to keep track of these new streets. Consequently, many new streets were simply not coded.

From time to time, the coders were required to look up street information, such as address ranges or street names, for questionable ACG records or for streets being added to the file. We discovered that in most cases there was no reliable and quick method of obtaining this information. Sources used included street guides, city tax and utility files, field checks and electric utility maps.

Each of these sources had its drawbacks. Street guides proved to be hardly more accurate than the census maps themselves. City files usually contained the mailing address of the property owner (which is often different from the physical address); electric utility maps showed the physical address but were bulky (about 1000 sheets) and difficult to index; in addition, field trips were time consuming. Often a coder reached the point of diminishing utility in research and had to guess at street information.

A further problem (which may or may not be common to other parts of the country) became apparent in determining street names. It is quite common in this area for major thoroughfares to have several "official" names (often found on maps) and several "unofficial" names (sometimes found on maps) even within one city's official limits. The problem is especially common as a street passes from one jurisdiction to another. It is difficult for coders working in different parts of the county to standardize these names.
Even standardized names can be incorrect since, in many cases, neighboring parcels of land can take their choice of approved street names to use as mailing addresses.

The Tarrant County DIME project involved coding streets for over thirty municipalities and large tracts of unincorporated land. In general, only the cities of Fort Worth and Arlington (covering about 70 percent of the county's population) had enough interest in the project as well as the necessary resources (manpower and information) to make a significant contribution to data collection.

The seemingly simple matter of obtaining accurate corporate limit boundaries for the county required two man-months of work by city attorneys and the planning staff. Some cities were suspicious of the project and others were simply not able to help. While every effort was made to code these cities correctly (using available reference material), time and circumstances simply did not permit active involvement for all municipalities, and the final DIME file suffers to some extent from this lack of cooperation.

In summary, it should be pointed out that problems confronted in the construction of the DIME file must also be faced by those charged with its maintenance. At the present time we have no accurate estimate of our success in solving these problems or of how accurate and inclusive our file is. Should our file prove accurate within acceptable standards, then our methods of problem solving can be used for file maintenance. If not, new methods must be developed.

Present Status and Use

The present status of the Fort Worth file is that the address coding guide tapes were created, the DIME coding was completed and the city is awaiting DIME tapes with digitized coordinates.

The anticipated uses of the GBF have never materialized. Under current conditions GBF-related projects are competing for funds and priority (and losing) and enjoying widespread disinterest at most levels of city government.
Enthusiasm for the ACG-DIME projects was high in the summer of 1968 and again in the summer of 1970. During construction and expansion of the GBF the anticipated benefits created interest in the GBF concept--interest to the tune of $30,000 budgeted in the 70-71 budget year. These funds were appropriated to conduct a comprehensive, computerized land use study which was designed to utilize the GBF. Additional studies were begun in the Public Works Department to design a street inventory system using GBF and in the Public Safety Department to relate all police and fire statistics to geographic areas.

Local utility companies and the Chamber of Commerce were eager to participate in building a comprehensive GBF. The utility companies provided, free of charge, listings of all accounts in Tarrant County, with selected account data which could be useful in community facilities planning. The Chamber of Commerce took steps to coordinate access to the county-wide GBF between the city and the various expected user groups (public and private) who would find the GBF framework useful.

The city's Data Processing Department experimented with the ADMATCH programs and wrote new software packages to better fit the matching systems to our own computer configuration.

All this work was accomplished in anticipation that the digitized DIME file would be available around January 1971; that city editing and updating of the DIME file could be accomplished; and that land use, public works and public safety computer systems would be operational for testing by the summer of 1971. Cooperation was elicited from the utility companies and other potential users on this basis. However, to date, a final digitized file still has not been delivered by the Census Bureau, and consequently, even the most preliminary editing and updating has not taken place, let alone the installation of related systems.
Because of the time lapse between the original plans for the use of the GBF and the receipt of a digitized DIME file, potential user groups have lost interest in the project. Uncertainty as to the delivery date of the digitized DIME file, and as to the usefulness of the DIME file, has led to de-emphasis of the GBF application in funding in the current year's budget. Many important programs were cut back or deleted because of the economic slowdown, and this unfortunately included funds for the GBF. Given the budgetary practices of the city of Fort Worth, the earliest any substantial effort related to the GBF could begin would be October 1972. Any GBF work done this year will have to be accomplished on the coattails of other related projects, if any.

Possible Future Expansion and Directions

The following prerequisites must be met before any meaningful steps can be undertaken toward expanding the current Fort Worth-Tarrant County GBF in order for it to become a truly centralized data base system for the future.

1. Identifying, editing, and correcting the errors and mistakes in the existing undigitized DIME file must be completed. The current file is almost two years old and obviously needs updating.

2. The digitized file, yet to be received from the Census Bureau, must be operational with some of the corrected planning data analyzed so that we can test the utility of the DIME system as a viable tool for storing and standardizing data for municipal planning and administrative purposes.

3. Appropriate measures must be undertaken by the City, the Council of Governments, and the Census Bureau for keeping the digitized file up-to-date by mapping and coding the areas which have been urbanized in the past eighteen months, and a permanent framework must be established for keeping track of the future expansion of the urban area on a continuing basis.

4.
All municipal governments in Tarrant County and other public and semipublic agencies must agree to contribute data and funds for continuing expansion of the system. Should the above requirements be met, the City of Fort Worth can proceed with plans to (a) store existing data files based on the GBF; and (b) develop additional desired files for use with the GBF. Table 1 is a compendium of data elements already collected that are easily trans- ferable to the GBF system. The table attempts to relate available data to possible geographic reporting units in a manner which shows whether the data reporting is currently available (Avail.), possible if additional coding of data isdone(Poss.), or not possible given present data availability (--)• The data elements listed come primarily from five data systems now operational in this city; the land use-zoning system, the population pro- jection system, the vehicle registration system, First Count Summary Tapes from the Bureau of the Census and local utility files. The list of data items furnished in the table is by no means exhaustive. Other agencies or organizations could make use of the GBF system and contribute data and funds for continued expansion of the program. Other files of high priority which could be developed once the GBF had proven successful would include a street inventory, fire and crime statistics, utility distribution data, traffic counts and other data which should be analyzed by geographic areas. Maintenance and Updating At this time the city of Fort Worth is not prepared to assume the responsibility for main- taining and updating a countywide GBF. This does not, however, preclude Fort Worth partici- pating in an updating and maintenance program which may be developed on a regional basis. 
Should the Council of Governments establish a cooperative maintenance program in North Central Texas, the city of Fort Worth may well be able to participate in the provision of update information, manpower, and even limited funding in association with related city projects and activities.

An update program sponsored or coordinated by the Council of Governments could provide a regionwide data base to meet the needs of all area agencies. A regional system would allow the much needed uniformity of data classification for comparison purposes. Also, system development and maintenance could be accomplished at relatively less cost to the individual agencies. Finally, a greater degree of coordination among all these agencies is the key to complete expansion of the current GBF for the benefit of all area development agencies in Dallas-Fort Worth.

[Table 1, the compendium of data elements by geographic reporting unit (entries Avail., Poss., or --), is illegible in this copy.]

James Prince 1 - THE DALLAS GEOGRAPHIC BASE FILE: Problems and Uses

Perhaps this paper can begin with a clarification of terms: What exactly is a geographic base file (GBF)? In the sense we are using it here, a GBF is a computer-readable file that contains an inventory of the geographic base of a city, including streets, blocks, etc. This describes the address coding guide, but generally speaking, when the phrase GBF is used something more substantial is indicated. In fact, a GBF is the representation of the basic features of a map in computer-readable form, including rivers, railroads, and x-y coordinates as well as streets and blocks. This definition describes the GBF referred to in this presentation.

Approximately four and one half years ago, the City of Dallas Planning Department took the first step toward the construction of a GBF by re-tracting Dallas County in preparation for the 1970 census. Shortly thereafter, updating procedures were begun for the Metropolitan Map Series (MMS). The Address Coding Guide (ACG) was prepared in 1968-1969 and the coding of DIME files was completed in the summer of 1970. The next step in the process will be for the Bureau of the Census to return the digitized DIME file, thus giving the City of Dallas a workable GBF.

One difficulty encountered in the work done to date has been in working from the guidelines supplied by the Bureau of the Census. As work progressed with the updating of the MMS, we found that considerably more time was required by our staff than had been indicated in the initial guidelines. This was a result of the errors which existed in the maps supplied by the Census Bureau.

Another problem encountered by the Planning Department was in the acquisition of data from some of the smaller cities in the Dallas SMSA.
Additional effort on the part of these cities and more support from regional organizations, such as the North Central Texas Council of Governments, might resolve this problem in the future.

In the period between encoding of the DIME file and return of the digitized version from the Census Bureau, the City of Dallas has been able to use the Address Coding Guide in lieu of the GBF to meet geocoding needs. The Planning Department has on two occasions utilized OS ADMATCH and the ACG to geocode city data files. The first use was in assigning census tract, block numbers, and ZIP codes to the 1970 Dun and Bradstreet Business Establishments file. The other effort was in geocoding building permit records covering the period from 1950 to 1970. Both of these projects were very successful and accomplished in a few months at relatively low cost what an army of clerks would have taken much longer to do.

Beyond overcoming initial problems in mastering the capabilities of OS ADMATCH (the documentation was a preliminary release and, as such, very poor), the only problem encountered in the geocoding efforts was a lack of quality control in the ACG. We found information missing, inconsistencies and errors in spellings, invalid census tract numbers, unnecessary abbreviations, and a lack of building names, shopping centers, and airports in the ACG. Overall, however, the ACG was adequate for our purposes and, where possible, was corrected.

1 In addition to Mr. James Prince, Mr. Michael Kennedy assisted in preparing this report.

The Future of the GBF in Dallas

The city of Dallas is currently involved in the development of a new base mapping system. This system will provide maps for the entire city at a scale of 1" = 200' at the greatest possible level of accuracy. The information to be shown on the maps will consist of all streets, water features, natural barriers, railways, lot lines, and rights-of-way.
Our contracting company will digitize and store all information shown on the maps. The digitized data file will be updated on a weekly basis, as will the cronaflex master copies of the maps themselves. In this way the city of Dallas will always have a means of acquiring the most up-to-date map and/or printout of information available. In addition to the benefits derived by the city itself, other participants in our program, such as the water department and possibly one or two of the other utilities, will have access to the maps and to the digitized data file. Of course, any of the participants may retrieve from the file a computer printed map showing any combination of data.

Eventually, our data file will include all information from the city, plus that pertaining to the participating utility companies.

The most logical link between our system in Dallas and the DIME file of the Census Bureau can be found in the updating procedures. The city of Dallas will have a detailed and precise maintenance program for its own base mapping system. As long as our data is filed using the same geocodes, census tracts, etc., we will be working with compatible systems. In this way, Dallas can participate in keeping the Census Bureau's GBF up-to-date with little additional work on the city's part. The information which we maintain for our base maps can be used directly to update the GBF. We can therefore provide current data concerning streets, tracts, rights-of-way, zoning, utilities, etc. to any requesting agency.

In essence, Dallas will have its own brand of GBF based on its new base mapping system. Most of our needs will be satisfied by our own data file, while at the same time we could help maintain the Census Bureau's information file and make use of it, especially on a larger area basis.

This brings us to the subject of a regional GBF. Our local area would benefit greatly from a centrally based data file which would be maintained by all area communities. An agency, such as the North Central Texas Council of Governments, could maintain the GBF for this local area with help from participating local governments.

The uses which Dallas might make of the DIME file are numerous and probably limited only by funding and imagination. In connection with an information system, the GBF can be used to retrieve information by selected areas in the city through the specification of grid coordinates. The GBF will also allow distance and area measurements, the ability to list streets and address ranges in a given area (useful for zoning cases), and computer mapping through the use of SYMAP and GRIDS, the recent Census Bureau product for which Dallas has become a test city. All these possible uses are in addition to the "bread and butter" capability for address matching and geocoding.

GBF Developmental Considerations

A number of questions concerning the GBF and the city of Dallas have arisen within our city offices. A few of these warrant some discussion here.

1. How should we as a city planning group treat the problem of inaccuracy of the Metropolitan Map Series? The MMS maps are not sufficiently accurate for our particular needs. Our new base mapping system will be much more accurate and will satisfy our needs for planning and operational tasks.

2. As long as we supply the Census Bureau with updated information, does it matter to the Bureau whether or not we update the MMS maps? If the MMS maps are not used by Dallas, should the city be expected to spend time and money to update them? We will already have more accurate maps of our own.

3. Will the GBF be useful to information systems being developed by the city of Dallas? The city is developing an information system to be based either on the census block level or on the parcel level. The GBF would be workable either way.
Our new base mapping system will be digitized at the parcel level.

4. Will the Council of Governments assume the responsibility of maintaining and updating a GBF for the North Central Texas region? Such an arrangement would provide ready access to area information and would simplify the participation of cities in the maintenance program.

5. Will the participating cities and the Council of Governments receive any financial aid to defray the costs of updating the GBF for the Census Bureau? Funding will no doubt be one of the major questions to be resolved in relation to establishing a maintenance and updating program for the GBF in this region.

I would like to say at this point that the preceding questions and comments are not meant to imply that Dallas is not interested in the maintenance program, merely that we have had questions along these lines regarding means, uses and finances. The answers to these questions, I feel, are needed before we can fully evaluate our particular situation. Once we have answers to our questions, I am sure we can do a better job of answering the Census Bureau's questions.

Mr. William Parker 1 - USE OF THE DALLAS AND FORT WORTH GEOGRAPHIC BASE FILES IN THE REGIONAL TRANSPORTATION STUDY

In cooperation with the cities of Dallas and Fort Worth, the North Central Texas Council of Governments 2 (NCTCOG) has undertaken a Regional Public Transportation Study to develop a comprehensive public transportation systems framework for the region which will insure a public transportation system incorporating alternative modes of travel. The area coverage of this study coincides with that of the Dallas-Fort Worth Regional Transportation Study conducted by the Texas Highway Department from 1964 to 1967. This area includes 2600 square miles, covering all of Dallas and Tarrant Counties and parts of seven contiguous counties.
As part of the study, subregional studies are being conducted for the central business districts of Dallas and Fort Worth to prepare plans for public transportation subsystems within these areas. Another subregional study will prepare plans for the area between Dallas and Fort Worth, known locally as the mid-cities area, to provide for movement to and from the regional airport which is now under construction.

Although the primary objective of the Regional Study is to seek alternatives to an auto-only transportation system for the North Central Texas region, another important objective is the development of a transportation information system that will permit continual evaluation. Governmental entities in the region will be the data sources for this effort; required types of data will include land use, population and employment estimates, socioeconomic conditions, travel patterns and data on transit operations.

1 In addition to Mr. William Parker, Mrs. Barbara Kaplan and Miss Gloria Eyres assisted in preparing this report.

2 The North Central Texas Council of Governments region is composed of 11 counties with a population of over 2,000,000. The Council was established in 1966 and has become one of the most active in the State, with projects being developed in over 20 program areas. The major cities of the region are Dallas and Fort Worth.

An important part of the Regional Transportation Study involves acquiring and analyzing socioeconomic data and trip data so that future travel patterns can be predicted. The Geographic Base (DIME) files provided by the cities of Fort Worth and Dallas are being used in the analysis of employment data on which to base estimates of trip generation. The study area was divided into approximately 500 regional analysis areas for the purposes of the Transportation Study, and the goal in employing the DIME file is to assign employment data in 16 SIC categories to regional analysis areas.
In developing these regional analysis areas, the transportation planners at the Council of Governments sought to establish areas that would be coterminous with census tracts and Highway Department traffic survey zones, and that would also correspond with city boundaries, major thoroughfares, and natural boundaries wherever possible. Traffic survey zones, of which there are some 6000 in the study, were originally established by the Highway Department for the 1964 Transportation Study; new construction and other changes sometimes made it impossible to aggregate these zones to satisfactory regional analysis areas. For example, the LBJ Freeway in Dallas did not exist in 1964 when the traffic zones were drawn. As a result, traffic zones tend to zigzag across LBJ. In this instance, regional analysis areas were constructed based on census tracts following LBJ alone. However, in most cases, regional analysis areas are aggregates of or coincide with traffic zones and census tracts, and in all cases combinations of two or three analysis areas are contiguous with tracts and traffic zones. This allows the aggregation of tracts and traffic zones to regional analysis areas for checking purposes.

Data and Reference Files

The primary data sources used in establishing employment figures for the study have been the accounts of the Texas Employment Commission (TEC) and United Fund files. Dun and Bradstreet files serve as backup for TEC files. The employment accounts maintained by the Texas Employment Commission comprise two files: (1) account number, firm name, and employment data; and (2) account number and address. Both files were obtained for the months of January, February, and March 1970 for comparison with data from the same period in 1964 which was used in the study conducted by the Highway Department.

A problem in the use of the TEC files immediately became apparent.
The addresses maintained in these files are not necessarily the same as the street addresses of the reporting firms. A listed address may be the address of whoever mails the employment reports to TEC (for example, an accountant); or, the available address may be for the headquarters of a firm with multiple locations. Also, TEC data does not cover employers with fewer than four employees and does not include public agencies and nonprofit organizations. Further, TEC may maintain one account listing no matter how many locations a firm has so that, for example, there is only one listing for "Buddies" supermarkets even though there are 38 such markets in Tarrant County.

The transportation planners first hoped to make TEC a reference file for the project. Had this been possible, only the businesses not reported with TEC would have had to be researched and added to the TEC files. However, this approach was not workable because account and addressing ambiguities prevented the files from being manipulated in the necessary ways.

The planners thus went elsewhere. They turned to Southwestern Bell and General Telephone as the best sources of the most complete listing of employers in the study area. The combined files from both telephone companies became the reference file for the study. Numerous problems also arose in using this file. Addresses had to be corrected in cases of post office boxes, rural routes, building names, etc. Records for multiple locations had to be added and other records deleted. However, this file of 70,000 records was more flexible than the TEC records and, therefore, better suited to the needs of the Transportation Study. The file has been updated once and is now ready for the second update which will produce the list of employers to be used in the final report.

Employment data from TEC files is being added to Southwestern Bell records manually.
In cases of multiple locations, TEC figures have been split and distributed to the multiple locations. In cases of fewer than 50 employees, employees have been divided equally among the firm locations; when there are more than 50 employees, individual decisions have been made by the study directors, and often the company has been called to obtain an employee breakdown. In some cases it has been necessary to aggregate TEC data because some firms with multiple locations have reported more than one SIC classification, each of which could be distributed to more than one location.

Dun and Bradstreet and United Fund tapes were acquired and used for backup and checks. Dun and Bradstreet tapes were used as a backup for the TEC tapes when the latter did not list the employer in question. Firms based outside the intensive study area may report employment from an outside central office; employment data for such firms would thus not be included in local TEC files. Sears is an example of a firm that does not report to either the Dallas or Tarrant County areas. Also, Dun and Bradstreet employment information is tabulated by individual address whereas TEC data may include total employment for all locations.

United Fund employment data were used to supply data on public agencies, nonprofit organizations, and the military. In cases where no employment data were available and no type of business could be determined--usually this occurred in cases in which there were fewer than four employees--a service SIC code was assigned and each firm was assigned two employees.

When corrections in the data files have been made, ADMATCH will be used with the DIME file to assign employers and employment data to regional analysis areas. Once this is accomplished, countywide employment totals by SIC codes will be checked against totals in County Business Patterns published by the Census Bureau.
An additional check is available in the yearly estimates of county employment totals made by the Texas Employment Commission. If county totals do not check, the totals in the regional analysis areas will be reviewed. Area totals found to be out of line for a particular SIC code will be adjusted by the addition of dummy records to add or subtract employment in that particular SIC code. At a later date adjustments will be made to individual employers.

In preparation for this final phase in establishing employment figures, the DIME file supplied by Dallas and Fort Worth has been edited and manipulated to meet the specialized needs of the study. We are working with a preliminary file that has not been edited or digitized and many errors were found. For the purposes of the transportation study, the DIME file was edited to delete nonstreet features and all streets with zero addresses. The file was also split into right and left records for use with ADMATCH, and in conjunction with this, an account number was generated for each record. Each number is equal to the record number of that record on the original DIME file and will allow location of records in the file. Each account number includes a digit that identifies the right and left sides of the records. 3

The file also had to be checked for the position of the high and low addresses and for mixed odd and even address ranges. Before the file was split, records were adjusted to show proper odd and even addresses by adding to or subtracting from the given addresses. Records that were too badly mixed were rejected.

Use of ADMATCH

ADMATCH was used to preprocess the DIME file, and in accomplishing this many typical problems arose because of differences in what ADMATCH had been told to expect and what was actually on the file. For example, Arlington has a street known as Avenue H East. ADMATCH had been told that "H" meant "highway" so with the logic we had given it, "Avenue" was read as the street name.
Of course, ADMATCH can be programmed to recognize this case, but at the expense of not recognizing "H" as "highway." In the case of Avenue H, businesses with addresses on this street will be assigned to analysis areas manually. We have found that although ADMATCH is a very flexible package, it still will not do many of the things we need to have done.

ADMATCH has been used twice with our Southwestern Bell data. The first pass was at a 99 match level and was made before we had corrected many of the problems we knew existed in our data files. At this level 33,000 of 53,000 records were accepted for a match rate of 62 percent. The second pass was at a 97 match level after some corrections had been made in the files. This level of match provided much more realistic acceptance figures--49,000 of 63,000 records were accepted for a 77 percent match--and will probably be the level chosen for our final match of employment data to regional analysis areas.

Records with zero street addresses that were rejected when the file was edited either do not have street addresses or the addresses were not coded. Businesses in these locations either have been assigned an address or it is known that addresses do not exist. These businesses will not match with a record in the file and will, therefore, be rejected. Records which are rejected during the matching process will be run through a program which will be written to match addresses in specified ranges or particular account numbers to given analysis areas. The remaining businesses will be manually assigned to analysis areas.

3 Before editing there was a total of 233,970 split records. Of these, over 13 percent were dropped because of zero addresses.

Most of the computer work required for this effort is being done on an IBM 360/40 at the city of Dallas and the data services personnel at that city are doing most of our programming.
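The DIME file edits described in this paper (dropping nonstreet features and zero-address ranges, splitting each record into left and right sides, and generating an account number that carries a side digit) might be sketched as follows. This is only an illustration in Python, not the programming done at the city of Dallas. The field names, the side digits 1 and 2, the account number layout (record number times ten plus the side digit), the parity adjustment, and the test for a "badly mixed" record are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

LEFT, RIGHT = 1, 2  # hypothetical side digits appended to the record number


@dataclass
class Segment:
    record_no: int          # position of the record on the original DIME file
    street: str             # empty for nonstreet features (assumed convention)
    left: Tuple[int, int]   # (low, high) address range on the left side
    right: Tuple[int, int]  # (low, high) address range on the right side


@dataclass
class SideRecord:
    account_no: int
    street: str
    low: int
    high: int


def normalize(rng: Tuple[int, int]) -> Optional[Tuple[int, int]]:
    """Order a range low-to-high and give both ends the same parity.

    Returns None when either end is zero, i.e. the range was not coded.
    """
    low, high = sorted(rng)
    if low == 0:
        return None
    if low % 2 != high % 2:
        high -= 1  # assumed adjustment: pull the high end in by one
    return low, high


def split_sides(segments: List[Segment]) -> List[SideRecord]:
    """Split each street segment into left and right single-side records."""
    out: List[SideRecord] = []
    for seg in segments:
        if not seg.street:
            continue  # nonstreet feature (river, railroad, boundary)
        left, right = normalize(seg.left), normalize(seg.right)
        # Assumed rejection rule: both sides odd or both even is "badly mixed".
        if left and right and left[0] % 2 == right[0] % 2:
            continue
        for digit, rng in ((LEFT, left), (RIGHT, right)):
            if rng is not None:
                account_no = seg.record_no * 10 + digit
                out.append(SideRecord(account_no, seg.street, *rng))
    return out
```

Because the account number embeds the original record number, a rejected or unmatched side record can be traced back to its source record with an integer division by ten, which mirrors the purpose the paper gives for the account number.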
Maintenance and Updating

The Transportation Study found errors and gaps in both the Dallas and Fort Worth DIME files. We would like to have had these files corrected and updated before beginning our study, but the cities did not have the right people available at the right times. Therefore, we had to work with preliminary versions of both files. We have edited the file to remove incorrect records. All businesses in these locations will be manually located in an analysis area.

A major problem in maintaining and updating a DIME file is establishing a standardized reporting system for changes. Many smaller cities may not have updated maps for use in maintaining the GBF. Even in larger cities that attempt to keep current records, it may be difficult to update the file because of poor communications and lack of coordination. In some cases, developers may build new streets without properly reporting this construction.

One possible approach to the problem of update and maintenance is the use of aerial photography. By February 1972 the Regional Transportation Study will have available a series of aerial photographs of the North Central Texas region. These photos will be printed at the same scale (1" = 800') as the Census Bureau's Metropolitan Map Series, and it will thus be possible to overlay the aerial photos with transparencies of the Metropolitan Map Series in order to compare street features. This should make it possible to discern new streets, additions to previously existing streets, and any other alterations in the street pattern that will have occurred in the period between the preparation of the Metropolitan Map Series and the taking of the aerial photographs.

Once these changes in street features have been determined, the next step would be to contact each of the cities where changes had occurred in order to gather information on street names and address ranges. Following this, node numbers could be assigned as necessary.
This procedure could be followed on an annual basis since a new set of aerial photographs will be available each year as part of the data needed for the Regional Transportation Study.

Of course, this aerial photography approach to updating and maintenance is not problem-free. Many of the same difficulties resulting from poor record-keeping practices which currently complicate the updating process will not be cured by the use of aerial photographs. Nevertheless, this approach does hold promise because it should obviate the need for costly and time-consuming field visits; it will eliminate the "paper street" problem; and it will insure that the alignment of new street features is accurately recorded on the Metropolitan Map Series. In any case, the aerial photographs of the North Central Texas region will be prepared because they are necessary for the Regional Transportation Study, and their use could considerably reduce the costs of updating and maintaining the DIME file for the Dallas-Fort Worth area.

Future Plans and Uses for the DIME File

For the purposes of the Regional Transportation Study, we would certainly like to see the DIME file maintained and updated. As a major user of the file in this region, the Transportation Planning Department of the North Central Texas Council of Governments views the DIME file as an integral part of the data base and information system which is essential to the preparation of comprehensive long-range transportation plans and programs. We realize, however, that the use of the file should not be limited to the Transportation Department of the Council of Governments. The DIME file could and should be a means for providing geographic aggregations of information to the municipal decision-making process as well as to the planning efforts of those municipalities. For all these purposes the file must be updated and maintained on a periodic basis.
Since the DIME file and the other files in the information system will be used in the preparation of substantive studies, plans, and programs, additional funding for these projects should be sought to cover the cost of updating and maintenance. If this approach does not prove to be successful, then alternative means of financing will have to be explored.

In any event, we do regard DIME file maintenance as a priority activity; and we feel that at least in this region, the Council of Governments as the regional planning agency is in the best financial and organizational position to coordinate its updating and maintenance. However, the participation of local governments in the region will be essential, and to this end a series of meetings with representatives from the Dallas and Fort Worth City Planning Departments and COG Staff is contemplated in the near future to discuss plans for cooperation and coordination of regional efforts toward updating and maintenance of the DIME file.

Question Period

Dr. Stevens--Before we get into some questions, I was wondering if Mr. Meyer would like to treat some of the questions that Mr. Prince has raised regarding the Census Bureau's involvement?

Mr. Meyer--The first question was, "How should we as a city planning group treat the problem of inaccuracy of the Metropolitan Map Series, recognizing that the MMS maps are not sufficiently accurate for our particular needs?" It was also noted that the city's new base mapping system will be much more accurate and will satisfy the needs for planning and operational tasks. The second question, which also relates to the maps, is "As long as we are giving the Bureau updated information do we also need to update the MMS maps?"

Briefly, the answer to both of these questions is, "Yes." I say this not because of the Census Bureau's interest, but because of what I conceive to be the local area's interest.
If you think about it a moment, you will realize that the MMS maps are the geographic base which defines the areas for which the Census Bureau collects, tabulates and publishes statistical information. If the maps are in error, the statistics which the Census Bureau publishes will repeat the error. For example, if tract boundaries are delineated incorrectly, the statistics provided will relate to the area defined by the erroneous tract boundary. They will not relate to the data being produced locally which, purportedly, would cover the same area.

It should also be noted that the MMS maps are designed only for "statistical" accuracy. If "engineering" accuracy is required, they will not serve the purpose. The same point may be made regarding the coordinates assigned to the map nodes. They will permit a "statistically" satisfactory computer representation of the map. They were not intended to do more.

I would also argue that in the area of Geographic Base Files, uniformity and compatibility across the board - certainly as regards the basic items of information and the use of "standard" computer routines - are a virtue. It would seem to me that an area can only do itself a disservice if it isolates itself from the mainstream of activity.

In addition to the above points, the Census Bureau is also beginning to utilize the GBF system "in house" for a wide variety of surveys. The forthcoming economic censuses, covering calendar year 1972, will use the GBF to code the business and industrial establishments to geographic location. As a result, it now becomes possible to publish economic census data by tract, and the Bureau is planning such a publication for cities of 500,000 or more population. For the smaller tracted cities (in coding guide areas), although codes will be assigned down to the tract level, the data will not be published.
Nevertheless the data is available, and within the limits of disclosure, it could be produced at cost in the form of unpublished data. In fact, it might even be possible to tabulate the data for selected geographic areas as specified by the city. But, I think I should add that "tailor-made" tabulations of this type are quite expensive.

I think I have probably extended my reply to answer the third question also, which is "Will the GBF be useful to information systems being developed by the city of Dallas?" In summary and to point out the obvious, census data, whether derived from the decennial, economic, or agriculture censuses, or from special surveys, are available from no other source. If you hope to tie even some small part of this vast wealth of information into your local system, the common key and future "open sesame" are the GBF files.

Mr. Johnston--I would like to ask this question about the information system that Dallas is preparing based on the 1" = 200' maps. Are you planning to digitize the centroids or the boundaries of the parcels? In either case, how do you plan to update parcel splits and maintain historical records on those? Do you have any ideas as to what will be the full cost of this operation?

Mr. Prince--I do not know that I can answer every portion of the question. Our updating procedures for our new base maps will consist of working directly with the subdivision and tax offices to record daily and weekly changes concerning individual lot lines. We are not trying to record on our base map the update of property ownership. While this will be done for the city's own records, the digitized information will be concerned primarily with lot lines and the changes which are necessary for repairing the daily subdivision operation. We do have a preliminary cost figure on this which could run in the neighborhood of $40,000 per year.
This figure reflects costs related to the mapping system, and there will be other operating costs incurred by the city not directly related to the mapping system. Our updating procedure has not been finalized; therefore, we do not yet have a firm cost figure.

Mr. Treichel--I have two questions, one for Mr. Prince and one for Mr. Parker. Mr. Prince, you mentioned that the utilities would play a part in using the GBF when you finally get it in shape. Do you foresee the distribution systems of utilities - sewer and water lines, power grids, etc. - becoming part of your files in a digitized fashion?

Mr. Prince--Yes. Right now what we have tried to do is get the utilities involved in constructing the base map, primarily from a financial standpoint. All base map data will be digitized initially. As additional users enter into the maintenance program, they will have their particular data files digitized. This includes the major utility companies in Dallas.

Mr. Treichel--My question for Mr. Parker is based on the sensitivity of employment data. You are collecting information that is in some cases extremely sensitive; that is, sensitive in terms of publication. What problems have you run into with that, and how are you protecting the sensitivity?

Mr. Parker--There definitely are disclosure restrictions on much of the information that we are using. The data itself will be used primarily to give us a picture of the distribution of employment in order to develop work trip patterns for the region. No employment figures will be published, so there should be no disclosure problems.

Dr. Aangeenbrug--I have several questions. The first is for Mr. Prince, with regard to the accuracy of the 1" = 200' series maps. What order of monumentation exists or is planned; and second, what is the cost of this mapping project and who is paying?
I am a bit afraid that you are leading a lot of users astray as to the mecca of accuracy that is coming, when, in fact, the 1" = 200' maps may never be used by a civil engineer.

Mr. Prince--I might preface the answer by saying that we are working with the contractors to provide us with an orthophoto base mapping system. They are going to use for control the existing USGS control points. They will establish, as needed, additional control points throughout the city and will monument these within the cost of the total project.

Dr. Aangeenbrug--What level of order are we talking about in terms of monumentation?

Mr. Prince--This is second order. You asked about the total cost of the project. Initially we are involved in contracting for mapping of 1/3 of the city, which covers the central business district and the immediate surrounding area, basically a 64,000-acre area. We will have the option to obtain the maps of the rest of the city as needed and as money becomes available. The cost of the initial 1/3 is estimated at $140,000. The total cost of the entire city with the options runs slightly over $300,000. These figures include all digitizing.

Mr. Renshaw--I am interested in Mr. Parker's success in developing employment data from three files. We have always had difficulty working with different sources and resolving problems of different definitions of what constitutes a "firm."

Mr. Parker--We have had a tremendous number of problems with that. First of all, this information is equated in all three files - TEC, Dun and Bradstreet (D and B), and United Fund - on the basis of employer name and SIC classification. In one file an employer will report a certain number of employees under one SIC classification. This number of employees and the SIC classification may vary in another file for the same employer. To handle the differences we set up a priority system. For employers with only one location, we used the listing in TEC. If it was not listed there, we looked for it in D and B.
Employers with multiple locations were more difficult. We tried to use both TEC and D and B to get accurate information for all locations. If we were not able to do that, then we called the employer for the information. It was really quite a tedious process.

USE OF A GEOGRAPHIC BASE FILE IN RELATING HOUSING INFORMATION

Dr. Endfred J. Lundberg

I. Background

The Institute for Urban Information Systems of the University of Cincinnati has developed several large files of local data for the purpose of analyzing the process of "community change." Most of the files are dynamic and self-renewing through data flows established in several research projects. The files are interrelated through geographic identifiers, the most useful of which to the Institute is the street address. For the time being, data relationships are being created only for census tracts and political subdivisions.

At the time the Institute decided to tie the data files together spatially, it was necessary to create our own local address coding guide (1967). The guide was developed by drawing street names and number ranges between intersections from the customer account records of the Cincinnati Gas and Electric Company, relying primarily on the addresses of electric service accounts. To this original file were added the street name and number ranges available from the Engineering Offices of the City of Cincinnati and Hamilton County, Ohio. Addresses of new and changed construction as well as demolitions were drawn from a building permit application system which covers 67 jurisdictions in the Cincinnati metropolitan area. The initial ACG enjoyed completely controlled input of street name spellings and numbers.

The development of this homemade ACG was a harrowing experience. Due to the mediocre computer facilities at the University at that time (since vastly improved), it was necessary to do all address matching and records searching by tape, which is a tedious and frustrating procedure.
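The matching of a street address against the street name and number-range records of an address coding guide, as described above, can be sketched in a few lines of Python. This is only an illustrative sketch, not the SAMI or ADMATCH programs: the record layout, the parity code, the suffix table, and the sample street and tract numbers are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Hypothetical suffix table; a real guide would need a much longer one.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "DRIVE": "DR"}


@dataclass(frozen=True)
class RangeRecord:
    """One guide record: a street, a number range, and its census tract."""
    street: str   # normalized street name, e.g. "MAIN ST"
    low: int      # lowest house number in the range
    high: int     # highest house number in the range
    parity: str   # "E" even side, "O" odd side, "B" both (assumed coding)
    tract: str    # census tract assigned to this range


def normalize_street(name: str) -> str:
    """Upper-case the name, drop periods, and abbreviate the final suffix."""
    words = name.upper().replace(".", "").split()
    if words:
        words[-1] = SUFFIXES.get(words[-1], words[-1])
    return " ".join(words)


def match_address(number: int, street: str,
                  guide: Iterable[RangeRecord]) -> Optional[str]:
    """Return the tract of the first matching range, or None if unmatched."""
    key = normalize_street(street)
    for rec in guide:
        if rec.street != key or not (rec.low <= number <= rec.high):
            continue
        if rec.parity == "E" and number % 2 != 0:
            continue
        if rec.parity == "O" and number % 2 == 0:
            continue
        return rec.tract
    return None


def error_rate(addresses: List[Tuple[int, str]],
               guide: List[RangeRecord]) -> float:
    """Share of addresses that could not be assigned a tract."""
    if not addresses:
        return 0.0
    misses = sum(1 for number, street in addresses
                 if match_address(number, street, guide) is None)
    return misses / len(addresses)


# Hypothetical two-record guide: even and odd sides fall in different tracts.
GUIDE = [
    RangeRecord("MAIN ST", 100, 198, "E", "0001"),
    RangeRecord("MAIN ST", 101, 199, "O", "0002"),
]
```

Calling `match_address(120, "Main Street", GUIDE)` would look up the even side of the hypothetical street, and `error_rate` gives the unmatched share of a batch of addresses, which is the kind of figure quoted for the comparison runs.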
The first comparison run that was made produced a 40 percent error rate. Subsequently, the error rate was reduced to about 3 percent; a very recent run has produced a 0.8 percent error rate.

Originally, the structure of the local ACG was designed to be expandable by adding identifiers for school districts, police service areas, generalized hospital service areas, and other unusual but useful codes which normally would not be found on the Federal Address Coding Guide. To do the address matching (long before ADMATCH was released by the Bureau of the Census), a set of address-matching computer programs was written by the Institute. The tract-oriented index (ACG) and the package of computer programs were called SAMI, or Street Address Master Index.

We expect to use ADMATCH when its full operating system version is released for IBM equipment. It is not presently worth our trouble to convert the disc operating system version to fit our computer configuration. The updated master ACG is undergoing reformatting so that it will be usable with either the SAMI programs or the ADMATCH package. Both formats will be maintained by the Institute. The SAMI format has been used by several local organizations to tract their own internal files. It is anticipated that this kind of use will expand considerably in the next two years.

The Census ACG has been put under the control of our file management software (Generalized Information System/IBM) as a searchable, updatable base file. This arrangement will allow us to change the Address Coding Guide quite easily when the annual updates are received from the Cincinnati Gas and Electric Company and the building permit files. The DIME system will also be added to the Institute's repertoire of geocoding techniques after the current work in learning to use the ACG and maintaining it at a high level of accuracy has been completed.

II. Experiences with the Census ACG

When the ACG from the 1970 census was released for the Cincinnati SMSA, it was compared by computer with the local ACG, represented in the SAMI system. An oversight in programming caused a listout of 13,000 differences on a base file of 49,000 records. We discovered that the comparison program did not treat the suffixes from the two formats in the same way. This was corrected, and subsequent comparison runs created a true exception list which amounted to some 14,000 items: 7,000 records which were in one ACG file but not in the other, and 7,000 minor differences in similar records in both files.

Through a modest contract with the Community Renewal Program of the City of Cincinnati, the Institute hand checked all 49,000 records in the Federal ACG. The 7,000 name differences between SAMI and the Federal ACG were reconciled into one master list, to which the other 7,000 small items were added as corrected. About 2,400 other corrections to number ranges have yet to be made by hand. The new master ACG now has some 56,000 records reconciled to census tracts only. The other identifiers found in the Federal ACG, e.g., congressional districts, were neither checked nor corrected due to the anticipated high cost. Therefore, all data aggregations which must cover the entire SMSA by small areas are possible only at the tract level at present, using the updated ACG. We believe this will be sufficient for our analytical purposes for the next two years or so.

For those who might be interested in the equipment used at the University for the foregoing work, we have an IBM 360/65 with 750,000 bytes of fast core and 1 million bytes of slow core. There are three 2314 multiple disc units, considerable communications gear for remote terminals, and the usual peripheral equipment such as tape drives and printers. Major software in use includes full Operating System (O.S.), Multiple Variable Task Mode (M.V.T.), the Houston Automatic Spooling Procedures (HASP), Time Sharing Option (TSO), Conversational Remote Job Entry (CRJE V), and numerous special application packages. A larger computer configuration is on order.

III. ACG Updating

An especially good source of address changes and additions to the housing stock is available in the Cincinnati area for frequent updating of the master ACG street name and number list. This source is the Cincinnati Gas and Electric Company, which makes its customer records available each year for this work, as it did in constructing the local version of the ACG. New electric service connections, disconnects due to demolition, and changes due to alterations of buildings are readily identified through their records. Building permits also can supply some useful information in ACG updates. The next updating of the local ACG will take place in January, 1972. The cost is expected to be nominal.

IV. Using the ACG in a New Housing Information System

Numerous community organizations, public and private, have large files of data related to many aspects of housing. Local governments and other public agencies often can be persuaded to share data on housing, drawn from their regular data processing operations. Land parcel files, fire and building inspection activities, and building permit applications all provide useful data on housing activities in the community. When a variety of such files are searched and data items brought together and related, a new pool of information about many aspects of housing is created. In this subsystem project, we are bringing into participation a neglected but crucial group, the business community. Their data and their financial support for such a system are indispensable elements of any effort to identify and study housing conditions and developing trends.
In this connection, commerce and industry could make very great use of geographic base files, provided they have some initial assistance in becoming familiar with their technology and how they can be applied to internal data files. Not only can new information be developed by relating files between organizations, but individual agencies can derive new information from within their own files, information which must have been implicit but was not previously drawn out for actual use.

This new effort to establish a continuous and simplified flow of housing data rests upon four premises:

1. That local organizations can be persuaded to yield up in aggregate form (to tract level initially) housing data on the basis of self-interest in the larger information pool which would result.

2. That local and Federal funding sources can be developed to create the prototype subsystem and that the local sources will thereafter maintain the system at a reduced cost.

3. That the resulting new information on housing will present a much more detailed and useful understanding of what is taking place in the larger community (by small areas). This should make it possible for public housing programs and private investment opportunities to be better defined and developed, with greater benefits to the general public.

4. That such a subsystem for housing information will be unique, meeting a long-standing need for more complete and continuous housing data by local governments, semi-governmental agencies, local business interests and Federal departments. Part of this uniqueness would be the transferability of at least the concepts in the system, if not the entire package of data items and computer programs.

This proposal also implies an unusual amount of cooperation among a variety of agencies within one community. In the Cincinnati area such cooperation is not unusual, but is instead the norm.
Consequently, discussions with a variety of public and private organizations toward design of a housing information subsystem have met with very positive response, and work is proceeding on the basis that the challenge is mostly technical and educational, rather than one of overcoming any significant opposition to the idea.

It should be emphasized that approaches to private firms which have valuable housing data have been made mostly on the basis of self-interest. It appears that they see very large corporate benefits coming from information which they have not previously been able to acquire due to expense or technical difficulties. Local governments have long desired much improved housing information but have not been able to establish a continuous system for monitoring a housing inventory and change characteristics due to financial limitations and the sheer scope of the task.

Past experimental research on using indirect data indicators to monitor community changes has led us to the conclusion that the most useful or important aspects of housing can be followed adequately by blending together selected sets of housing data items in both straight counts and correlation models. While the inventory must obviously be a straight count, certain characteristics of change can be implied on a probabilistic basis, when inferred or implied from other ongoing activities which pertain to the housing stock but which are not openly related to it in daily operations.

A boundary problem inevitably occurs in that too many data items might become available, only a few of which would be really useful in identifying characteristics of change in housing. Shall the system collect every item that any potential user could think of, or should some rather arbitrarily selected subset of data items be picked out and used in the system design?
The latter course was chosen, but will be moderated by negotiating the peripheral data items which might be left out or in according to requirements of major participants.

Rather than present a collection of data items which undoubtedly will undergo change as the project progresses, it appears more useful to outline here the information output goals of the system, along with certain minimum qualifications:

1. An inventory of all residential housing units, by street address and tract number, updated at least annually with continuous inputs from permanent data sources. No field surveys would be used.

2. A set of descriptors of each kind of unit in terms of (a) human habitability and (b) marketability.

3. Identification of gaps in the housing stock in terms of needs and demand. Both must be geographically identifiable, hopefully going below the tract level in later refinements of the system.

4. Market activities in residential housing. This would include historical trend patterns by type of housing by geographic area, with some consideration of the impact of special phenomena such as new sewer or water systems, new transit or highway developments and general zoning policies of political subdivisions.

5. Annual survey of single family residences and apartment renters, focusing upon attitudes toward builders and owners, mortgage firms, public policies which affect housing, the general community, and several other important choice factors. This data would be acquired in a separate project, but added to the data pool on housing each year.

6. Population changes as related to housing. Population mobility and mix of characteristics will require additional models to relate to housing activities and choices.

7. A series of special reports on activities and conditions in housing to be forwarded to public and private agencies for program planning and investments, for transportation planning and for utilities planning and construction.
Schools, hospitals and many other organizations would benefit by early notice of changes in the housing stock, especially as influenced by population changes. It is anticipated that public policy can be better sensitized concerning housing needs of the poor and program opportunities to alleviate undesirable conditions in housing.

One of the early steps will be to create a basic inventory of the housing stock from the electric utility records, which contain a more refined identifier of each kind of residential unit than does the 1970 census. This will be checked for type and volume against the census tapes, by tract. (The Institute is a recognized summary tape distribution center.) Initial differences of significant magnitude between the counts will have to be reduced by investigation.

The condition of the housing stock will be determined by inference by correctable condition indicators from both the census tapes and the electric utilities files. More precise condition data will be available from the inspection records of the city of Cincinnati, but since the indicators and associated methodology must apply to most other areas in the SMSA which do not have continuous housing inspection programs, the more detailed data from the city of Cincinnati will be used only to confirm, rather than to specify, the housing conditions. It may be possible to persuade the electric utility to have its meter readers conduct an annual survey, with training provided through the Institute.

This brief outline of the work and goals involved in creating a housing information subsystem is only a statement of intent. While work has started on the subsystem, its ultimate success is not yet assured by any means. It was by coincidence that the housing information subsystems work began about the time of this particular conference. It is hoped that a positive report of accomplishment can be made at some future gathering.
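The tract-by-tract check of the utility-based inventory against the census counts, mentioned above, could be sketched as follows in Python. The 5 percent threshold, the function name, and the sample counts are assumptions for illustration only; the text says just that differences of "significant magnitude" will be investigated.

```python
from typing import Dict, Optional, Tuple

# (utility count, census count, relative difference or None)
Flag = Tuple[int, int, Optional[float]]


def flag_tract_differences(utility_counts: Dict[str, int],
                           census_counts: Dict[str, int],
                           threshold: float = 0.05) -> Dict[str, Flag]:
    """List tracts whose housing-unit counts differ by more than `threshold`.

    The relative difference is (utility - census) / census. Tracts with a
    census count of zero but a nonzero utility count are flagged with a
    relative difference of None, since no ratio can be computed.
    """
    flagged: Dict[str, Flag] = {}
    for tract in sorted(set(utility_counts) | set(census_counts)):
        utility = utility_counts.get(tract, 0)
        census = census_counts.get(tract, 0)
        if census == 0:
            if utility != 0:
                flagged[tract] = (utility, census, None)
            continue
        relative = (utility - census) / census
        if abs(relative) > threshold:
            flagged[tract] = (utility, census, relative)
    return flagged


# Hypothetical counts for three tracts.
UTILITY = {"0001": 1020, "0002": 900, "0003": 40}
CENSUS = {"0001": 1000, "0002": 1000}
FLAGGED = flag_tract_differences(UTILITY, CENSUS)
```

With these made-up figures, the first tract falls inside the tolerance, the second is flagged for a shortfall, and the third is flagged because it appears only in the utility records. The flagged list is the work queue for the investigation the text describes.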
In any event, this entire project would not be practicable without a good Address Coding Guide and a set of "file matching" programs. The ACG will be used to "tract out" the internal records of any data source which has an interest in, or contribution to make to, the housing information subsystem.

In many communities the 1970 census files, when used with the Federal ACG, are normally used to create a baseline in housing and its condition. Updating from that baseline has proven to be inordinately difficult, for many communities have tried it. It is our hope that a different approach to deriving the count and condition of housing in an SMSA can be achieved through bringing into involvement the business community, which heretofore has not usually been asked for data from its internal files. Another major change would be that certain characteristics of housing would be implied or inferred rather than determined by field survey. At the geographic level of census tracts such inferential models might well prove adequate as a basis for decision-making by public and private organizations, on the basis of what apparently is in existence in the community and how the community is changing, especially in this instance as relates to housing.

Question Period

Mrs. Kaplan--I would like to ask if the housing model will be able to predict demand by geographic areas or sub-regional areas? If it will accomplish this, then how are you going to handle the problem of areas where there is presently little or no development, without having housing demand projections serve as self-fulfilling prophecies?

Dr. Lundberg--I cannot answer your second question at all. I do not know. Your first question, as to identifying demand, I think can be a very straightforward thing based on other kinds of physical developments that are moving in certain geographic directions.
We are doing a small study in Northern Kentucky right now, determining where development is taking place; that in itself is difficult to find out. We are relating that and school populations to the 1970 census baseline data. We get the distinct impression that, as simple-minded and as old as it is, if you put a freeway out someplace, or a major road, or sewer or water systems, your development will inevitably traipse out there after it, partly conditioned by the topography.

What we are concerned about is in meeting demand on a market basis. It seems the builders are guessing. They are making some bad guesses, and they are leaving some terrible messes for local governments and school districts to deal with after the fact.

When you are speaking of need, defined by public policy, there is a tremendous fight going on in Cincinnati, as in other places, as to where they are going to put large-scale public housing. Well, maybe we will go to the Dayton Plan; fragment the public housing and spread it all over town. These seem to me to be rather arbitrary decisions, partly dictated by necessity. Where do you put your public facilities? Followed on an arbitrary basis by those who have the power to say, "We will put our money there" or "We will not put our money there." We have not identified any real clear pattern of thought because we do not know enough about how they make these decisions.

Mrs. Kaplan--How does this housing demand model account for certain nongeographic factors such as local community decisions to encourage or discourage development by enacting building codes and zoning laws that are either more or less restrictive than those of neighboring cities in a metropolitan area?

Dr. Lundberg--Are you asking about market demand?

Mrs. Kaplan--That is right.

Dr.
Lundberg--I think in the Cincinnati area we can rely quite a bit on past history because nearly every development trend line which we can identify is straight line and almost flat, almost 3 percent to 4 percent of the population, housing and school growth. That is not true all over the country, but Cincinnati has an unusually stable set of trend lines. While the creation of a very simple extension of current conditions represents a prediction, it may not hold in other communities. We are not experienced outside of our own community as far as trend lines are concerned. I am rarely willing to take a step beyond two to three years at the very longest in making such estimates.

Mr. Bieda--I am wondering if you could elaborate a little further on how you plan this housing condition inventory without a rather full field audit?

Dr. Lundberg--This is one of the things we are holding meetings on right now. We are not sure we can pull it off, but we are concluding we can because it has worked in some of our other projects using what we call unobtrusive measures. Many people have come to us and asked for data on housing conditions. We tell them to look in the census listouts to see if there are any toilet facilities in there or not. We also think there are indicators in the data that can be pulled out of the banks and the mortgage institutions. They go out and appraise a variety of homes. The turnover rate is not high enough that we can cover every building in Cincinnati this way in less than about 30 years. We think, however, that from a variety of sources we can get at the general condition of housing, by type, at the tract level.

We have not identified clearly how it happens, but we know the mortgage institutions begin to "red line" certain areas and refuse to put money into them. They are the first ones to make that decision; the first ones to begin cutting back on something that may speed up the deterioration process.
We are trying to find out what information they are basing it on; they claim it is something out of their files. We used to think it was an exchange of opinion over cocktails at lunch, "No, we are not putting money out in Avondale anymore, it's kind of risky." It turns out that fire and insurance companies have something to say about it, and police records and crime rates have something to do with it. We are not sure, but we intend to find out.

Mr. Bieda--How, then, does this relate to the housing condition inventory? For example, you have mentioned the census tract, but you want to get down to block face.

Dr. Lundberg--That is right.

Mr. Bieda--The best you are obtaining from them is summarized information; i.e., you are not going to have disclosure problems. How do you plan to extrapolate down the line to that block face level?

Dr. Lundberg--Again, on the basis of other projects where we deal with very sensitive data, we have a social services information system that deals with data, person by person, family by family. We started at the tract level and found that, little by little, these organizations began to trust us; maybe that is a unique situation that cannot be duplicated. They have trusted us to the point now where we can pull anything right down to an actual family and ask questions about its case. They permit us because they know it will not go beyond us. We think the same degree of trust will evolve in our relationships with these business groups. We already have some tapes of data which we probably should not have, and we are experimenting with those.

Mr. Renshaw--I am very interested in your comments on the role of private business in developing and using geographic base files. I would also like to have you comment a little further on some of the problems you had in areas using private sources for developing and maintaining some of these files.

Dr.
Lundberg--Do I understand your question correctly, as to what techniques have been used to gain cooperation with recalcitrant areas?

Mr. Renshaw--Yes. What techniques and what problems do you expect when you develop your files at block level? Also, do you think other cities could expect to achieve your level of success and cooperation?

Dr. Lundberg--I suggest there are two criteria that we have found mandatory in our situation. Number one, they have to know you individually as a person and trust you. It makes no difference whom you represent. It is a one-to-one relationship, and this is slow to develop. Secondly, they have to see a hard information pay-off that is of real utility. Rarely are you dealing with the president of the company; you are dealing with somebody further down the line who has to show a company benefit in hard terms as a result of his involvement.

We try to create the trust first and then show the benefits second. If they can trust us long enough over the development of one of these systems, then they can live with the ups and downs, the delays and the uncertainties; together we develop a working relationship.

I know it sounds rather subjective and vague, but we find that this substitutes for the formal committees where you sit around brainstorming and establish a formal group that perhaps produces nothing useful. We sit down with business representatives, with school district representatives, etc., and we help build up a network of relationships.

Oftentimes we will go to formally called meetings with representatives from 10 different organizations, and it is like old home week; about 50 percent of the people are the same ones on all these different groups. The network of relationships begins to generate an ease of accomplishment. We have gone to the school board, for example, and asked them for certain data that publicly they cannot announce they have, but privately they will give it to us because we will use it with discretion.
I am not trying to claim great virtue; it is just a simple matter of trust and good sense. The companies will open their data provided they trust us. There is no contract agreement in the world that we are aware of that can force us to keep information truly confidential. We think the best way for them to keep us honest is to be able to shut off the data flow instantly, anytime they feel like it, for whatever reason. Consequently, we "walk on eggs" and listen very carefully, and we try hard to meet their needs.

A third point that might be very useful is that we do this as quietly as possible. Rarely does anything hit the press; if it does, it is announced as a largely finished product, and we give as much of the credit as possible to the local agency. The Institute is behind the scenes as much as possible, out of sight. Consequently we read frequently of things in the paper that have been done by a community organization, but much of the technical work was done by the Institute.

Another point is that we inevitably have to invest much time in the teaching and training functions. We just sit down and explain each point over and over again. We answer questions and we put in a lot of birddogging and "hand holding" time. We think that is indispensable.

Once they find out they are going to gain something, and it will not cost them anything in public, that you are not going to suddenly up and run away because your consulting firm has disappeared, and that you are going to be here for a long time, they are willing to work with you. They see a long term possibility, and a short term pay-off as well. Again, I do not know if it will work in other communities, but we have evolved these requirements over a period of years.

Mr. Amsterdam--I wonder if you could comment on the involvement of the Cincinnati government and the other local governments in your work. How much interest have they given, and how much have they participated in the work that you are doing?

Dr.
Lundberg--Here again, it is not a matter of formal agreements of participation and cooperation. They are quietly and informally structured. The city and the county have a unique and interesting situation where they share a single computer center. I think they are very intelligently trying to eliminate information overlaps and duplications. In the process of their growing up in data processing and developing their systems, we were coming along about the same time with ours. We have had our ups and downs on various points, but by and large, there has been a reasonably good relationship between the leadership there and ours at the University. I am able to call down there now, for example, and ask for certain kinds of data items. I have to submit a formal request, and it goes to their Control Board. We have only been turned down once and that was due to a legal technicality.

When they were designing data formats, such as for their land parcel file for the county and for the city's housing inspection program, they were made street addressable, in a form useful to us as well. They have gone to a lot of sweat and labor to make every land parcel have a geographic location street address, even if it is a pseudo-address. We gave them a copy of the ACG we updated, and they use the same guide. This kind of cooperation is wonderful.

USE OF A GEOGRAPHIC BASE FILE IN A HEALTH INFORMATION SYSTEM

Richard M. Levy

Introduction

We in Kansas City have been concerned for the past several years with a total health information system; not just for the sake of an information system, but for the sake of making program decisions.

I am not going to describe how we in Kansas City created the original Address Coding Guide and later the Geographic Base (DIME) File or the GEOPLAN system on which this discussion will be based.
First, I do not have all of the background information; and second, I think, based on some of the things said this morning, that we need to concentrate more on what we are doing today, some actual uses of the DIME file for problem solving.

I get a little discouraged when I attend conferences. Usually two-thirds of the discussion is concentrated on what we are going to do in the future. Too often we lose sight of the fact that even if we are not 100 percent where we want to be, there are some actual uses we can make of the programs or of the data at the present time.

GEOPLANS was initially developed in 1967 by the Metropolitan Planning Commission - Kansas City Region because they did not know when the Census Bureau's Address Coding Guide would be prepared and made available. Therefore, the Commission developed its own geographic coding file and related systems. A description of the early GEOPLANS was presented by Mr. L. Dale Sherman in your first conference in Wichita, Kansas.*

About 1969, the company I am representing was formed, and several members of the metropolitan planning staff became the initial members of this organization. With us, we transferred the idea of the GEOPLAN system and began relating it to public information systems and management consulting.

* Bureau of the Census, Use of Address Coding in Geographic Coding: Case Studies, Conference Proceedings, Wichita, Kansas, November 19-20, 1970.

One of the areas we concentrated on was that of health information systems. There were several questions that we, ourselves, raised in the very beginning before we began doing any work. One question was why do we want to design, develop and implement a computer based health information system that would utilize a system of geographic coding - GEOPLANS - as the nucleus of the system? It was going to cost a great deal to develop, and it would take at least two years to complete.

Kansas City was fortunate in that they always had a fair fix on what was happening.
They knew they had "X" number of births and "X" number of deaths. They even knew how many deaths were due to infective and parasitic diseases. They knew the number of people living in the city by census tract, and they knew from Census Bureau information about race and sex of the population down to the block. They knew the number of housing inspections that the Health Department was making; the number of weed and trash inspections; the number of TB and VD cases in the city; the number of nursing visits; and that there were 4-1/2 pounds of pollutants for every man, woman and child, deposited 365 days a year in the atmosphere.

The question we then raised was "Why can't we set program priorities on the basis of these statistics?" We had a fair idea of the answer; but in terms of outlining it in the original process we were going through, we felt we needed to go into it a little deeper.

One reason is that we did not know where the mothers of the 8,000 plus births lived or where they went for their hospital care; or where the dead had resided or where they passed away; where the increases and decreases in population were taking place; where the substandard homes were located; where the 18.3 active T.B. cases per 100,000 came from; and where the 4,000 reported V.D. cases emanated from. The missing common denominator in all cases is a geographic identifier, such as street address, census tract, block or ZIP code area.

In addition, Kansas City was probably plagued more than many other cities because, while it was only 27th in population nationally with approximately 500,000 people, it was about fourth or fifth in the country in terms of area (417.3 square miles). That means that the poverty pockets, that is, the pockets of problems, were by far more dispersed than in many other places. If you are talking about setting program priorities, you cannot just say, "Here is a centralized place. Everybody that needs its services come to it."

The answer to some of these seemingly enigmatic questions was originally to be found in the Department of Housing and Urban Development's 701 funding of an Address Coding Guide through the Metropolitan Planning Commission in Kansas City. This system was later finished, enhanced and implemented by Information Systems Development, Inc. (ISD, Inc.) for the Health Department. The system now consists of over 80 COBOL and FORTRAN programs which have the ability to:

1. Take all geographically referenced data
2. Geographically place and link this data
3. Provide common file management, maintenance and retrieval of this data
4. Provide a wide range of several types of computer graphic displays of the data and
5. Provide detailed statistical analysis capability such as complex regression analysis or multi-contingency correlation analysis or cluster analysis. (All of these statistical packages were developed in conjunction with the GEOPLANS package.)

It is obvious that simply having a very sophisticated computer based data-linking and planning system does not, in itself, create a Health Information System or even answer some of the aforementioned questions. However, for the past two years ISD, Inc. has been working with the Health Department of Kansas City, Missouri, and with the entire health community in the Kansas City metropolitan area to develop a community-wide Health Information System that utilizes secondary data as its primary input. We have worked with all sections of the Health Department, looking at how they are performing their tasks, what forms they are using, what their objectives, policies and procedures are, etc.

One word of caution: integrated health information systems do not happen overnight.
Even with all of the sophisticated software and hardware which is available today, a tremendous amount of effort and time must still be spent in educating people to the fact that any health information system, no matter how large or small, is geared to provide better overall service through the use of man's creative thinking.

About the same time period, 1969, Dr. Edwin O. Wicks accepted a new challenge and moved from State Health Director of New Mexico to Health Director of Kansas City, Missouri. When he arrived, he raised the same questions that we had been discussing, "What is happening and where is it happening?" We had the "what," but we did not have the "where" too well.

One of the first things that Dr. Wicks did that was a concrete step towards what he wanted in terms of an information system and a health program was to apply for 314B funds for neighborhood family health planning centers. Prior to that, most of the health programs in Kansas City, such as TB and VD screening programs, had been offered at centralized locations. I do not know how many of you are familiar with Kansas City public transportation, but it is terrible. Those persons that did not have cars or lived far away simply were not utilizing the services.

Dr. Wicks decided, therefore, that services should not be offered only where they were convenient to the Health Department, but that they should be available at the neighborhood level where they are convenient to the people with the problems that need them. The task then became, "Where are the problems?"

We also realized that it was going to cost money and that this system could not be developed with a small budget. At that point, the Council on Aging in Washington authorized a grant for Kansas City, a city-county grant, that really had Federal, State, county and city funds in it to develop a Planning Programming Budgeting System (PPBS) for the elderly.
They simply wanted to take a good look at all programs (public, private, governmental, non-governmental) that had any impact at all on the elderly, to determine whether a coordinated approach to facilities planning, to program planning, etc., could be developed.

The thing that Dr. Wicks came up with early in the game was that most health programs not only affect the elderly, but also affect the rest of the population. As a result of this, he let us examine in detail some of the other areas of the Health Department. Where they were using antiquated data systems that could not be geocoded or were not at the present time applicable to GEOPLANS, he let us reformat the system, redesign it, so it could be used. The types of data files currently in the Kansas City Health Information System are listed at the end of this report along with a graphic description of the Integrated Health Information System.

Kansas City Experience

The following are a few examples of these efforts and their relationship to the Geographic Base File and ultimately to GEOPLANS.

A--Comprehensive Area-Wide Hospital Survey

For many years Public Health physicians, facility planners, and others all across the country have seen a real need for comprehensive area-wide hospital data that utilize the same base of information so that they are "Apples and Apples" as opposed to the traditional different accounting and admitting procedures.

In June of 1970 a consortium was formed in Kansas City consisting of:

1. The Kansas City Area Hospital Association (representing all 33 hospitals in the eight county area)
2. The Mid-America Comprehensive Health Planning Agency (our 314B agency)
3. Blue Cross-Blue Shield
4. The Kansas City, Missouri Health Department
5. Information Systems Development, Inc. (ISD, Inc.)

The original intent of the project was two-fold. First, Model Cities was providing (through Blue Cross) comprehensive health insurance for 2,500 neighborhood residents.
Blue Cross wanted a comparative actuarial study in addition to needing to know where all of the Model Cities residents were going for their hospital care. Second, we could develop a solid data base of hospital morbidity and financial information for the entire community.

After much discussion it was agreed that the following data items would be collected.

1. Hospital Name
2. Individual Address
3. Admitting Date
4. Discharge Date
5. Date of Birth
6. Age
7. Sex
8. Primary ICDA Discharge Diagnosis
9. Secondary ICDA Discharge Diagnosis
10. Total Billed Charges
11. Amount Paid by Third Party Payments
12. Amount Paid by Others

The study ran from July 1, 1970, through June 30, 1971. After much haranguing, political pressure and pleading, we were able to collect 100 percent of all inpatient discharges (195,000). This is the first time in the United States that a 100 percent survey has been completed for a full year for a major metropolitan area.

The patient discharge data are address-matched and geocoded with both census tract-block and x-y coordinate (centroid of block face) identifiers. Once the data have been geocoded, the hospital master file is updated and a series of six statistical computer reports are generated for each hospital.

In addition to the statistical reports which the hospital receives, a computer map is also produced showing the hospital's service area, geographically locating where patients are coming from to receive service. The 1970 First Count Summary Tapes are also used in this system to calculate certain types of patient rate breakdowns based on the population of census tracts.

B--Ambulance Reporting Information System

For the past three months we have been working with the Kansas City, Missouri, Health Department to develop an Ambulance Reporting Information System. The initial work was the development of a short, key-punchable form to be filled out by the ambulance company after each incident.
In Kansas City, there are approximately 3,000 emergency ambulance calls per year. At the present time, we are receiving information on these forms for about 95 percent to 100 percent of these calls.

We have written several computer programs which will provide the essential information for evaluating the ambulance program. For example, the city is required to pay $20 per run, or the difference between what the individual pays and $20 if the debt is not satisfied within ninety (90) days after the run. These programs provide the city with an aged accounts receivable status.

In addition, we will be address-matching all calls and doing computer graphic scatter plots to be used in emergency care facility planning. These scatter plots will show each call taken down to the block level so that we can see the service area for each ambulance company and emergency-care facility. Through address-matching of the calls, we will be able to calculate the time from when the call is received until the ambulance gets to the emergency scene, and the time from the scene to the emergency-care institution. This time analysis will provide invaluable information for facilities planning and for the proper locations of the ambulances, whether they be roving or fixed locations.

C--Tuberculosis Records Service System

Within the past five months we have implemented for the Kansas City, Missouri, Health Department a Tuberculosis Records Service (TRS) System which consists of nine automatable forms. These nine forms replace all of the then existing fifty-four forms that were being utilized to record tuberculosis patient information.

The Tuberculosis Records Service System which we have implemented was originally developed by CDC. They spent a total of four years and one million dollars in developing this computerized system. There are currently eight statistical reports which are used to evaluate the tuberculosis program and to provide better patient care.
Through the use of geographic coding and computers, we will be able to address-match all of the patient records and assign them to a geographical area. Through the use of computer graphics, we will be able to analyze existing nursing districts and to plan for additional or revised districts.

We are currently developing additional statistical programs which will correlate the incidence of tuberculosis with substandard housing, air pollution, and income so that as our reporting and techniques become more sophisticated, we will be able to forecast, with a high degree of probability, specific geographic areas for tuberculosis preventive measures. In addition, through the use of geocoding and address-matching of current patient cases, we will have another significant input in the realigning of nursing districts to fit actual case loads.

It should be emphasized that the Health Department of Kansas City, Missouri, in developing its Integrated Health Information System, has not only developed new scientific approaches and utilized new computer techniques, but has also been most careful to take a look at what other cities and governmental agencies are doing in other areas so that we are not simply reinventing the wheel, but utilizing solid, significant products, as well as our own new developments.

Several months ago when the city began to feel that the TRS System would be a more efficient and economical approach than developing its own system, the State of Missouri was rather reluctant to go this route at first, feeling that the cost of developing a State-wide new system would, in the long run, be more beneficial.
However, through a sophisticated cost/benefit analysis prepared by the Health Department and its consultants (ISD, Inc.), the State has now reversed its former position and has decided to adopt the TRS System State-wide; and is, in fact, looking to Kansas City and the Health Department for guidance and consultation in the adoption of this new system.

D--Environmental Services

One of the largest areas of the Health Department, in terms of manpower and scope of effort, is the Environmental Services area. Traditionally, this division has been composed of six separate semi-autonomous sections, and ensuring that accurate information is recorded has always been one of the key elements in both providing and evaluating consumer-oriented programs.

One of the reasons for inaccurate recording of information is that the forms in the past have been too complicated and time-consuming to fill out, or there has been a total lack of forms to use. The Environmental Division has been able to overcome these problems through the use of a new type of survey form. This form provides the following capabilities:

1. It is a keypunchable form.
2. It summarizes all possible violations into either a check list or a fill-in answer.
3. Instead of having to send a crude looking and often ambiguous complaint letter, the department now merely makes a duplicate copy of the form and mails it to the individual requiring services, thereby eliminating almost one full man-year of clerical time.
4. It collects other demographic information, such as the age of all occupants, the head of the household, primary source of family income, etc.

Based on six months' experience in the field, this form takes approximately 40 to 50 percent less time to fill out than the old subjective type forms.
Most importantly, in addition to being able to greatly increase the number of housing inspections per year, this information can now be an additional input to the total health information system, and we can now address-match and correlate such variables as the relationship between the elderly and substandard housing, and the high incidence of T.B. among low-income residents living in substandard housing in close proximity to high concentrations of air pollution. We have always assumed to be fact the relationships that I am describing. It is only in the recent past that we have had the capability to factually determine these relationships on a small-area basis.

In many cases the need to provide better services to the general public coincides with the desire to perform address-matching and to utilize certain of the capabilities of the GEOPLANS system. For example, for the past eight months we have been working with the Health Department to eliminate the concept of the various sections within the Environmental area. Instead of having one man, one specialist, come to your home for a weed complaint and fill out one form used by his section; and another man come out on a rat complaint and fill out a form used by his section; and yet a third specialist come out and fill out yet a third form used by his section; and even a fourth man inspecting health hazards within your home utilizing yet another form; and finally one additional specialist inspecting the small trash fire you have burning in the back yard; you will now have all of these specialists replaced by generalists utilizing common forms and following common procedures.

In the past each of the six major sections of the Health Department (Food, Dairy, Housing, Air Pollution, General Sanitation and Rat Control) broke up the city into several geographic areas which their inspectors served. The boundaries of these areas are not necessarily, and in fact in most cases are not, coterminous.
However, since the Department is now composed of only one unit, the need to develop new districts was obvious, and the address-matching capabilities that the Health Department now had played a great role in this. The following factors were all weighed and looked at before dividing the city up into the new districts.

1. Estimated case load by function, which was partially derived by reviewing the location of housing and general sanitation violations for the past several years.
2. Average man-hours required to resolve each of the various case types, i.e., citizen inquiry, inspection, etc.
3. Average travel time from any given point to any other given point within the proposed district.
4. 1960 population by small area, 1970 population by block groupings and projected 1975 and 1980 population by census tract.

Some of the data, such as the location of all the food establishments that needed to be inspected, were address-matched manually because it was felt to be more efficient for this small a data item.

Then through a procedure of multiple regression analysis, the population and housing variables were correlated with housing violations and general sanitation violations. Four variables were found to have a predictive value and were incorporated into the regression equation.

One of the advantages of a geographic address-matching system is that it creates the potential for continual monitoring and surveillance of open and closed cases and the possible need for future redistricting.

State Geocoding Efforts

Recently the Missouri Geocoding Subtask Force Committee recommended that the GEOPLANS system currently being used by the Kansas City, Missouri Health Department be adopted for use at the State level for both urban and rural types of health-related data.
Through the use of Geographic Base Files and computer software, such as GEOPLANS, the State of Missouri will be able to integrate, correlate, and analyze, both statistically and graphically, health data state-wide.

The State is planning on utilizing the Census Bureau's DIME files for the four participating SMSA areas in Missouri (Kansas City, St. Louis, Springfield, St. Joseph). These four files will constitute the urbanized area which will be used to pin health data down to the block face level.

For the remainder of the State (rural area), the State will utilize GEOPLANS to develop and maintain several additional geographic base files which will identify, and geocode with an x-y coordinate, any data which has any of the following types of geographic identifiers attached to it:

1. ZIP Code
2. Municipality or City Name
3. County
4. Township
5. Rural Route

DATA FILES CURRENTLY IN THE KANSAS CITY HEALTH INFORMATION SYSTEM

1. Census Files (Incorporating Real Estate and Personal)
2. Health Facilities Data
3. Police
4. Tax
5. School Education and Facility Data
6. Building Permit
7. Hospitalization
8. T.B.
9. V.D.
10. Family Planning
11. Ambulance
12. Environmental Sanitation
13. Air Pollution
14. I.R.S. Income Data
15. Social Security Recipient Information
16. Immunization
17. Births
18. Deaths
19. School
20. Visiting Nurses
21. Aging Data
22. Maternal and Child Health Data
23. Dental
24. Manpower Data
25. Employment Security
26. Transportation

[Figure: Kansas City, Missouri, Health Department Integrated Health Information System planning model. Recoverable labels: community value system; goals; objectives; evaluation criteria.]

Question Period

Mr. Renshaw--I have been looking at the graphics that you have on display but I did not notice what software systems you are using.

Mr. Levy--Originally we started off using the SYMAP system. Since then we have developed our own package which we call GEOPLOTS.

Mr. Renshaw--You are not using the GRIDS package?

Mr. Levy--No.
This is a program that we developed ourselves in the past 8 months.

Mr. Weddle--There are three types of computer graphics which the Kansas City, Missouri Health Department utilizes: density plots, scatter plots, and contour plots. We have two different density packages; one is a revision of SYMAP, which was developed at Harvard, and the other was developed by our firm. The Health Department has utilized both types quite extensively for several types of data files. I might point out that, with the exception of the SYMAP package, all the other graphics packages are written in COBOL for an S/360 Mod 30 and up computer.

Mr. Fairbanks--I wonder if you could review the rural geocoding techniques for us?

Mr. Weddle--We are working with the State of Missouri in developing a State-wide geocoding system. Of course, the State would like to geocode data to the nearest meter, but for a lot of the data that are available, you just do not have the capability as far as address-matching and geocoding data to a census block face. The bulk of data which the State will be analyzing will have only a rural route and a county or a township identifier associated with it. The State has established a system to geocode rural types of data files to at least the township level. In this way, data files can be aggregated together and displayed graphically using townships as the smallest level of analysis in the rural areas. In the urbanized areas, the State will be utilizing the DIME files for the processing and aggregation of planning data.

Some of the rural base files for the State have already been completed and are being used for analyzing data, such as the computer density maps we have on display around this room. There still remains the problem of how accurate and complete the four DIME files will be once they are released to the four SMSA areas.
Assuming they are more complete than the GBF the Health Department is currently using, the next problem is how they are going to be updated on a continual basis. Right now there appears to be no real coordinating body or responsible organization to handle this problem.

Mr. Crellin--When you encoded township boundaries, how often did you find that they coincided with roads?

Mr. Weddle--I am not too familiar with how the township boundaries match up on roads, so I am afraid I cannot answer you on that.

Mr. Levy--I might point out that instead of having a highly skilled person do all the digitizing for much of the graphics that we have on display, those illustrating information at the State level, we took a secretary who had never seen a digitizer before in her life, and she did it with very few problems. She got it done in about 3-1/2 weeks, doing it in her spare time during the day.

USE OF A GEOGRAPHIC BASE FILE IN THE CIVIL DEFENSE COMMUNITY SHELTER PROGRAM

Lee P. Johnston

Background

The Office of Civil Defense (OCD), in conjunction with counties and cities throughout the United States, has been creating Community Shelter Plans (CSP) since 1966. This program provides for the development, publication and distribution of local shelter plans assuming the most effective utilization of existing shelter assets in time of emergency.

The majority of the 1,700 areas undertaking CSP's in the past have been in the less populated counties in the country. Since a manual process for allocating people to specific shelters has been the heart of the program, planners in the larger metropolitan areas have been reluctant to begin such a task. The large amounts of data which must be dealt with in terms of population and shelter capacities have been enough to warrant second thoughts. Added to this is the complex problem of determining shortest paths in a network, applying various travel times and considering the differences between daytime and nighttime allocations.

OCD, realizing the vast problems mentioned above, contracted with the Census Use Study and Systems Development Corporation to develop a model to handle the allocation phase of the CSP. The University of Tennessee was responsible for overviewing the development phase and has tested and designed a total system for implementation to begin in 1972.

The first part of this paper will discuss the general inputs and outputs of the model and the allocation process. Secondly, a detailed description is given of the role of the Geographic Base File (GBF) in the model. Thirdly, the problems encountered in testing the model and in designing the administrative system will be mentioned. Finally, three possible applications of the model to other types of planning studies are discussed briefly.

Network Allocations of Population to Shelter Model (NAPS)

The NAPS system relies basically on the interaction of three types of input data available on a national scale, with supplementary information necessary from the local level. Policy options must be supplied at the local level. All of the data necessary to run the model may be classified in three categories: the network over which the population is allocated to the facility; the population demand for the facility; and the available supply of facilities. Figure 1 is a schematic overview of the model.

The network is the central component of the system and must be present in the form of the GBF. The original selection of this model was based on the standardized network provided by the Geographic Base Files. Without going into details of the GBF, it is important to stress that the files provide three basic needs of the model.

First, because of the availability of coordinates for each node, the length of all street segments may be calculated and the output may be presented in computer graphic form.
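
As an illustration of this first point, the following is a minimal sketch of computing segment lengths from node coordinates. It assumes planar x-y coordinates in a single unit; the `Segment` record, its field names, and the sample coordinates are hypothetical and do not reproduce the DIME record layout or the NAPS programs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One street segment; field names are illustrative, not the DIME layout."""
    segment_id: str
    from_node: str
    to_node: str


def segment_length(segment, node_xy):
    """Straight-line length between the two end nodes of a segment."""
    x1, y1 = node_xy[segment.from_node]
    x2, y2 = node_xy[segment.to_node]
    return math.hypot(x2 - x1, y2 - y1)


# Hypothetical node coordinates (same planar unit for x and y).
node_xy = {"N1": (0.0, 0.0), "N2": (300.0, 400.0), "N3": (300.0, 0.0)}
segments = [Segment("S1", "N1", "N2"), Segment("S2", "N2", "N3")]

lengths = {s.segment_id: segment_length(s, node_xy) for s in segments}
```

A straight-line length understates curved streets, so a local file with intermediate shape points would need the lengths of its pieces summed instead.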
Secondly, the availability of block face identifiers enables geocoding of all information to either block face, street segment, or census block. This, of course, means that both the population and the facilities can be located on a particular block face or street segment. It also means, though, that certain attributes of the network may be coded for greater accuracy. For instance, the type of street (e.g., freeway) may be coded, as well as the travel speeds associated with each type. Barriers to movement may also be identified in this fashion.

[FIGURE 1 - OVERVIEW OF THE NAPS MODEL. Data base: policy and planning factors; fallout shelter survey data; ACG-DIME geographic base file; transportation facility characteristics; 1970 census data; persons present rates; employment data. Data preparation module outputs: policy and planning factors; shelter supply by street segments; street network description; time-specific shelter demand by street segments. Allocation module: shelter allocation procedure. Model outputs: allocation statistics (population covered, shelter utilization, shelter deficit, travel to shelter); displays (drainage area maps, allocation summaries); EIR data (allocation to shelter, route to shelter, mode of travel).]

Thirdly, and perhaps the most overlooked advantage of the GBF, is the adjacency information indigenous to the files. This means that it is possible to construct a table consisting of all of the street segments that are connected to each segment in the file. In the broader realm, it would be possible to construct a list of all blocks or block groups which are adjacent to every block or block group in the area. Because of this attribute, it is then possible to identify all segments or blocks which lie within a specified distance from a particular point without calculating the distance between that point and all other points in the network. More mention is made of this attribute later in the paper.
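The adjacency property just described can be illustrated with a short sketch. This is not the NAPS code, which is not reproduced in this paper; it is a minimal Python illustration under stated assumptions. The record layout (segment identifier, from-node, to-node), the sample coordinates, and the function names are all hypothetical, and straight-line length between node coordinates stands in for whatever length calculation the model actually used.

```python
import heapq
import math
from collections import defaultdict

# Hypothetical, simplified segment records: (segment_id, from_node, to_node).
# A real GBF record carries many more fields; these names are illustrative only.
SEGMENTS = [
    ("s1", "A", "B"),
    ("s2", "B", "C"),
    ("s3", "C", "D"),
    ("s4", "B", "E"),
]

# Hypothetical node coordinates in arbitrary planar units.
NODES = {"A": (0, 0), "B": (3, 4), "C": (3, 10), "D": (3, 20), "E": (9, 4)}


def segment_length(seg, nodes):
    """Straight-line length of a segment from its end-node coordinates."""
    _, a, b = seg
    (x1, y1), (x2, y2) = nodes[a], nodes[b]
    return math.hypot(x2 - x1, y2 - y1)


def build_adjacency(segments):
    """Table of the segments connected to each segment (sharing an end node)."""
    by_node = defaultdict(set)
    for seg_id, a, b in segments:
        by_node[a].add(seg_id)
        by_node[b].add(seg_id)
    adjacency = {}
    for seg_id, a, b in segments:
        adjacency[seg_id] = (by_node[a] | by_node[b]) - {seg_id}
    return adjacency


def nodes_within(start, limit, segments, nodes):
    """Nodes reachable from `start` within `limit` network distance.

    The search expands outward along connected segments only (Dijkstra's
    algorithm), so distant parts of the network are never examined.
    """
    links = defaultdict(list)
    for seg in segments:
        _, a, b = seg
        d = segment_length(seg, nodes)
        links[a].append((b, d))
        links[b].append((a, d))
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, d in links[node]:
            nd = dist + d
            if nd <= limit and nd < best.get(nxt, math.inf):
                best[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return best
```

With the sample data, calling nodes_within("A", 11, SEGMENTS, NODES) stops expanding once the limit is reached, so node D, which lies beyond it, is never assigned a distance. The same outward expansion could be carried out over blocks or block groups instead of nodes.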
The population demand portion of the NAPS model consists of the total population by city block. For a nighttime allocation most of the information can be derived from the Third Count Summary Tapes.[1] The daytime population presents the largest problem to the system because of the lack of a standard source of daytime population distribution. The method of deriving this information is presently accomplished by geocoding employment data and school enrollment data to the block face. A person's "presence" rate is applied to each type of employment to determine the number of people probably located at that site, based upon the number employed. For instance, there may be 3 shoppers for every employee at a retail site. The figures used were derived from the Bay Area Transportation Study, but may be altered to reflect local conditions. The system is flexible enough to use any method employed to derive the daytime population distribution. As geocoding becomes more widely used, it is our optimistic hope that urban areas will begin estimating daytime population distributions in a manner similar to present estimating systems for residential distributions.

The supply in the NAPS system refers to buildings indicated as fallout shelters. Associated with each facility is its address, the amount of space available and the degree of protection provided. The address gives the capability of geocoding to the GBF, while the degree of protection may be used as a policy option by deciding what degree of protection is desired.

[1] The Third Count Summary Tapes from the 1970 census provide for population and housing counts by block.

The allocation phase of the NAPS model distributes the population along the street network to the available space. This phase begins by identifying all street segments with shelter space available.
Based upon the table of street segment connections, a variation of a minimum path algorithm is utilized and people are allocated to the closest facility if there is space available. The time taken to travel over any link is considered to be a function of the length of the link and the rate of travel associated with that segment. This is weighted according to the number of people on the street to account for congestion. If the population on a street is too great for the closest shelter, the model finds the next available facility and allocates accordingly. The user also has the option of varying the time to reach shelter from 1 to 99 minutes.

The output from the model consists of three separate types. First, a summary listing is given for the entire area and for each facility. This summary gives the number of people reaching the facility and the mean time taken. Secondly, as seen in figure 2, a detailed list is presented describing the allocation of the people on each street segment. The segment to which they are allocated is given along with the number of people allocated and the time to reach the facility. Finally, output is available in computer map form. Each allocation is plotted on an E.A.I. 3500 Data Plotter using different colors to separate the areas and overprinting the area number over each link.

Specific Use of the Geographic Base File

The NAPS model is actually divided into two main parts: the preprocessor module and the allocation module. For purposes of describing the specific uses of the GBF in the model, though, I will classify action into four phases. Although a full description is not possible within this paper, I feel it is important to briefly describe these detailed uses.

The first use of the GBF is in the preliminary processing. These actions refine the network described by the GBF. The first task is to check the census tracts on both sides of the segment to see if the block face is within the appropriate planning area.
If it is not, the segment or block face is not used. Secondly, the model deletes all nonstreet segments by checking the nonstreet code. Thirdly, the model reclassifies the accepted street segments to the street type indicated by the user; these may be pedestrian paths, city streets, freeways, or barriers to movement which will be deleted. The purpose of this process is to enable

[Figure: GBIS - access coordinates and build key; perform data retrieval; geographic data base]

addressed to keep us from reliving this experience. The use of DIME as a data organization and access tool in the integrated system addresses these issues. With GBIS, DIME coding becomes an integral part of a wide variety of the routine local government operations. While retaining its value as an analytical tool, the expanded use of the coding scheme under GBIS provides additional maintenance benefits not present in a single usage situation:

  - Coding and digitizing are integrated into the existing municipal operations to cut the cost of a special maintenance program.
  - Maintenance costs are easier to cost-justify with the increased use of the DIME files.
  - More widespread departmental support for maintenance and use.

Likely candidates for municipal operations to support the required DIME coding efforts include:

  - Subdivision plat processing
  - Base map maintenance
  - Street dedication and acceptance procedures

Likely beneficiaries of these maintenance operations could include municipal activities such as:

  - Real property tax assessment
  - Building inspection
  - Street maintenance
  - Utility billing
  - Police and fire dispatching

In short, our research indicates that integrating the use and maintenance of DIME files into routine local government operations will be a key factor in minimizing maintenance costs while maximizing user benefits.

(3) Creating GBIS Required Special Preparation of the Census DIME File

GBIS is based on several assumptions concerning DIME file coding including:

  - Each coordinate pair represents a unique position in geography.
  - High/low address ranges are accurate.
  - Street names conform to the accepted spelling.

Independent checks were made to verify that these assumptions were accurate, and to correct cases where they were not.

While our file had been topologically edited, problems were encountered in using the coordinates. When plotted by the University of Kansas, severe distortions were noted. As a result of our experience, we feel that the final edit of DIME file geography should be an actual computer plot of the file. This type of check in the final analysis can tell the coder much more about the geographic accuracy of the file than any other single type of edit.

These errors were corrected through a manual digitizing process. We feel that this type of manual correction is preferable for low volume digitizing. It is inexpensive and offers the same relative accuracy as the automated process originally used to create the DIME file coordinates.
While the automated process can provide more accuracy, the Census Metropolitan Map Series is too distorted geographically to take advantage of this increased precision.

The address ranges were field checked and found to be seriously in error. About 33 percent of the street segment records contained errors such as:

  - Ranges which overlapped into neighboring segments.
  - The direction in which address numbers ascended was reversed.
  - Addresses within the segment which violated address range limits.

The University of Kansas has made these corrections on the file. In addition, the University has found a variety of duplicate spellings and nonexistent street names in the file. Since GBIS requires a unique street name and address, these errors were corrected. Having completed this, we then had a "clean" DIME file updated to 1968. Other segments are presently being coded to bring the file up to present day.

These types of corrections may be required for DIME files in other cities depending upon:

  - The accuracy required by the proposed uses for DIME coding
  - The level of probable error in the original source data used to create the DIME file for a given city

The final step in file preparation was the selection and massaging of DIME data elements for inclusion in GBIS. Of the 240 characters in the DIME street segment record, only 96 were found useful. This information was massaged to reduce the storage requirements from 96 to 56 bytes per record. These and other steps were taken to conserve storage area.

DIME File Development and Maintenance Should Be a Local Responsibility

The Wichita Falls experience indicates that the development and maintenance of DIME files can be handled by local government. There are several factors that bring us to this conclusion:

  - Normal local government operations presently provide most of the required updating information.
  - Local governments stand to benefit most from the usage and maintenance of the DIME file.
  - Urban street address conventions vary so greatly and frequently as to prohibit any Federal agency from accepting the responsibility for the routine updating of the geocoding data.

There are roles in DIME file development, however, for all levels of government. Exhibit V outlines possible roles for each level.

The Federal responsibility appears to center around establishing general coding standards for a Geographic Base File System (GBFS) and promoting the use of that system. In line with this responsibility, the Bureau of the Census is sponsoring this conference and several publications outlining the concept and use of the DIME files. Coding conventions which treat common problems in applying the GBFS in urban areas remain to be developed. The Bureau has also begun a program to develop and promote updating procedures in five metropolitan areas. Unfortunately, this program is presently without the appropriate funding. Ironically, the funding for projects such as these prototypes is where the Federal government can play a very meaningful role. A multitude of local efforts in the design and development of the Geographical Base File System will be crippled without some Federal support.

State and regional planning agencies appear to have a general responsibility in the areas of:

  - Technical training and orientation
  - Coordination of parallel geocoding
  - Specialized technical assistance

Exhibit V.
AREAS OF RESPONSIBILITIES FOR DIME DEVELOPMENT

Area of development                Local government   Federal government   Regional or State agencies
Design Maintenance Systems         Manpower           Funds                Coordination
Update and Implement 1970 Files    Manpower           -                    Manpower
On-going File Operations           Manpower           -                    -
Improve Mapping Base
  Monumentation                    Manpower           Funds                Coordination
  Photogrammetry                   Manpower           Funds                Manpower
Geocoding Software                 Manpower           Manpower             -
Editing Routines                   Manpower           Manpower             -
Computer Mapping & Analysis        Manpower           Manpower             -
Development of Standard DIME Coding Conventions: Manpower; Coordination (column placement unclear in this copy)

State agencies have a permanent interest in local developments not present in decennial census operations. As such they can play a significant role in adapting Federally established coding standards to the local requirements of their communities. DIME files developed for neighboring localities can make use of common coding supplies, data resources, processing facilities and manpower only if these efforts are coordinated from a State or regional level. These same agencies can provide the training and orientation needed to develop geocoding capabilities in local governments.

In addition, many State agencies possess specialized equipment and expertise necessary to developing local geocoding systems. Our experience in Wichita Falls is a good case in point. The city has neither the experienced manpower nor equipment to establish geographically precise base maps. The commercial cost of such a job is prohibitive. The city could, however, produce these maps with the assistance of the Texas State Highway Department, which has both the equipment and expertise in aerial photogrammetry. Again, some Federal support would be necessary to supplement State and local efforts, perhaps in the form of DOT Urban Transportation Planning funds. This type of joint action in DIME development will be an important key to future progress.
In the final analysis, however, the local government should retain the prime responsibility for the use and maintenance of DIME files. Local governments can benefit from this responsibility by using the DIME system to assist daily governmental operations, as we have with GBIS.

The design and implementation of DIME systems must consider the unique requirements of individual local governments for geographically related information. This development must also recognize the need for updating and upgrading the existing "1968" DIME files. As part of this improvement program, more local and Federal attention to the poor quality of existing urban mapping will be essential. A DIME file, as we noted earlier, is an automated map. As such the data contained in the file can be no more accurate than its source. The full power of geographic coding can only be realized as this source of data improves.

In summary, our experience indicates that local governments can realize substantial advantages by using the data access properties of DIME. The full extent of DIME capabilities remains to be explored. These local explorations can proceed only with the active participation of regional, State and Federal authorities. The successful development and maintenance of these systems, we believe, will provide substantial returns for all the governments involved.

Question Period

Mr. Johnston--Do you believe that the amount of usage to which the file is put is dependent on city size? If it is, is it really necessary to automate all the operating files in a small to medium size SMSA? If it is not necessary, what can small to medium size SMSA's do with their DIME files under present conditions?

Mr. Hanssen--Let me address that two ways. Geographic information is used on a daily basis by every local government. In this context I am referring to geographic information as any data which is tied to geography.
Analysis of this data by location or special area, we have found, is a very low frequency operation--perhaps a monthly report would be a good example. As planners, a number of us deal with geographic analysis on what we consider a daily basis and tend to ascribe this same usage to other city departments. When you think of the tax assessor, he does use geographic information but not for analysis. We tend to ignore this usage of geographic information. I think this relationship probably holds true up to a population size of at least a half million. I further feel that this contention is more likely to be true of smaller SMSA's rather than larger ones.

As for nonautomated files, it would be my contention that using DIME files as a basis for at least formatting the addresses in anticipation of future automation would be an improvement in the existing nonautomated files. For example, presently in the 3x5 card files in our building inspection department, there are all sorts of data elements. None of them are well formatted. If this data collection procedure continues, the file will become more and more difficult to automate, just by its very growth. As the file gets larger, the trade-offs involved in automating the system begin to prohibit automation. If we begin to use DIME as a way of formatting and organizing the data, the automation of the data will become more and more feasible since the data conversion will become easier to handle.

There are other nonautomated systems, card and microfilm systems, which can benefit from the standardization of data formats and geographic data retrieval such as was developed with GBIS. You do not need to have a computerized system to benefit from these geocoding techniques.

Mr. Johnston--The reason that I raised the question is because we have come in contact with several SMSA's that really do not have any source of computers at this point in time.
Their major question is not really having an update system, but "How do we use the files without the computer?"

Mr. Hanssen--I think the technique is applicable to automated microfilm search techniques which, if you are looking at a small SMSA, are cheaper than buying a computer. Using DIME as a means of handling geographically identified data, whether it is computerized or not, facilitates both manual and automated access to data.

Mr. Renshaw--You mentioned that you will have two index files, a street address index and an intersection index. I am wondering if you have a feel for the level of problems that you are going to encounter with references such as rural route numbers, place names, building names, or shopping centers.

Mr. Hanssen--In the case where these are used as common identifiers, the one that we always pull out when we are talking about this is the First National Bank; the computer does not know where the First National Bank is, and it will not unless you tell it. What we have done is build an alias file, "Also Known As" or AKA file. These types of files are common to existing police systems and are built on basically the same type of logic as the search for a criminal's alias. We are looking for a criminal address in this case, one that really is not legal but one which is nonetheless used.

The size of the initial alias file is going to be pretty small. Conversely, the number of address rejects is going to be initially pretty big since we can only hypothesize which alias will occur. As we start to learn more and more of the alias addresses in common use, we can build a more comprehensive and effective file. This file will include addresses on State highways, farm-to-market roads and a variety of other nonurban addresses within the corporate limits. We feel that the expansion of the AKA file will be influenced by a learning curve.
As you get enough information to build the alias file, you are subsequently also training clerks and terminal operators to use the correct address in place of the alias, so that when they see First National Bank they enter 800 Scott because they know the real address. For this reason we expect the size of the AKA file to level off at some point in the future. Now there will be some cases where address rejects will continue to occur, and that is going to be more of a matter of training than address standardization.

Mr. Snyder--My question pertains to analysis. I know that you have done some things with garbage facility service and sanitation service. What other analysis have you done or will you be doing?

Mr. Hanssen--Let us look at the early demonstration of the IMIS project since it is a running system. Encompassed in this project is an automated tax assessment system which will examine neighborhoods to determine where variations in selling price warrant reassessment as well as estimating what this reassessment might be. Having assisted reassessment, the system continues into Board of Equalization hearings to produce comparable assessments for each case. These reports on assessments of comparable properties will allow the Board to determine if real inequities exist. More information on this system and the project as a whole is obtainable by writing the IMIS project at the City of Wichita Falls, Texas.

Getting back to analysis, however, where does this assessment system fit in? Along with the neighborhood assessment figures, the assessor or Board of Equalization could use the University of Kansas mapping package, CHORO, with the Census GRIDS mapping package to produce a graphic image of the report so that they can better comprehend the results. What we are demonstrating here are good operational uses for computer mapping.

Mr. Bieda--In classification of these assessment maps, as part of the demonstration, you have certain responsibilities to do a little pioneering. I realize the tax assessment may not be the right tool, but how about the blight analysis problem? Were you able to do any analysis?

Mr. Hanssen--As a consequence of putting the tax information into the system, we have created a small data base which could be used by an application program for blight analysis to determine various residential conditions. While such a program has not yet been written (we have to write them as we need them), it is interesting to note that in the integrated system the more applications you put in, the less new data you have to collect to run subsequent applications. Using a "tax" data base for an urban blight analysis program is just one of many examples.

Mr. Cooke--I was very interested to hear your comment about field checking the information in the DIME files. I have a number of questions along those lines. How much effort was required in the field check? How do you define an error in the addresses? What exactly did the field check involve? To put this into some proportion, what is the approximate size of your DIME file in segments or blocks?

Mr. Hanssen--The DIME file contains 6,000 street segments and an additional 1,000 segments of nonstreet features. The field checking required 60 man-days using three two-man teams. An additional 15 man-days were required to verify field work. We found errors in 33 percent of the address ranges after we checked everything in the field.

What is an error? An error, in terms of GBIS, is an address range in the DIME file which overlaps or fails to exhaust the real address range in the field. For example, the DIME file may have 1 to 11 as a range while in the field it is 1 to 13. We found a third type of error, which was where you would have 1, 3, 5, 7, 151, 9, 11.
This violates our address range assumption, so we had to take these and assign these addresses to the "Also Known As" file.

I have rushed over maintenance rather hurriedly and on purpose. Dr. Aangeenbrug from the University of Kansas was responsible for editing the DIME file as we now have it. I think it would be beneficial if we review with him some of the problems that the University of Kansas incurred in this work.

EDITING A DIGITIZED DIME FILE: WICHITA FALLS, TEXAS

Dr. Robert T. Aangeenbrug

(This was a slide presentation, and what follows is an abstract of the remarks accompanying this presentation)

Introduction

This is a slide presentation which has been used in part to record the editing, creation, and final checking of a digitized DIME file for the city of Wichita Falls, Texas. Under the USAC contract we were required to investigate the geocoding efforts of the U.S. Bureau of the Census; and with the cooperation of the Census Use Study, the Data Access and Use Lab and the Geography Division, we managed to edit and utilize the digitized DIME file for Wichita Falls this July. This talk does not contain illustrations from our rather extensive set of slides; however, they may become available for distribution at a later time.

The funds for editing the DIME file, including computer time and administrative overhead, were donated by the University. This represents a rather sizeable research investment on behalf of both the state of the art and the Wichita Falls Consortium.

The first step in anticipating the use, creation, editing, and incorporation of a DIME file or a Geographic Base File into a municipal information system is familiarization with the network of information necessary to establish a Geographic Base File. Normally, key city boundaries and physical features are used for this. The initial step we took was to find the common base map most generally used within the city organization and by the public in Wichita Falls.
It turned out to be a one inch equals fifteen hundred feet map which is produced by the City Engineering Department and was reproduced in various forms by several banks and civic groups within the town. We used this to acquaint ourselves with the basic network of the city of Wichita Falls and to bring our Geographic Base File team up to a level of familiarity with the network.

The second step we took was to strip this basic map of some of the noise. However, we kept a street index from the common city map as a handy look-up device during later editing stages of the file. The simpler 1"=1500' map only shows the street and some of the key city boundaries which are more or less the equivalent of the final DIME file.

A quick reminder--the basic edits of the DIME file are: First, check the file. Are the number of records correct? Is the layout according to the instructions sent with the file? Second, is the topological definition adequate, and do we in fact have a network that is appropriately bounded? Third, are the coordinates accurate within acceptable limits as defined by the user himself? The fourth check is the address check indicating that the segments form a logical sequence of address ranges within the local community.

In this process, then, basic edits and familiarization are the key features. We now turn to a more detailed description of how we prepared the attack on the DIME file, so to speak.

Orientation of the Project Staff

The basic orientation I briefly accented in the introduction consists not only of the familiarization of the general city network of geographic information, but provides a knowledge of the basic tools--the metro maps, the node maps, the geographic hierarchy of places (MCD, tract, ED, block, etc.). Illustrated here are the node coded maps for Wichita Falls. You will note that the amount of information on these maps varies from very busy to extremely quiet.
That is, in the downtown area, interlaced with railroad tracks, we find the intense packing of information. Here we note that the 1"=800' map series is rather difficult to use. Node numbers are hard to read, and the information is somewhat difficult to read. In fact, some of the information literally wears off of the maps if they are used. We combated this by ordering duplicate cronaflex maps, one set of which we literally keep under lock and key.

On the edge of town, we find a rapid drop off of the level of information that is coded. The map indicates the immense variation of the rate of information packed on this network in a town such as Wichita Falls, not atypical for this part of the Great Plains.

An interesting feature to note is the discontinuity of geocoding along certain drainage features. We note here that a specific drainage canal is coded in one tract and articulated in precise detail, while that very same drainage canal in the next tract is not coded and its intersection with other key features is not even indicated. We also find on occasion individual nodes that appear rather unconnected in the network. These will require some additional familiarization and also a checking of errors.

The difficulty with the node maps is that they are relatively noisy and somewhat hard to read. Also, the scale we worked with here probably is not entirely adequate, particularly for the downtown area. However, we were lucky. We dealt with maps of the same scale for the entire city.

We held several briefing sessions on the use of the node maps. In the set of slides shown here, we illustrated the topological structure of networks, indicating nodes (vertices) and links or segments (edges) that existed in the networks. Our problem consisted of finding whether the street system and its intersections could be used to provide a structure for searching addressable information in the city's information system.
The geographic coordinates which were added to the file would add the geographic location and allow us to map the information, change its scale and perform other transfer technology for analysis of such files.

You will note here a simple sequence of slides indicating how information can be added to this file and also showing the basic segment file with its high and low address range and other information including that of nonstreet features.

The orientation sessions, then, included general familiarization with the city and some of its features, especially new additions, boundary changes and special physical features that exist in the city. We also identified within the city organization those people most concerned with keeping address information up-to-date. We went to the Planning Division and the City Engineering Department, as well as the Assessor, where we obtained excellent cooperation and managed to get the basic building blocks of address information in reasonably good order. Cooperation was far better than we had expected.

One of the important aspects of this orientation process is to identify a key person or key persons within the city organization. We found a man who knew both the street identification procedures and address assignment practices in the city organization, and who is in part responsible for their update and maintenance; therefore, we depended on him rather heavily.

Editing of a Digitized DIME File

We received the software for the editing of the DIME file the summer of 1970 prior to our demonstration of the 400-block downtown area of Wichita, Kansas. This was reported in the first of these conferences held in Wichita on November 19 and 20, 1970. In January, 1971, we did receive a digitized version of the Wichita Falls DIME file.
This was done through the courtesy of the Census Use Study which felt, even though the file had not met all the quality control checks within the Bureau, that it was important for someone to start using a digitized DIME file at an early date in order to test some of the software and introduce this in a local information environment. Please note that the rest of this presentation deals with a file that is in fact not entirely up to the QC standards within the Bureau. The file does represent, to the best of our knowledge, the current digitized files that are being sent out by the Bureau. Major differences probably occur in the process of digitization within the Bureau itself. We feel rather strongly that the kinds of errors and examples of problems we ran into will not change substantially for the bulk of our users; hence, our report to you.

When we got the file in late January, we immediately copied it on a spare tape, looked at the format statements and waited for a double check with the Census Use Study on the status of the Coordinate Insertion program called COIN, which was developed by Mr. Leo Scheurman. We received the program shortly thereafter. It was not until mid-February that we felt our time schedule and project obligation for Wichita Falls would allow us to start with the file. We reviewed the software again and did make some changes in the format of the file for internal use. We also changed several of our programs to make them work a bit more conveniently for the process of debugging this file.

Our first burst of editing was started after we received our node number maps for which we, of course, had to send out a large amount of cash, as well as wait seven and a half weeks. We conducted a briefing session and isolated areas in the city where we expected problems with the node structure.
We also obtained descriptions of the address and street system and their update and maintenance procedures through the courtesy of the project team in Wichita Falls, led primarily by Eric Hanssen and assisted by people from the Planning Department. At the same time, we conducted informal sessions with the project staff on topology and its applications. This prepared them to use this structure and to anticipate special problems which do occur in networks of this kind.

By the end of April we had received our node maps, made duplicates, and received the go-ahead to begin to debug the file and try to target a "clean" file for the Wichita Falls people by the end of May. This task was accomplished with a crew of three programmers, systems analysts from our staff, with the cooperation of two project members from Wichita Falls. This task was completed under my direction by the end of May. Essentially, the elapsed time of the key effort ran from mid-April to early June. The actual debugging and use of the topo-edit took us about three weeks. It went very well.

As you can see on these slides, we dumped the file, printed records, bound the printout in a book and began staff briefings. We set up error codes for the kinds of errors found in this file. One of our research assistants came up with the scheme of hand-colored codes where there were specific error messages in the topo-edit. This, then, was also sketched on the map. This slide indicates the segments and nodes where errors were indicated on the first pass. We then went back to our main file, and looked at the errors on the maps and in the file. We tried to amend the errors due to reversal of nodes, lack of chaining, or any other cause. Through our remote terminals, we performed much of the corrected data entry, and we went through the second pass. After four passes, we completed the update of our DIME file and had it topologically correct.
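The kind of chaining check that a topological edit performs can be sketched as follows. This is a simplified illustration in Python, not the Bureau's topo-edit program: the `Segment` fields and the function name `unchained_blocks` are hypothetical, and the test applied (every node on a block's boundary must be entered exactly as often as it is left) is only a necessary condition for a closed chain. It does catch the node reversals and missing links mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One DIME-style segment record (field names are illustrative)."""
    seg_id: int
    from_node: int
    to_node: int
    left_block: str
    right_block: str


def unchained_blocks(segments):
    """Return {block: sorted nodes at which the block boundary fails to chain}."""
    balance = defaultdict(lambda: defaultdict(int))
    for s in segments:
        if s.left_block == s.right_block:
            continue  # segment lies inside one block (e.g. a dead end); skipped here
        # Orient the segment so that the block being checked is on its right.
        for block, start, end in (
            (s.right_block, s.from_node, s.to_node),
            (s.left_block, s.to_node, s.from_node),
        ):
            balance[block][start] += 1  # the chain leaves this node
            balance[block][end] -= 1    # the chain enters this node
    errors = {}
    for block, nodes in balance.items():
        bad = sorted(node for node, count in nodes.items() if count != 0)
        if bad:
            errors[block] = bad
    return errors


# A square block "A" bounded by nodes 1-2-3-4, with "OUT" on the other side.
good = [
    Segment(1, 1, 2, "OUT", "A"),
    Segment(2, 2, 3, "OUT", "A"),
    Segment(3, 3, 4, "OUT", "A"),
    Segment(4, 4, 1, "OUT", "A"),
]
# The same block with the nodes of segment 3 reversed during coding.
reversed_nodes = good[:2] + [Segment(3, 4, 3, "OUT", "A")] + good[3:]

print(unchained_blocks(good))
print(unchained_blocks(reversed_nodes))
```

Each pass of an edit of this kind yields a list of blocks and nodes to look up on the node maps, which matches the correct-and-rerun cycle described in the talk.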
In the second phase we took the topo structure and tried to edit this information using the coordinate locations as guides. These first slides indicate the file structure, that is, the topologic file with coordinates. You can see it is somewhat messy and contains some rather extreme errors. Our plan was to take this file and try to match it to the basic maps at a scale of one inch equals eight hundred feet for overlay purposes. We used a person from our own staff to visually match any errors in the network geography. The final file, by the way, had approximately 7 percent topological errors, for example missing links, open chains, and node reversals. I consider this an extremely good first try. Also, the geographic coordinates were reasonably accurate; and again, I feel the information provided got us off to an extremely good start.

On the overlay comparison we used a computer plot and the original node maps to overlay and identify digitizing errors. The digitized file contained two kinds of errors. One we referred to initially as the "coffee break" error; that is, the one showing a rather extreme dislocation of roughly ten thousand feet along one of the coordinate axes. We were later informed by the Bureau that this may literally be a mechanical slippage rather than the resetting of a digitizer dial that we had first suspected. We might call this a "digit slip" rather than a coffee break error. However, the latter is also likely to occur, given our experience with digitizing equipment in other projects. Rather than try to visually edit the entire city at once, we plotted each census tract and edited the entire area tract by tract.

Several other types of errors occurred in the digitizing, although the number of these was rather small. For example, there were sometimes problems of the block file not closing due to zero coordinates occurring, particularly on the margin of the map.
This, in the case of Wichita Falls, Texas, occurred with zero coordinates for State plane coordinates. In Texas the point of origin is in El Paso and the plotted distance would represent several million feet. This we controlled on a later edit, but it is something to watch out for. I suppose a simple search to identify (0,0) coordinate values would be worthwhile. These errors occurred primarily along key boundary areas, that is, census tracts and minor civil division boundaries which for some reason were miscoded with zero values.

Other types of errors probably had to do with the standard mental reversal of numbers. A sequence of blocks was digitized correctly, but the machine coder sometimes reversed numbers on keyboard entry. This is not particularly uncommon. It usually leads to fairly easily recoverable errors.

The final device we used, in addition to plotting the overlay as you have seen in these illustrations, was GRIDS. We received the GRIDS program from the Census Use Study under a special agreement to share our experiences in the use of this program and report any special problems with it to the Census Use Study. The remarkable thing about GRIDS was that it worked the first time we loaded it. One or two program errors were found, one of which had to do with its interpretation of some of the coordinate values; but it worked extremely well. It is an inexpensive program which can be used for printer-generated network checking, which is what we used in this particular example. Our cooperation with the Census Use Study was excellent, and I highly recommend the GRIDS package to most users. We were the first non-IBM user to use GRIDS and probably were the first to check an entire metropolitan area DIME file with this program.

In the end we had a debriefing session with our staff to go over the kinds of errors we had. We then decided to recap our experience and put it on slides. At a later date, we may make a movie simulation of our experience.
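The "simple search" for zero coordinates suggested above, together with a crude screen for the large dislocations described earlier, might look like the following sketch. It is written in Python for illustration only; the function name, the data layout, and the 5,000-foot threshold are assumptions and are not taken from COIN, GRIDS, or any Bureau program.

```python
import math


def flag_coordinate_errors(nodes, segments, max_length_ft=5000.0):
    """Screen a digitized node file for two kinds of suspect values.

    nodes:    {node_id: (x, y)} in State plane feet
    segments: iterable of (from_node, to_node) pairs
    Returns (zero_nodes, long_segments).
    """
    # A zero on either axis is implausible far from the State plane origin.
    zero_nodes = sorted(n for n, (x, y) in nodes.items() if x == 0 or y == 0)

    # A street segment far longer than any real block suggests a slipped digit.
    long_segments = []
    for a, b in segments:
        (x1, y1), (x2, y2) = nodes[a], nodes[b]
        if math.hypot(x2 - x1, y2 - y1) > max_length_ft:
            long_segments.append((a, b))
    return zero_nodes, long_segments


# Hypothetical values: node 3 was keyed as (0, 0); node 4 is displaced
# by ten thousand feet along the x axis.
nodes = {
    1: (2_000_000.0, 500_000.0),
    2: (2_000_400.0, 500_000.0),
    3: (0.0, 0.0),
    4: (2_010_400.0, 500_000.0),
}
segments = [(1, 2), (2, 3), (2, 4)]
zero_nodes, long_segments = flag_coordinate_errors(nodes, segments)
print(zero_nodes, long_segments)
```

The length threshold would need to be tuned to local block sizes, since legitimate long segments occur on the edge of town.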
The general DIME file topological edit is excellent. The structure and accuracy of the digitized DIME file provided us was quite good. The major portion of the DIME edit system that we did not check in this first phase was the address edits. Here we felt the software and the status of the address coding guide for Wichita Falls indicated some problems. Later on during our discussion, we can share with you some of our address edit problems. It may well be that between one third and fifty or sixty percent of individual address records or record sets contain some form of error in the GBF's created for metropolitan areas. One thing we know for sure is that the problem is even more difficult to tackle for larger areas. Geometric growth of error appears likely. In addition, there is the difficulty of finding a staff of people who are familiar with all the various address combinations possible in a large metropolitan area.

To review this slide presentation, we found that it takes a skilled and motivated staff to implement a DIME file. The software is fairly difficult to use on small installations. This software requires about 70K (bytes) and is designed for use on a third-generation computer. Plotter editing is rather expensive for most cities, and we would urge cities to use the metro maps with a GRIDS overlay to do any kind of editing. Once the DIME and digitized edit has been completed, one can build other files. For instance, one can produce some of the population density mapping routines we showed you with relative ease. One can rearrange areas and do the various kinds of analyses and operational checking necessary in municipal information systems.

The DIME topo-edit works. The digitized DIME file, or whatever we wish to call it, seems like a rather accurate file and certainly an excellent building block to use. I commend the Census for its efforts and urge other users to get prepared to handle what is a rather complicated file with a rather good payoff.
I hope you can get money to do this, and I hope you can get the users to keep the system updated.

Question Period

Mr. Fairbanks--How many records were involved in your update and edits?

Dr. Aangeenbrug--There were 7,100 segments, and in the address edit, we found 2,400 records that had address errors only. We have not really begun to check all of the variations of street spelling yet. Eventually we will get involved with that. The address edit presents the difficult task in this system right now. We have data annotations from the field operations such as: "This block side has an address range of 10 through 90, but there is a box number here that says RFD 3, and it is within the city limits of Wichita Falls." I am sure the fireman knows how to get there. There is a tremendous variation in density and in addresses in midwestern cities that are again different from the special problems you have. I am sure in the Northeastern United States additional and/or different variations occur. I think the Geography Division people have a pretty good global view of this problem.

Dr. Stevens--One point here. Did you say that the University is contributing to the project?

Dr. Aangeenbrug--That is right, about $120,000. It is contributing the time of several research professionals, as well as some computer time and other resources. I plan to contribute one quarter of my time to the project for three years. This, by the way, refers to the entire project effort over three years. It does not refer just to our geocoding efforts.

Mr. Fairbanks--What were the contents of the file you received nine months ago? Will it be similar to those files we will receive?

Dr. Aangeenbrug--It was the original DIME file with coordinates. The coordinates that we did get did not receive all the official approval in the Bureau; and we simply convinced them that some use was better than no use. We agreed to let the Bureau staff know what we found. It is just about three months lapse time.
Part of this had to do with our initial manpower. We did not have that much. We started with a team of three people initially. We had to wait before we received the cronaflex maps, and then we could not read some of the cronaflex metro maps. It took telephone calls. But you know, in order to send maps the Bureau still had to get money in the form of a check. I had to make out a personal check because the State cannot provide any, and I had to get a temporary loan from the University Endowment Foundation. The way in which this information is distributed by us to our user is quite different. I wish we could do it with GPO coupons or something. Somehow, all these little kinds of hang-ups caused us a whole lot of lapsed time. Cooperation in the Bureau was excellent. The system was complex enough and then there were interruptions: "Do not bother me now. There are problems in Covington, or wherever, and we have got to do this for very good or more important reasons for cooperation on the national level than there is with you." After all, somebody had to deliver something to the President in November. These kinds of problems held us up. Others had to do with training.

Mr. Hanssen--It might be of some use to know that as of July 1, Dr. Aangeenbrug's crew had expended 440 man-hours on topologic cleaning of the DIME file. That does not include the address check that we have done in Wichita Falls since that time. The only way to get geocoding started is to invest the manpower needed to clean up the files, find local government applications for it in the daily departmental environment where it will help somebody do his job, and work with him in his use of the file.

Mr. Johnston--Are your experiences documented?

Dr. Aangeenbrug--In part. We are coming out with a special publication on update and maintenance of the DIME file for Wichita Falls. We are currently having some difficulty with USAC, which has not yet put any of the experience in final print, officially approved.
This is being held back in terms of publication. We submitted our Systems Analysis Report, all 6400 pages of it, to USAC a year ago, and it is still not available to general users. This does not inspire people to document their work. The elapsed time is a serious handicap. I think, if I can use the words, that this is an extremely serious failure of USAC. You know, here are people that literally worked double and triple overtime on this project, first the project staff in Wichita Falls and then our editing crew. I can no longer inspire these people to get things in on time when the feedback on official publications is this slow. As of today, I do not know when these documents will be ready. I have gotten a small grant from the University of Kansas to put these experiences in a movie which will probably come out in 3-1/2 months. I keep a daily log on my USAC experiences. I am interested in this aspect of our project, and this Geographic Base File presentation will be, in part, incorporated into a training movie that will use some of these illustrations. I am afraid we will not really get anything out for another three months, in part because there has been a shift in funding which is not inspiring the people that are doing the task that we originally contracted to do.

USES, MAINTENANCE, PROBLEM SOLVING--NEW YORK CITY

Robert Amsterdam

GIST, New York City's Geographic Information System, has been in operation for the last 18 months. The principal elements of this system are the Geographic Base File, SAMS (Street Address Matching System) and SYMAP (general purpose mapping program). Both of those program packages were bought for use by the city. We are presently providing address matching and data manipulation services to about 20 city agencies. In addition, we have installed the GIST programs and files at the computers of six of the largest agencies, and their personnel have been trained to use the programs themselves.
Our goals are to encourage agencies to reformat and standardize their files and to use the more common geographic codes, such as census tract, census block, and health area, to promote more efficient communication within and between agencies and more use of non-city data, such as census data.

The initiation and development of New York City's GBF is largely the work of Mrs. Evelyn Mann of the Department of City Planning. I am sorry that she could not be here today to tell you about it, but she is attending another conference in Washington, D.C.

Our GBF is a blockside file with one record for each block face in the city, including underwater blocks extending out to the city's boundaries in the Hudson River and Atlantic Ocean, so that the file includes all land and water within the boundaries of the city. There are 155,500 records covering 37,600 blocks. Work on the file began under Mrs. Mann's direction about 10 years ago. Besides the City Planning Department, other agencies that have participated in development work over the years have been the Sanborn Map Company, the Tri-State Regional Planning Commission, the New York Police Department, our Bureau of the Budget and the Office of Administration.

The GBF is a 342-character record and the most significant fields are related to the street name and the range of house numbers. The record may contain up to three old or alternate street names for that particular block side, since for some streets different segments have different names. Each record contains the names of the two intersecting streets, identified at the high end and at the low end. There is an odd-even parity code so that if the range of house numbers on the block side (the high number and the low number) are both even yet odd numbers are permitted on that block face, the parity code will identify that fact. This happens in New York frequently on block sides that are facing parks or some other open area.
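How a house-number range and a parity code can work together in address matching is sketched below. This is an illustrative Python sketch, not the SAMS program or the actual 342-character record layout: the `BlockSide` fields, the boolean `mixed_parity` flag standing in for the parity code, and the sample street are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlockSide:
    """A much-reduced block-side record (field names are illustrative)."""
    street: str
    low: int             # lowest house number on the block side
    high: int            # highest house number on the block side
    mixed_parity: bool   # True if both odd and even numbers occur on this side


def matches(rec, street, number):
    """Return True if the address falls on this block side."""
    if street != rec.street or not (rec.low <= number <= rec.high):
        return False
    if rec.mixed_parity:
        return True  # e.g. a block side facing a park carries both parities
    return number % 2 == rec.low % 2  # otherwise parity must agree with the range


ordinary = BlockSide("EXAMPLE AVENUE", 800, 898, mixed_parity=False)
park_side = BlockSide("EXAMPLE AVENUE", 800, 898, mixed_parity=True)

print(matches(ordinary, "EXAMPLE AVENUE", 851))   # odd number, even-only side
print(matches(park_side, "EXAMPLE AVENUE", 851))  # odd number, mixed side
```

Without the flag, an odd house number on a park-facing block side whose range endpoints are both even would fail to match.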
Each record contains the census tract and block number, tax block number, and also a tax block suffix number which was added to the field because tax block number is not unique. The record contains a health area, health district, ZIP code, and community planning district, which is a community identification and has a legal definition. The community planning districts are defined in the City Charter and a community planning board has been established to participate in local planning of new facilities and services in each area. There are 62 community planning districts. Finally, we have the x-y coordinates of the block center. We have determined the center point of each block and all records related to one block will have the x-y values for the same point. That x-y is based on Tri-State Regional Planning Commission's digitizing work.

The basic information in the GBF is updated on an annual cycle, borough by borough. One of the five boroughs is always being updated, both to correct errors and to add new information. In the cycle that is going on now, we are adding 1970 census tract and block numbers and school district codes. We also expect to add the Police Department's street code numbers.

I distributed copies of a report, "An Introduction to GIST, New York City's Geographic Information System" (see Appendix II). Earlier this year it was distributed by Deputy Mayor Timothy W. Costello to all New York City administrators, commissioners, directors of planning, directors of research and directors of data processing. This was designed to acquaint them with what information GIST had and what capabilities were available. The response to that, I would say, was very satisfactory. We are now getting calls, on the average 3 or 4 each week, from different city agencies asking to use GIST in studies with immediate objectives or in longer-range projects where the existence of GIST makes the other project feasible.
I have also distributed a pamphlet which summarizes all applications which the GIST staff has participated in. It could have been larger if I had included projects that we plan to work on and things that other agencies are considering, but I limited it strictly to things that we have put effort into. I will mention four of these as examples of what we are doing.

One significant one is the development of a new Vital Statistics Processing System. There are 130,000 birth records and 90,000 death records processed annually in New York City. The Vital Statistics Unit has clerks who code each form as it comes in to indicate the method of delivery, age and condition of the mother on birth records; the cause of death and other medical information on death records. In addition, these clerks are required to code the health area and health district for each address based on looking up the information in a handbook. We expect to replace that part of their operation by doing the address matching automatically and coding, through the computer, not only health area and health district but also census tract, census block, community planning district, and school district, so that this information is all captured on the vital statistics files at the same time. The new system, which is expected to start next January, will be more accurate and less costly; and it will also make the birth and death data much more useful to other city agencies, in addition to providing the regular weekly reports that the Health Department now distributes.

A second application which has been important to the city government involves the multiple-dwelling unit file. This is a file that has been maintained for many years by the Department of Housing Maintenance. It lists every building in the city with four or more apartments. It contains about 147,000 records. The addresses and the street spellings in this file are well maintained.
We matched that file to add tax block numbers to the records, and we matched all but 1,500 records. Those 1,500 are being individually reviewed to determine why each failed to match. There are a number of different reasons. One significant one is that our file, right now, contains only current official street names while the multiple-dwellings file has old names and unofficial street names that are commonly used. Also, where we had information that a street had been closed by City Council action, the street had not yet been physically closed at this point. There were still buildings on that street and there were people living in those buildings. Our problem was that our file was too up-to-date. In the future we will not delete records from our file until there is physical confirmation that the street no longer exists. The reason this work was done was to enable the rent control operation to get information on assessed values, taxes, and tax areas from the Finance Administration. They needed the tax block number verified in order to get financial data. The multiple-dwellings file will also be used in our Community Shelter Plan project.

Another project involves the file the Office for the Aged maintains for the people who have applied for half-fare passes. We have a program in New York under which any resident over 65 can apply for a half-fare transit pass. This will allow them to use buses and subways at half price during non-peak hours. There are 526,000 people in the program, 526,000 records. We successfully matched 92 percent of this file. The results are being used to prepare a sample for a new Federal program that the Office for the Aging is starting, Dial-a-Ride. They want to find out how much need there is for a taxi service or some kind of a car service which would take elderly people to hospitals and clinics.
Also, the half-fare file is being used to determine the effectiveness and the reach of the Office for the Aged's programs, that is, how much response they are getting in each community of the city.

Another project has used the Census Bureau's file of retail establishments. We were successful in convincing one group in the Census Bureau that GIST and the Street Address Matching System were the best that could be used in working with New York City data. Under Census Bureau supervision, their file of 70,000 retail establishments was address-matched with an 87 percent success rate. One of the reasons we did not do better was that we found out afterwards that the Census Bureau used a three-letter abbreviation for boulevard which we had never encountered before. I think it was BLV. This was an abbreviation that was new to us. We think we failed to match about 1,000 records just because of that one abbreviation. In addition, many retail store owners do not know their address. As is common, they give the intersection of two streets for their address, and we cannot match on street intersections.

Now for problems in using the file. I will mention four of them. Alternate names are a problem. I mentioned before the problem in which the street is known by one official name and has another local name in the community. There are many of these, many different variations on this situation throughout the city. Perhaps the most striking one is in the community of Whitestone in Queens. Here the city has official street names and official house numbers which were established about 20 years ago. Many of the homeowners in the area still do not recognize these and they use their old house numbers, which are what appear on their front doors, and they may use the old street names; or they use the old house number with the new street name; or else the new house number with the old street name, so that four permutations are all in use.
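One way to make all four combinations of old and new house number and street name recognizable is to index every combination against a single record, as sketched below in Python. This is only an illustration of the idea and not the design of GIST or SPRINT; the record layout, the function name `build_alias_index`, and the sample addresses are hypothetical.

```python
from itertools import product


def build_alias_index(records):
    """Map every (house number, street name) variant to its record id.

    Each record lists the house numbers and street names in use for one
    building, official values first.
    """
    index = {}
    for rec in records:
        for number, street in product(rec["numbers"], rec["streets"]):
            index[(number, street)] = rec["id"]
    return index


# Hypothetical building known by a new official address and an old local one.
records = [
    {
        "id": "BLDG-0001",
        "numbers": ["149-20", "12"],              # official, then old
        "streets": ["15 AVENUE", "ELM STREET"],   # official, then old
    }
]
index = build_alias_index(records)

print(len(index))
print(index[("12", "15 AVENUE")])  # old house number with new street name
```

A single index of this kind grows with the number of variants per building, so it trades file size for not having to maintain several parallel files.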
The Police Department, for their on-line dispatching system, SPRINT, has had to build four parallel files to describe the area. We will have to adapt to the same situation. We are developing procedures to take in as many of these old and alternate names as possible and make them recognizable. Right now, though, we use only official names in address processing. The old and alternate names that I mentioned before are not available during normal processing; they are in the file as reference only.

Another continuing problem for a number of years has been discovering when a legally mapped street is officially opened. We have tried to track this down through all the city agencies. The latest clue we have had is from the Department of Traffic Control, which is responsible for putting up street name signs, directional signs, stop signs, and the other types of traffic signals. We asked them how they are informed when a new street is opened up. What we were told was that generally they receive no official word from the Borough Engineers who planned the new streets or from any official source within the city. Generally, a homeowner complaint reaches them: "We have been living on this block for a year now, and we still do not have a stop sign," or "We still do not have a street light," or "We still do not have a street name sign." The Traffic Department personnel respond to this; they investigate and determine that there should be a stop sign or some other type of traffic sign on that street, and go to work on it. Traffic Department people are concerned about closing this communications gap, and we are working with them. We believe this channel could be the most effective way GIST will have to find out when a street opens.

Another problem to mention is that many of the streets on Staten Island and in parts of Queens are going up as part of private developments. The developer may or may not turn the road bed, the street bed, over to the city.
If he does, there is generally a period of about two years when the legal papers are in process. That is, the action of the city accepting responsibility for maintaining the road is in the hands of the city legal counsel. During this period different city agencies have differing legal positions as to whether the road does exist according to the standards and requirements of each agency.

Another problem for us is familiar place names, which have been called traffic generators. These include the names of hotels, hospitals, and apartment buildings where people live and do not use their normal address, but rather the name of the building, as their residential location; also government buildings, department stores, theatres and other public buildings. These we are trying to include in the GIST geographic base file right now. The same applies to street intersections. Therefore, the UNIMATCH program, which was described yesterday, would be a very powerful help to us.

I would like to talk a little bit about what we have done with x-y coordinates. In 1962 the Tri-State Transportation Commission, as it was then called, had the entire region that they were responsible for flown; they had aerial photographs taken, and digitized every block in New York City. They digitized all corner points. The file was edited and corrected to a very great extent and was used by Tri-State in land use studies and traffic studies. The file has been available to the city for a number of years. It is a reasonably clean file, but Tri-State never updated it. Around 1969, when GIST was getting organized, we determined that the easiest way to use this file was to take the average of the corner points for each block, call this the center of the block, and use this point as the x-y identification for the block. It has its limitations. However, we have found it useful for many city activities.
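The block-center calculation described above, the average of a block's digitized corner points, can be written in a few lines. The Python sketch below uses a hypothetical function name and hypothetical coordinates in feet.

```python
def block_center(corners):
    """Return the mean of a block's corner points as its (x, y) center."""
    if not corners:
        raise ValueError("a block needs at least one corner point")
    n = len(corners)
    return (
        sum(x for x, _ in corners) / n,
        sum(y for _, y in corners) / n,
    )


# A hypothetical 200-by-800-foot rectangular block.
print(block_center([(0, 0), (200, 0), (200, 800), (0, 800)]))
```

For an irregular block the mean of the corners is not the true centroid of the area, and for an L-shaped block it can even fall outside the block, which is one of the limitations of a single center point.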
I think the question this meeting should be spending more time on is, "What are we using x-y coordinates for?" I will briefly describe the applications for New York City.

In the yellow booklet that I handed out, the map on the cover is an example of an incidence map. This was done for a study of property tax policies. We were asked to provide maps of 10 different land use classifications, for each of which the study had prepared a sample of a thousand buildings. You will find toward the back of the booklet maps of single-family houses, walk-up apartment buildings, and elevator apartment buildings. You can see the difference in the distribution of the different samples of buildings. This was important to the tax study to show where the buildings lie and gave some insight as to what the property values ought to be in those areas. In a companion set of maps, each building is represented by a symbol which indicates its relative assessed value.

We also used x-y coordinates in our ambulance allocation project. We wanted to determine what the incidence of calls from specific localities was, and we used this information as a means of determining ideal locations for satellite ambulance stations. These are now being situated at sites away from the hospitals and closer to the areas of greatest need.

We have also used x-y coordinates, and they have been critical, in the CSP (Community Shelter Plan) which New York City is developing. Here it is critical to be able to determine the location of disaster shelters and of the population clusters, to be able to calculate the distance from people to each nearby shelter, and to determine the most feasible shelter assignments.

A final request that we are working on right now is from the Sanitation Department. They recently obtained high-capacity compactor trucks which pick up waste from hospitals and housing projects where trash is left in large bins.
The trucks are equipped with fork lifts which can pick up the bin and dump the trash right into the truck. These trucks do not operate normal routes, but rather go from designated site to designated site. The problem the Sanitation Department has is to determine the most effective routing for these trucks. The x-y coordinates of the 1,000 sites where pickups are made are the input we are providing. With these data, optimum routes can be calculated.

So I am saying that we have used x-y coordinates extensively, and the coordinates that we have right now have been valuable. The question I would like to pose to this group is, "Do you know how x-y coordinates will be used in your city?" DIME files are being produced and are now becoming available. They will need some more corrections, but the time is approaching when this group has to ask, "Will the localities make use of the DIME files? Will it be cost-justified to maintain the x-y coordinates, which are more expensive to update than, let us say, an address range? Will there be benefits to the local government?" I raise these questions because they will determine whether or not resources are provided to maintain the GBF and, if it is maintained, in what manner.

I will mention one more subject that we regard as a problem right now; that is, whether New York City should go into the DIME project, and in what way. If I can attempt to summarize the feeling of New York City agencies involved in this work, it would be that the present GIST block center points are adequate for a large number of uses. They are not ideal. There are certain things that we cannot do with these points. For example, we cannot distinguish different locations on one block, nor can we identify points at street intersections. On the other hand, we feel that to have just one point for each node, as the DIME file has, would not give us that much more information.
The possible users that we are seeking to address ourselves to now in New York are users who are concerned with the width of the street and the exact configuration of blocks and streets. These can have very peculiar patterns, and sometimes the DIME node system conceals the actual complexity of an area like Herald Square or Fifth Avenue and 23rd Street, where there are many streets coming into an intersection. The block that could appear on a DIME file as a small triangular block may in fact be no block at all, but simply a widening of the roadway where two streets begin. These are problems that we have to address ourselves to if we want to serve the Highway Department, the Traffic Control Department, and the utility companies. They are all interested primarily in activities in the road bed of the street.

Our view at this time is that if we do try to develop anything more precise than our present coordinate system, it will be a system that identifies the corners of the blocks so that we can depict the actual shapes of the blocks, and also it will identify intersection nodes so that we can make use of DIME-type edit programs. Of course, we would want to be sure that we had a method of updating this as time went on. This is the challenge we face, and this is essentially what we are doing at this time.

Question Period

Dr. Aangeenbrug--I almost hesitate to ask any questions since I used to work for the Port of New York Authority, and I remember the headaches of New York streets. I am quite interested, however, in how you can make any sense of the Tri-State file when used with trucks, when you have such problems as turn penalties for one-way streets. You do not want to send the garbage truck up the wrong way.

Mr. Amsterdam--With the GIST geographic base file that I have described here, that is, with block center points, we do not have access to information on length of street, nor do we have data on direction of traffic flow. The Sanitation Department people are not too concerned with turn penalties. They are more concerned with the fact that in a housing project the disposal point may be in a back alley so that the truck driver spends half an hour backing down an L-shaped driveway in order to get to the point where he is going to make the pickup. This is a more important time penalty than the time for turning with the direction of traffic. The delays at the site and the time spent in queues are where the time is lost. Obviously, a GBF is not going to give them any information about this. They are collecting field information to use in reducing these delays.

Dr. Aangeenbrug--We made a heroic attempt at using the x-y coordinates for the entire Tri-State area. One thing most people are not aware of is the particular way the x-y coordinates are arranged. It is an interesting story, and maybe you can shed some light on it. We found that the axis of the whole system was set on the major north-south thoroughfare for the island of Manhattan. Rather than running north-south like most conventional coordinate systems, the axis is centered on one of the Manhattan streets.

Mr. Amsterdam--You are correct. The Tri-State Commission chose as their reference point Columbus Circle. They wanted a point near the center of the three-State metropolitan region, a landmark that was well known and fairly explicit, such as the Columbus Circle Monument. You are right. They did align their grid system with the major north-south and east-west streets of Manhattan, so that the north-south axis is 8th Avenue, which becomes Central Park West, and the east-west axis parallels 59th Street, which becomes Central Park South. This was convenient for their measurements in Manhattan. A lot of the Tri-State work is concerned with trips into the Central Business District, which is Manhattan south of 59th Street, and much of their trip coding was done clerically. This was entirely their own decision.
It was a very convenient tool for them, and it certainly has been convenient for us. Another point to mention is that they gave Columbus Circle an x-y value of 500-500 rather than 00-00. All readings in the region, therefore, are positive readings and the actual origin point is somewhere in North Carolina, which is immaterial for all of our calculations.

Dr. Aangeenbrug--One of your questions addressed itself to the potential use of DIME type information. I think the DIME features could fairly easily be extended. For instance, in an area like Times Square, having essentially no addresses on one side, it could be given a special category of addresses, say by identifying it as parks or traffic zones. Also, depending on how the DIME coding is done, you can collapse these nodes to special points. For example, we did traffic assignment in the Port of New York Authority Terminal, and we assigned zero value links and additional dummy nodes, extra zero coordinates, to indicate a cluster of interchange linkages. You can extend a topologic file this way. This would allow us to trace a trip which comes into the Port of New York Authority Terminal and connects with 56 mid-distance regional buses. There simply would be a zero turn penalty for each possible trip linkage. You could see the opportunity linkages we had, and ultimately, we could add cost and time constraints. I think the DIME file, essentially a similar network, can be used in New York City in the same fashion. It is not all that difficult. The only problem you face, of course, is the updating and maintenance of such an elegant file. I suggest that if you take the DIME file at all, you simply take it down to Manhattan south of, say, 59th, and see if you can get it to work there. Anyone that would want to use the DIME file and tries to debug it for all of New York City is dealing with too large and unmanageable a matrix.

Mr. Amsterdam--I won't argue with your last point.
Again, let me say our interest is not so much on traffic studies within the city. A lot of these are done, as you point out, by the Port of New York Authority, by Tri-State, and by other regional agencies. But the needs within the city are more focused on things like highway paving. The Highway Department wants to know what size area of paving surface they will be confronted with for a specific contract. The width of a street is important to them and how that width varies, whether abruptly or gradually. Also the exact configuration of the block is important. All of this the DIME file conceals. It does not give you clear information, and sometimes it gives you misleading information.

Dr. Aangeenbrug--No, you can use it. You can chain down the street as long as you have records that are tied to the DIME file, to an individual block face, or to the segment in this case. You can add pseudo nodes for complex blocks. If the Traffic Department keeps files by segment, the DIME system can carry it, depending on what conditional information you tie to it. It is simply an edit system which chains nodes and provides you information about a particular length of link. Even that you would have to add because you do not know the length of the link. All you want to know is the condition of the street. That has nothing to do with DIME, but through the use of the DIME system you can look that information up with very great efficiency. I think there is a potential there that the city ought to look into. I would suggest using a small sample, a place that has most of these problems.

Mr. Meyer--I was wondering if you could spend a few minutes describing New York's Super-DIME Program?

Mr. Amsterdam--The Super-DIME Program is being discussed within the Office of Administration and the Department of City Planning now. It is an approach that would digitize block corner points, and we are investigating the feasibility of using high speed optical scanning equipment in this area.
There are a number of devices that have been developed in recent years that can scan and record what they see on a black and white page -- even a page that is printed in colors -- and record each color separately. They can also do some types of reasoning, analytical and interpretive work, analyzing what it is that they have seen, and not simply do what a camera does in reproducing what it was exposed to. You are permitted to make selections of the material. One of the devices we are looking at now has promise of being able to identify numbers so that we could identify tract numbers and block numbers from our maps and record these at the time we would be recording the block. Then we would have the capability to retrieve a specific census block or census tract and work with the lines in that area. We think it will be possible to use this method to merge the data on the map with the data in the Address Coding Guide in some automatic fashion so that a large percentage of the file would then be encoded, digitized and made into a GBF automatically.

I think there are a number of problems here. You can do a high percentage of the work automatically, but the percentage of the work which is left to be done by hand could be just as laborious, or even more laborious and error-prone. Let us say, for argument's sake, that 85 percent of the blocks can be successfully digitized and merged by machine, but there are oddly-situated blocks and parkways and other types of situations scattered throughout each map that then have to be done by hand. These could conceivably be more difficult to encode than is the encoding involved in the standard DIME file. These are the problems that we are trying to resolve right now.

Mr. Cooke--I have a question which has to do with terminology here. DIME is a pretty well defined term standing for Dual Independent Map Encoding.
It describes a geographic base file which is in network form and has been recorded by redundant encoding of incidence matrices describing the network. I do not quite see how a "Super-DIME" file generated by an optical scanner would be a correct usage of this terminology. Could you explain that?

Mr. Amsterdam--Mr. Cooke points out that what we would get as a first step from this scanning operation would simply be the encoding of the block corner points. We would then have a second step to do, and this we would probably want to do if we went into this at all, and that is to come up with an intersection identification so that there would be a unique intersection number, at least, that could be used in DIME edit programs. We would then have available to us the resources that are part of the DIME maintenance activities.

Mr. Hanssen--I would kind of like to reiterate something you said about coordinates being given up. Coordinates in our system presently are only approximate. In other words, they are relative locations in geography. However, there is a great deal that can be done with just these types of coordinates, forgetting for the moment the highly pursued "Engineering Accuracy." Forestalling any development of geocoding systems while waiting for the development of a precise set of coordinates would, I think, be a great mistake. I think that what you are doing is the proper step to take.

Mr. Silver--You indicated that the GIST file has been used. How do file records that do not match and are corrected get back into the file? Is there a system that you have developed to get corrected information back into the file?

Mr. Amsterdam--Essentially right now we do. As I described with the multiple building file, where we are anxious to get 100 percent of these building addresses accepted and identified with the GBF, we do want to make all these corrections, indeed, with any error that we find in the GBF. These have been a particular concern to us.
The procedure at present has been that this is work that the Department of City Planning is responsible for. Their maintenance of the address information has been done generally on a yearly cycle. This means that one of the five boroughs is always being updated. When mistakes are discovered, or new data prepared, they are carefully noted and put into ledger cards for the next update cycle, which could be anywhere from 2 to 8 months later. Now there are times when this has not been fast enough to suit some users, and we are considering changes in this procedure. Generally, this timing has been adequate. Also, it has been the best we could do with the resources that we had available. We realize, however, that for some uses we should be responding faster to identification and correction of errors.

Dr. Aangeenbrug--What are you using for your incidence mapping, MAP 01, GRIDS?

Mr. Amsterdam--SYMAP. This is what we have used for most of our work. The incidence maps that you see are the only such maps that we have done. Generally, we have done contour maps or district maps. I might mention that we have also done another type of mapping, and that is flow line maps where we show the direction of movement from a point to another point. The character used to print that line and the width of the line tell something about the volume of flow along that line segment. These are all capabilities that we have developed within the SYMAP system.

Mr. Bieda--I am wondering what success you have had with working with some of the utility companies or the utility service companies? How does this relate to your efforts at this point?

Mr. Amsterdam--We have contacted all of the major utilities in New York: Con Edison, the New York Telephone Company and Brooklyn Union Gas. There are people at all of them who are familiar with our work. We have had some follow-ups with some of the utilities. They are interested, and I guess when the carousel starts going fast enough they will hop on board.
They are very specifically concerned with how we describe the street network. When and if the city moves forward in this intersection file, or this improved DIME file that I have been describing, I think that is when the utility companies would be most interested. Clearly, they are most concerned with identification of their lines, pipes, switches and other resources that are buried under the ground.

USES, MAINTENANCE, PROBLEM SOLVING--SANTA CLARA COUNTY

Richard W. Renshaw

Santa Clara County, as most of you probably know, is a metropolitan area located at the southern end of San Francisco Bay. It has experienced rapid and sustained growth since World War II, reaching a population of over one million as of the 1970 census. Its economy shifted from agriculture to manufacturing based upon modern research and development in the electronic and aerospace industries. The county government is sufficiently large and sophisticated to pioneer in information systems, primarily through its Local Government Information Control (LOGIC) system. We have, however, been handicapped by not having a satisfactory Address Coding Guide (ACG) or DIME file. Most of our work with geographic reference files has been in the development, maintenance, and use of our local Property Location Index (PLI). Consequently, there are some significant differences between what we have done in Santa Clara County and what has been discussed by other speakers. Before I describe our PLI file, I would like to quickly discuss some of our applications to give you a feel for what we have been doing and the problems encountered. Later I will cover the PLI and its maintenance and then get to the crux: problems and problem solving.

Local Applications

1--Criminal Data System Justice Pilot Program--Baseline

One of our most recent applications of ADMATCH has been to match crime records to the Property Location Index (PLI) to obtain the coordinate location of the incident, census tract and other geographic information.
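The idea of matching a record's address against a reference file to pick up coordinates and a census tract can be sketched as follows. This is not ADMATCH's actual algorithm or record layout, which the paper does not describe; the `ReferenceRecord` fields, the `normalize_street` rule, and all sample values are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class ReferenceRecord:
    """One hypothetical reference-file record covering a range of house numbers."""
    street: str      # street name, already normalized
    low: int         # lowest house number in the range
    high: int        # highest house number in the range
    tract: str       # census tract code
    x: float         # grid coordinate
    y: float         # grid coordinate

def normalize_street(name: str) -> str:
    """Crude normalization (assumed): upper-case, drop periods, collapse spaces."""
    return " ".join(name.upper().replace(".", "").split())

def match_address(house_number: int, street: str,
                  reference: Sequence[ReferenceRecord]) -> Optional[ReferenceRecord]:
    """Return the first reference record whose street and range cover the address."""
    key = normalize_street(street)
    for rec in reference:
        if rec.street == key and rec.low <= house_number <= rec.high:
            return rec
    return None  # no match: the address is missing from the reference file

# Invented sample reference file.
reference = [
    ReferenceRecord("MAIN ST", 100, 198, "5001", 1520.0, 310.0),
    ReferenceRecord("MAIN ST", 200, 298, "5002", 1530.0, 310.0),
]

hit = match_address(120, "main st.", reference)
miss = match_address(999, "Main St", reference)
```

A `None` result corresponds to the failure case reported for the test run discussed in this section, where most unmatched records were simply addresses missing from the reference file. Intersections and place names would need separate lookups that this sketch does not attempt.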
Actually the pilot program has been matching their records to the PLI to obtain coordinates for some time. This is just the first attempt to use ADMATCH, which may offer a more sophisticated technique, if not improved results.

This application has some intriguing possibilities since it is representative of the much larger world of operations involving all of the emergency service functions of local governmental agencies. It also poses some of the most challenging problems in the use of geographic reference files. If we are not able to meet these challenges in a relatively controlled research environment, what are the prospects for success under real time emergency conditions?

Crime and emergencies are not confined to locations which can be neatly and uniformly described by simple addresses. A significant portion of the offenses occur in the streets and must be located by intersection rather than house number. Some locations are more easily identified by a place name, building or business name. Joe's Bar, Sears, Jake's Steak House, the Swenson Building or Eastridge Shopping Center may be the only, or the most meaningful, location description available on a report. We would like to think that we know how to deal with these problems, but implementation may be something else. Ironically, with as many intersections and place names as were in the last test, it is an interesting commentary on the state of the art to note that the majority of records failing to match were due simply to addresses missing from the reference file.

2--Education Department--Welfare Data Application

The County Office of Education is required to submit a tabulation of children in each school district who are on the Welfare Aid for Families with Dependent Children (AFDC) program. In the past this tabulation has been produced manually. This year the problem was given to RECAP, the Regional Education Center for Automated Processing, which provides services to education facilities in Santa Clara and several surrounding counties. The Planning Department assisted RECAP in conducting a test of ADMATCH to pick up the school district code and other geographic references from the PLI reference file. The agencies involved seemed quite pleased with the test results, especially in contrast to other methods which had been tried, and are proceeding with plans for a production run when the next report is due this winter. This application is an example which requires a reference file with school district codes, a local data unit which does not come as a standard feature on the ACG/DIME files, but is very important for many potential applications.

3--Voter Registrations

In an early attempt to use ADMATCH with a large file we matched 440,000 registered voters against the PLI reference file to identify the 1970 census tract to enable us to tabulate registered voters for purposes of reapportionment. Later the results were also used to help evaluate the election results of two transit issues which lost in the 1970 elections. Before this task was completed, we had a much better appreciation for both the strengths and weaknesses of all the files we were working with at the time.

4--Establishment-Employment File Development

Our first use of ADMATCH was to match employment records from our State files to the PLI to pick up census tracts. Later the output from this match was used as input to match the employment file against the unsecured tax roll to identify branch facilities and businesses not covered by the State employment records.

Geocoding is only one of the major problems of tabulating employment by small areas. Classification, branch facilities, and other source data problems are also severe problems which must be solved. Since this application involved matching business establishment records, address to address, it provided a good shakedown for ADMATCH, and poses some formidable challenges to the technicians working to make this software more suitable for local operations.

5--Other Applications

The Planning Department has been working cooperatively with a local bank on an application that is essentially an extension of our own establishment-employment file. In this case the bank wanted to match a commercial file of business records to our PLI to pick up grid coordinates and census tracts to help evaluate branch bank locations. Banks, utilities, and other private businesses have considerable potential interest in the application of geographic reference files and tools and could make significant contributions to their development, maintenance and use.

Some other applications of geographic reference files have included the preparation of a Census Tract Street Index publication and several special indices for local usage. Our 1966 Special Census records and the 1967 Land Use files were cross-coded to the 1970 census tract codes. Considerable effort was made during the ACG Improvement Program to coordinate it with the editing of our own PLI file. The Department has done some experimental work with the GRIDS graphic display program, and now that we have grid coordinate information on our reference file, we hope to do considerably more with this tool.

Reference File Preparation and Maintenance

As emphasized earlier, we still do not have a usable Address Coding Guide or DIME file. While this has handicapped us in many respects, it has forced us to rely more heavily upon our own local resources. Consequently, we have probably made significant gains in terms of learning how to use and maintain these local files, which will help appreciably when the DIME file does become available.
The County has a Property Location Index (PLI) which contains nearly 300,000 parcel-situs address records in the County with geographic information concerning the Assessor's parcel number, street address, census tract, ZIP and post office, tax area code, grid coordinates for the center of the parcel, and some other references. This file was created several years ago and has been developed and maintained under our LOGIC program. The maintenance for this file is primarily the responsibility of the Assessor's office and Data Processing. The Planning Department has responsibility for maintaining census tracts and other special geographic codes and has participated in some special editing activities.

The PLI file will become part of a larger property system. The design of the initial property system has just been completed and implementation is expected to start in the near future. The new system anticipates that the Public Works Department will be brought into the program with maintenance responsibility for the grid coordinates of parcels.

The PLI file has been the basic source used to generate most of the geographic reference files we have used in our work. To reduce the PLI file down to a more manageable reference file suitable for address matching and other applications, we compressed the PLI into "least common records." The "least common record" is a record with parcel number and address ranges for all properties which have common geographic codes and street names. It is essentially a block face segment similar to those in an Address Coding Guide except that it represents actual address ranges rather than "theoretical" ranges. The file also contains all of the important local geographic codes maintained in the PLI file. We are now carrying a count of the situs addresses contained within the address range for the least common record so we will know how many addresses are represented by each record.
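The compression into "least common records" described above can be sketched roughly as follows. The paper does not give the actual PLI layout or the County's procedure, so the field names (`street`, `house_number`, `tract`, `zip`), the choice of grouping key, and the sample data are all assumptions for illustration only.

```python
from collections import defaultdict

def least_common_records(pli_records):
    """Collapse parcel-level records into one record per (street, codes) group.

    Each output record carries the actual low and high house numbers found in
    the group and a count of the situs addresses it represents.
    """
    groups = defaultdict(list)
    for rec in pli_records:
        # Records sharing a street name and the same geographic codes fall together.
        key = (rec["street"], rec["tract"], rec["zip"])
        groups[key].append(rec["house_number"])

    result = []
    for (street, tract, zip_code), numbers in sorted(groups.items()):
        result.append({
            "street": street,
            "tract": tract,
            "zip": zip_code,
            "low": min(numbers),     # actual, not theoretical, range
            "high": max(numbers),
            "count": len(numbers),   # situs addresses represented
        })
    return result

# Invented parcel-level sample records.
pli = [
    {"street": "MAIN ST", "house_number": 101, "tract": "5001", "zip": "95110"},
    {"street": "MAIN ST", "house_number": 135, "tract": "5001", "zip": "95110"},
    {"street": "MAIN ST", "house_number": 117, "tract": "5001", "zip": "95110"},
    {"street": "MAIN ST", "house_number": 201, "tract": "5002", "zip": "95110"},
]

compressed = least_common_records(pli)
```

A fuller version would presumably also separate the two sides of a street and carry the other local codes kept in the PLI; this sketch groups on three fields only to keep the idea visible.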
It is this file of "least common records," or what I commonly call the PLI Reference file, which has been used for all of our work with ADMATCH and most of our other applications.

Problems and Problem Solving

1--Local Reference Requirements

Working with local geographic reference materials and file maintenance operations has given us an excellent opportunity to observe the strengths and limitations of our files, learn where maintenance problems occur and get a feel for how we can apply this information in a wide variety of applications. From our experience, it is apparent that to develop and maintain the geographic reference information necessary to carry out the applications important to the operations of local government, it will be necessary to have a system of compatible files which would include:

a) a good base mapping program to provide the primary geographic controls,
b) a DIME file to control geographic references in computer-readable form at the level of the street and boundary network, and
c) a parcel-situs address oriented file such as our Property Location Index to control and maintain the more detailed geographic units which lie within the structure described by the DIME file, such as parcels, house numbers, buildings, etc.

After having gone through several PLI maintenance cycles, it is clear that the major problem is the lack of adequate control information to ensure accuracy and consistency of street names, coordinates and geographic units coded in detail for 300,000 parcel and situs addresses. I am hopeful that our base mapping program and the DIME file will provide the added controls necessary to make our PLI file more adequate as a geographic reference tool.

At the other end of the maintenance problem, I am convinced that our PLI maintenance program is doing things and providing critical information that could not be adequately maintained at the DIME file level alone. Our Assessor, the Registrar of Voters, and Public Works Department are processing transactions daily for parcel and situs addresses to update their files. New house numbers, parcel identification, multiple situs addresses, building names, and shopping center references are some of the detailed geographic units lying within the street and boundary network of the DIME file which can probably be maintained and utilized better at the level of our Property Location Index.

2--Organization

Local governmental agencies can provide a wealth of information needed to develop and maintain geographic reference files. The county has good maintenance programs already established for the Public Works base maps and the Registrar of Voters files in addition to the PLI program in the Assessor's office. There are also many other public and private agencies which would have valuable information which might be utilized in our files. There are many professional and technical people with vital skills in base mapping, editing and maintaining files, systems analysis and programming, applications planning, and knowledge of operations scattered throughout many of these agencies. Development of an organization and procedures necessary to utilize and coordinate these resources is a major problem and challenge to the maintenance of any geographic reference file.

Organization and financial problems exist at all levels of government but are particularly acute at the local level, where the skills and resources necessary for an effective system are split among numerous semi-independent agencies too small to carry out an effective program by themselves. The inability to coordinate and make effective use of these skills and resources is costly in terms of money, but the cost in terms of errors and conflicting information is equally important.

3--Making Effective Use of ADMATCH

ADMATCH is a highly sophisticated tool with extensive capabilities and potential.
However, the existing user's manual is inadequate to do justice to either its capabilities or limitations. We found it helpful to do quite a bit of experimentation to determine more precisely what we could accomplish with this software. One of our most productive experiments was simply to punch up a deck of cards with about 150 examples of address patterns which were of concern to us, preprocess the deck and dump the results to see what ADMATCH did with each case.

The tables used by ADMATCH, especially the pattern recognition table, allow the user to tune the system to fit the local situation and provide a good tool to assist the user in understanding his own data files and problems with address standards and abbreviations. Expanding the "ADMATCH User's Manual" documentation and ADMATCH capabilities, particularly as an operations tool (in contrast to being primarily a planning and research tool), are areas in which I feel the Bureau of the Census and other Federal agencies could productively expend some effort. Mr. Meyer has mentioned development of the UNIMATCH program. We will follow the progress of this program with interest.

4--Geographic Reference Standards and Controls

Perhaps one of the most important results from all of our work with ADMATCH and our reference files is that it is forcing a review of county standards and controls (or the lack of them), and it is providing us with some of the critical tools necessary to develop more adequate standards and guidelines.

Santa Clara County's problems are probably similar to those in many other areas. Existing standards are generally inadequate and pose conflicts with common practices. Developing new standards will not be easy nor done quickly. Working with ADMATCH has helped identify many conflicting abbreviations, differences in meaning and common usage of city and place names, ZIP codes, and other problems.
It has also helped us measure the level of potential problems more adequately, and helped develop some new approaches to standards and guidelines that may be more effective. Unfortunately, the complexities and depth of this subject will not allow any detailed discussion in this paper.

A rough outline has been prepared for review of our current street name and address standards in Santa Clara County with some suggestions as to how we might approach the problems. This outline is being reviewed informally now, and we hope to come up with more specific proposals for discussion in the near future. The URISA (Urban and Regional Information Systems Association) Special Interest Group on Data Standards has expressed an interest in these problems and will probably be doing some work on them.

For the benefit of the Bureau of the Census I would like to comment on the limitations of "blocks" as a geographic control unit in the DIME file. A block is, at best, a difficult unit to define and work with in many situations. As the DIME files are extended to the more rural areas, and as cul-de-sacs, planned unit developments, freeways, linear parks, open space and other imaginative urban designs become more common, the "block" will become less and less meaningful as a geographic control unit. Santa Clara County has used and maintained "inventory blocks" for land use information for a number of years. Recently, in comparing the 1970 census results against the counts from a special census done for Santa Clara County in 1966, it was learned that the conversion table we used to translate the 1966 results into 1970 tracts failed to recognize a series of block splits occurring between 1966 and 1967. Searching through a series of as many as eight sets of maps and records to locate and check out each of about 40 blocks which changed during that year is still a fresh memory.
That room full of maps and historical documents is an ever present reminder of the work and problems associated with maintaining a history of block numbers over any period of time.

Personally, I believe that future geographic base files should concentrate on maintaining controls in the form of nodes, street and boundary segments and grid coordinates. These units are generally more precise and easily defined in terms which can be more easily controlled. Blocks may still play an important role, but they may become more like the "least common records" created from our PLI file: a temporary geographic unit which is derived from the nodes, line segments and other geographic control units put into the file.

Summary Comments and Suggestions

To develop and maintain an effective system of geographic references required to meet the needs of local governmental operations, it will be necessary to maintain:

1. A good base mapping program to provide primary geographic controls,
2. A DIME file to control geographic references in computer-readable form at the street and boundary network level, and
3. A parcel-situs address oriented file such as our Property Location Index to control and maintain detailed geographic units which lie within the structure described by the DIME file such as parcels, house numbers, buildings, etc.

Implementation of an adequate maintenance program requires the development of an organizational structure and coordination procedures capable of achieving wide involvement and utilization of base mapping, file editing and updating, systems analysis, and other skills and resources scattered among many different agencies.

Adequate geographic reference standards, guidelines, and controls are essential to make geographic base files effective. This is a matter which should command the early attention of all Federal and local agencies and key professional organizations, such as URISA, which are interested in geographic oriented information systems.
Question Period

Dr. Stevens--You mentioned that the police are in need of grid coordinates which identify where a crime is committed. What do they do with this information once they get it? How do they use it? Why grid coordinates rather than simply statistical reporting areas or something else?

Mr. Renshaw--First, for operational purposes, the common statistical areas like the census tract have relatively little meaning. Second, the more detailed units like police beats are subject to constant change. The pilot program also recognized that if they start coding their records by police beats only, they may be stuck with them, and have to spend a lot of time trying to dig themselves out of a box they built themselves. By using coordinates, they have the capability to link back to police beats, census tracts; or by using ADMATCH, they could carry all of these codes plus tax jurisdictions. Coordinates provide a convenient tool to display information by block face or whatever other unit they want. Some of the most impressive things that we have done so far have been with detailed listings of basic files, that is, where we sort and list them at a map and street address level. Users can look at these records organized in a more familiar perspective. They have the unit record information, and they are able to manipulate and handle these files more adequately.

Mr. Silver--Are the files you have been developing being used in the Bay Area Transportation Study program?

Mr. Renshaw--Unfortunately, I am not in a position to adequately speak on that subject. The Bay Area Transportation Study program is in a state of reorganization. I really do not know what they are doing at the Bay Area level. In Santa Clara County we have a local transportation planning effort. The current activity is limited to the transportation zones which reflect 1960 census tracts. Until we get our DIME file, we will not have the capability of updating these zones.
Consequently, not a great deal of activity is being done in terms of relating these current geographic tools to our transportation planning system. I hope that changes, but until we make this conversion, there is not a great deal we can do that would be productive.

Mr. Fairbanks--Are you denoting land use in your Property Location Index, or is this strictly a geographic identifier?

Mr. Renshaw--During the 1960's, Santa Clara County performed three land use surveys. Hopefully, as our new property system develops, we will have better access to land use information on a continuing basis through our Assessor's file. We have done some work with commercial establishments where there are tax records for unsecured business property. However, right now we are still very severely handicapped by the fact that our assessment files are on a "1401" computer system. When our new property system becomes available, we expect to use it for land use information.

Mr. Pierce--I think it might be useful, for me at least, if you could enlarge upon your idea that there was a problem with standardization, particularly in the area of geographic base indices. Let me expand on the question because we have had for many years some accepted standards, such as latitude and longitude, State plane coordinates, etc., although they are not necessarily being used. We do have vast numbers of people that are concerned with the identification of land, including survey people. I would hate to think that by any stretch of the imagination we are not including them whole-heartedly in any attempt to develop management systems which require a base, such as the geographic base file.

Mr. Renshaw--You started out with coordinates. Coordinates are probably the best geographic control that we do have. Although there are other problems, the biggest problem with coordinates is that they are not adequately being used. Many users claim, with some justification, that they are not prepared to use them.
Their data files are not in condition to allow them to make effective use of them. In Santa Clara County this has been one of our key problems. We have had coordinates and we have had applications that could use them, but it is difficult to get the software, the file information and the applications together at the right time at the right place, with the right resources.

There are greater problems, however, with some other geographic reference controls. We talk about cities, place names, ZIP codes, etc. The definitions and usage of some of these units are not always precise. We really have not adequately identified these conflicting terms, nor have we developed guidelines for their proper usage. For example, "San Jose" can mean different things depending upon the context in which it is used. Unfortunately, the meaning of "San Jose" is not always clear. This outline for reviewing our county standards goes into some of these problems.

Abbreviations and address patterns are additional examples of standards problems. How should we abbreviate "boulevard?" Some people abbreviate boulevard, "BD" or "BLD." Others use "BD" or "BLD" to mean "building." One of our major local problems is the common abbreviation "CR" to mean circle. Our county standards say "CR" means creek. Consequently, we have a major problem with conflicting usage of this abbreviation.

Mr. Pierce--Well, let me enlarge upon the question because I was really getting to the following point. Do you feel that we lack standards, or is it that we lack local as well as general acceptance, adoption and usage of the existing standards?

Mr. Renshaw--Yes.

Mr. Pierce--Which? Do we lack standards?

Mr. Renshaw--Both. In the case of geographic coordinates, lack of standards is the least problem. The problem there is: What level of accuracy is required for a particular application? For example, are parcel centroids coded within plus or minus 10 feet adequate?
Or do you need parcel boundaries measured within hundredths of a foot?

Mr. Pierce--I do not want to create an argument, but I was concerned about this because throughout these discussions we were talking about a hierarchy of things, a hierarchy of usage. The standard is with us, a Federally accepted standard, regardless of its adaptation or use. Whether it is used by the engineer for a high order of accuracy on a 1"=100' scale map, or whether it is used by a statistician at lower accuracy on a smaller scale map, is apart and separate from the idea that there is no adopted standard.

Mr. Cooke--I was particularly struck by this last paragraph in your presentation, saying that adequate geographic standards, guidelines, and controls are essential to make geographic base files effective. I am particularly struck by this because the comments are aimed toward professional organizations, such as URISA, which are interested in geographically oriented information systems, and I am a chairman of the Special Interest Group on Geographic Base Files at URISA. This takes me right back to three years ago in St. Louis, to a paper I gave at the 1968 URISA meeting, where I was struck by this same type of comment. My reaction was to completely rewrite what I was going to say at St. Louis into a paper which I was not terribly happy with at the time. Basically what I said in that paper was that as a data processing professional, I cannot relate to a statement such as this. I cannot define geographic reference standards outside of the context of the specific problem. If you give me any specific problem involving any of these geographic reference standards, guidelines, or controls, I can give you an answer. Whether or not someone abbreviates boulevard, BLVD or BD is totally immaterial outside of the context of the specific problem. It only becomes a material problem in a specific situation, such as address-matching a file where these two abbreviations appear.
Then you can identify the problem right down to the point. Does he have the required entries in his "type" table in ADMATCH or SAMS, or whatever he is using? If he does not, then the solution is clear: Put them in the street "type" table, and you have solved the problem. I think we could talk back and forth for a long time on this subject and never get anywhere. I do not believe you can get an answer outside of the context of the specific problems. I think when you do identify a specific problem, it becomes very simple to answer almost every one of these questions.

Mr. Pierce--Well, this is what I was getting at. It seems to be a local problem rather than a lack of standards. Another example is the struggle with the problem of employment data. We have a variety of codes used. We have SIC codes, we have DOT codes and a number of others which are used for other purposes by agencies other than the Department of Labor and the Department of Transportation; yet we do have these standard codes. They are not always being used, but we do have standards. I would submit in the case of street identification, the Bureau of the Census does have a set of standards which are nationally accepted. Maybe they are not used; maybe they are not recognized; but there are standards. Our problem is that if we want to make a system work, we are going to have to get the local people, at the operational level, to use these standards rather than try to reinvent some wheels.

Mr. Cooke--You should be talking about making a particular system work, not about a general definition of guidelines. When you talk about making a given system work, the problem becomes very clear.

Mr. Renshaw--Well, I agree with you in many respects. However, I think my point of departure would be in the adequacy of existing standards.

Mr. Cooke--Do you have any specific examples?

Mr. Renshaw--Yes, the outline for a review of our county standards goes into quite a few specifics.
It goes into address components and abbreviations which must be recognized and handled within the context of local systems. Keypunch operators and clerks must interpret the meaning of terms and abbreviations used in source documents before they can put them into the information system. We do not have adequate standards or guidelines to tell them how to interpret the addresses they receive. Most of our address problems occur while interpreting and translating information from source documents to our file. The big weakness of current standards is that they were designed primarily for data processing and some specialized functions. When we come to the more general needs of a title company or a bank generating a transaction for a piece of property, our current standards do not provide the assistance they need to write an address on a document that will eventually enter our information system. Address components, abbreviation standards, city post office and ZIP codes, sorting and address matching problems, map maintenance procedures, etc.--these are just a few examples of where we have specific problems. We could spend a long time discussing the details of each of these; however, we do not have adequate time, nor is this a proper place for a detailed discussion of this subject.

I want to thank Mr. Cooke for his straightforward reaction to my suggestions. If this has succeeded in raising an issue concerning the adequacy of geographic reference standards, it will be an important accomplishment and a first step toward productive discussions on the subject. I also want to challenge Mr. Cooke and his Special Interest Group on Geographic Base File Developments to help sponsor consideration of this issue through URISA, since it is perhaps a better place to discuss specific questions on the adequacy of geographic reference standards.

GENERAL DISCUSSION

Dr.
Stevens--As I indicated earlier, this is one of the most important portions of the conference in view of the fact that we would like to give everyone a chance to make some comments, suggestions or raise questions that come to their mind. Each of you should speak from your own perspective, your job and your organization with regard to what has been discussed at this conference and what we have outlined as the major purpose in regard to communication among users, potential users, past users, those with experiences and those that will be involved in future developments. This will give each of you a good opportunity to make your views known regardless of whether you have specific detailed knowledge about some of these systems. I would also encourage you to comment from your "organizational" perspective so that we can have the advantage of this as well.

Mr. Amsterdam--I would say that the most important topic that these conferences should focus on is application of the GBF's, particularly in terms of their pay-off. What are the localities going to do with the GBF's? Are those uses going to be cost justified? Are the localities going to be convinced at all? In the final analysis, are the mayor, the budget commissioner, or the county executives going to be persuaded that the GBF's are worth maintaining and that they can be used?

From our experience in New York to date, we expect that our Master Building File will be able to pay for itself. There will be a lot of savings from having more events and plans documented for each building in a form that can be communicated from one agency to another, such as plans to condemn and take over a property, or information on activities of private agencies affecting a specific parcel. At the moment, I would say that a building file can be justified. The justification for a GBF, it seems to me, is much harder to put a dollars and cents value on when you consider each application.
Ideally, the GBF encourages efficiency; it encourages better communication; it is useful in many different respects. However, to put a precise cost value on its worth to a single department, to say that it is of value to them in their operation and that there are savings to be made out of the GBF, this I think is the case that the local people who work with these files are going to have to come up with.

Mr. Johnston--There are two areas that have not been touched upon, one which I think has immediate application, and a second which has farther reaching application. First of all, I think that one of the major problems in most planning studies is getting a grip on migration trends. I have a feeling that the use of geocoding is going to be a major breakthrough in this area, either through the Department of Motor Vehicle files or utility company files. I would be anxious to see some documentation along these lines. The other phase is the use of geocoding in the evaluation part of a PPBS system. Now this is a bit more long range, and it obviously cannot be done tomorrow, but I think that this is a very important aspect of geocoding.

Mr. Snyder--I think the biggest question I have is that most of the people at this conference appear to be in program development as opposed to actual use of the GBF. Unless I am totally wrong, the Census Bureau was supposed to be through with this project sometime in 1973. I really do not know how all of us are going to get it put back together and functioning. I have no idea how it is going to happen, and we are all going our separate ways and reinventing the wheel, as we have said so many times. What can we do about getting this thing put together, and who is going to take the lead? There are so many agencies involved; it can be COG, State, or it can be some Federal agencies. It certainly has problems in local agencies. How do we achieve some interest from other Federal agencies with funds, or who can we look to for help to get it done?
We are using a lot of rhetoric in telling what we are going to do. This really concerns me, since I do not see a product yet.

Dr. Stevens--Do you see a great deal of interest at the local level?

Mr. Snyder--We are having a lot of problems with that right now because of the good selling job we did in 1968 and 1970. We were not able to deliver then and we still are not able to deliver. Frankly, possibilities of funding at the local level are slim to none. We have a lot of interest. We see a lot of possible uses. We would be thrilled to death to have something that worked.

Mr. Shannon--We have not used the machine-readable GBF's, that is, the DIME files at all. Ours has not been available. I am not even sure if it is available yet. The Houston Data Processing Department has plans to use it in a police statistics system. I am concerned with two things. One is what I think could be adequately described as a total ignorance on the part of local people in our area as to the very existence of such devices as geographic base files.

Dr. Stevens--Does the Houston-Galveston Area Council have any active programs for educating them, or participating with the cities to update or maintain interest, or to use the files? Do you see this as a legitimate role for the Council of Governments or for similar regional organizations?

Mr. Shannon--I think that we represent the only organization that could carry out that kind of a role. Outside of sponsoring a Census Users Conference eight or nine months ago to familiarize the users in the region with the data items that would be available on various summary tapes, we just have not been active in the area of trying to stimulate local interest. We are not, I would say, too knowledgeable about DIME files ourselves, mostly because we have not had the use of the file. The Houston-Galveston Area Council (HGAC) and the Houston Chamber of Commerce have used the printed ACG quite a bit.
I think the uses for GBF's will grow, particularly as some of our more urban counties begin to move into automated data processing activities to a much larger extent than at present.

The other problem I am concerned with is updates. As we have mentioned, there are essentially no local funds for updating the base maps. We have an extremely active region in terms of the opening of new streets and new subdivisions, and we just cannot keep track of them. No one knows where they all are. There are no local funds available that I know of to provide for continual update of the system, although the post office would be a valuable resource. I think that we are going to have to fall back on the utility companies in the region. They are really the only people that know everything that is going on because of their service areas. They have to know because they have to be able to collect the "use fees" for the meters. We just do not have the finances or the manpower to carry this through.

Mr. Lombard--I must say that I really enjoyed sitting in on this meeting. We are at a different level. We are not actually users of these programs, that is, the Geographic Base Files. We are involved in the urban transportation planning studies through the State Highway Departments. Because of this, we are interested in the Address Coding Guide program that was proposed by the Bureau of the Census, later the Geographic Base or DIME file, and also the standard package that the Federal Highway Administration has contracted with the Bureau of the Census to develop census information for the transportation studies at the transportation zone level. We are very happy to be here and see what some of the other uses are and what are some of the problems.
Right now the transportation studies in this region (which will cover five States in just a matter of a month, as we are going to pick up New Mexico) are in the continuing phase, and our problem now is maintaining a good data base for the continuing process. When the Bureau of the Census first came out with their programs we were very enthusiastic and felt that these would be very good tools, both the Address Coding Guide and the DIME files. I guess we still feel they are useful tools. We feel particularly the Address Coding Guide with ADMATCH, which can relate secondary data sources by address into the zone level, will be useful to the transportation studies. We are disenchanted with the Bureau of the Census over the slowness of getting our standard package. Maybe we had some misunderstanding over when it would be available. We are very much disenchanted with just now getting these tools back out and making them available for use. We are still talking about a lot of development in them, but we really have not seen many areas that have gotten into the actual application of them.

As far as our continuing support, I think we feel they are good tools, particularly the Address Coding Guide and ADMATCH. I think we will have to look at what the local areas are doing with them before we can enthusiastically encourage the State Highway Departments to back and fund maintenance of the Geographic Base File System with HPR funds. I think we need to see what the local people are going to do with them. One of the problems in continuing surveillance is just getting the secondary sources of information located and getting a good ongoing surveillance program underway. I think this has come out in many of the discussions here. Take the problem of getting employment data. Where do we go to find this data? Where should we be looking?

We are talking about building a file system here.
We have talked a lot about the DIME files and the Address Coding Guide which are really file systems. If we cannot find the secondary data sources to cheaply put information in it, then we are just wasting our time building file cabinets with nothing to go in them. We are looking forward to getting some of these secondary data sources identified and using the ADMATCH program with the Address Coding Guide to relate these data back to the zonal levels, levels that we can use in transportation studies. In that respect I think maybe we need to look at the existing tools that are available. Do not get me wrong. We are all for the continuing development of tools, but we still have the problem of the continuing transportation planning process that needs to respond to local questions as far as transportation goes. We have to use existing tools until the more sophisticated tools are fully developed. I feel that this is one area we need to address ourselves; that is, just getting the practical application going, maybe on a little less sophisticated level, until we can build into it more sophistication. Therefore, I feel right now, for the transportation study, the biggest use will be made of ADMATCH with secondary data sources.

Dr. Stevens--Do you see the Federal Highway Administration as a primary user?

Mr. Lombard--No. The Federal Highway Administration is not a user, but we are deeply interested in the transportation studies which are the direct users. The structure of these studies varies from study to study, and therefore, which group within the study is the actual user will also vary.

Dr. Stevens--Do you see your agency in a potential role as encouraging some standardization among regional studies in the use of some of these tools, that, for instance, might be used in North Central Texas but might not be used in other regional studies?

Mr.
Lombard--I think that at our level one of our roles is to see that if there is a good tool available and being used in one area that we encourage other areas to use it, if it is applicable. I think that each area has its own specific problems, and sometimes a system used in one study is not applicable to another area. Certainly, I am concerned about different areas doing the same thing in development work, repeating what some other area has done. I do not like to see another area doing development work when there is an existing package available which can be used.

Mr. Downey--Just one additional comment in line with what Mr. Lombard said. We in the past have encouraged the transportation studies to utilize the Geographic Base File System. I think from what we have heard today that we may have to fall back a little now. It seems that there are so many problems in updating and maintaining the DIME files and that the costs are so great in doing this, that a transportation study alone will not be able to handle this task. We will probably encourage the studies to use the Address Coding Guide and ADMATCH in the continuing phase of the transportation study. I see many uses for the more sophisticated DIME files for all agencies of city government. If other agencies are willing to help fund the cost of maintaining and updating these files, I am sure that the transportation studies will participate and use them. The studies will not be able to carry the entire burden.

Dr. Stevens--Do you mean funding or the actual job of development?

Mr. Downey--Both, funding and doing the work. In my estimation, there is not sufficient staff available to do the work.

Dr. Stevens--I see a little bit of pessimism, possibly as a result of some of the things that have been said at the conference. Do you have any positive or optimistic views that might have come out?

Mr. Downey--I really am optimistic about the use of the Geographic Base File, but perhaps, not right now.
The task of just getting the errors corrected and keeping them current is apparently very significant. It is going to take an effort from all agencies at the municipal level, not just one agency, such as the agency doing transportation planning.

Dr. Stevens--Undoubtedly it will require a tremendous cooperative effort among all of the potential users and developers of the system. Do you see your organization as one of these that would be very much involved in structuring and building or participating at some level even though it may not be funding?

Mr. Downey--I think that the transportation studies, which are local agencies that are responsible for transportation planning within their areas, would be very much interested in maintaining the files.

Dr. Stevens--Shouldn't they be participating? Shouldn't they be involved in whatever discussions or organizational arrangements that are developed for the continuing efforts, that must go on in order to eventually develop a tool that is not only useful to them but also to you?

Mr. Downey--Oh yes. I think all agencies at the local level should be involved, but one agency will have to take the lead and be responsible for tying everything together. I am not sure what agency that should be within any given city.

Mr. Carter--I work for the Comprehensive Planning Assistance Program, which you might know as "701." We are interested, particularly, in knowing how the Geographic Base File System can be used to strengthen the position of a mayor or a chief executive of a city or county. In other words, I would go back to repeating Mr. Lombard's comments about application.
I would like to see more discussion about actual application of the files providing the type of information to a mayor and his immediate staff that they need to be able to compare things like transportation systems, water and sewer systems, or housing, the types of information needed within a city so that the mayor can make decisions on where to spend his money and where to direct his programs in the future. I do not know exactly how close we are getting to this point, so I would leave it with that question.

Dr. Stevens--Are you saying you have not received answers as to the "use" and "decision-making" capability of these files or at least, not enough to satisfy you that the potential of the Geographic Base File System is being realized?

Mr. Carter--A couple of people got into the subject. Mr. Hanssen was talking about an integrated information system, but he seemed to separate user programs up at the top from the DIME file. It seems to me that they should be more closely related and more directed toward the decision-making process.

Mr. Bieda--HUD is rather new at undertaking a program that is directly related to the type of information we are discussing. One thing that I see that is a common thread through the discussions of these last two days, has been what is loosely termed "an efficient allocation of resources." In other words, we all do not want to reinvent the wheel. This is also where we stand in the Program Planning Technology Staff in the Office of the Regional Administrator, Region VI. As we see it, our function at the Regional Office is first to determine the need for HUD assistance in our area of concern, in the five-State area, and then secondly, to monitor and effectively evaluate the response to that need by our field office, the area and insuring offices. As we do this, many of the program areas that we discuss require geo-location applications at various resolution levels.
Now I also wear another hat and that is with the Information Task Force created by the Southwest Federal Regional Council which includes HEW, HUD, DOL, DOT and OEO. We are interested in a common thread basis whereby we do not have to build up the same kind of learning curves for obtaining this information. You may be aware that in 1968 we came out with a regional information systems handbook, a compendium of what was going on in 1968. Here we are in 1971, and many of us are exploring areas that involve committing extensive resources of time and monies. Perhaps when we talk of working funds to get financial assistance to update DIME at the local level, and other types of efforts that are being done by many of you, if a compendium were prepared, as I suggested, it would possibly assist in obtaining these monies. Now I cannot speak directly as to who is going to take the lead, but it looks like from where I sit on this task force that the Office of Management and Budget might be in a pretty good position. Again, I cannot speak for them directly. Essentially, as I see it, and it has been addressed by several persons who have spoken so far, we need to put it all together, and to have to some degree a rather comprehensive listing or compendium of what is occurring.

I think I can speak for many of the agencies. More and more of our programs require geo-location information, the particular item of these meetings. Let me give you a specific example, again with HUD; when we build our housing, we need to know now, as it is developing through program criteria and at a high resolution level, that housing is within our 2.6 million units that we crank out every year before we can build a new area of housing. Essentially, my request then is for a listing or finding out what you all are doing. At least this becomes the catalyst for developing not only what I think HUD's needs will be, but from where I sit on this task force, what their needs will be at that level.

Dr.
Stevens--Mr. Bieda, I want to ask you another question. Do you see the need and also the potential for Federal agencies agreeing on developing some consensus on a common program that they can stimulate from their own organizational programs? For instance, we have been talking about transportation, and now you have mentioned housing. Here we are talking about some similar programs that have a common need which should stimulate common Federal programs or policies in the local areas.

Mr. Bieda--Well, as you know, the President has proposed two or three organization plans that will probably result in bringing the various programs of Federal agencies under one agency. Let me give you an example. I have been told that there are something like seven different Federal agencies handling the water and sewer programs. There is a very strong consensus to very efficiently allocate the HUD resources to one channel. In other words, to preclude the type of scattering of funds to accomplish the same objective. This is where I see the compendium coming in and addressing the program area that you are directly concerned with. For example, Mr. Lombard and I were discussing how Mr. William Yerkes, of our office, on an informal basis, sits in on their particular meetings to see that he does not give a grant through the 701 Administration for something that Transportation has already commissioned.

Mr. Amsterdam--I just want to make a comment on some of the statements that I made earlier. I would like to clarify my remarks as to the potential use of the GBF's. I think I share the view of all the people that are using these files. There is nothing coming down the road that is going to be a more efficient vehicle for all the municipal information systems and for local planning than the basic GBF concepts. The wheel may not be smooth enough and maybe we have not figured out how to keep it lubricated, but I know that no one is going to invent a better wheel.
The GBF is the vehicle to get us all going.

Mrs. Kaplan--I want to say that the meeting has been extremely useful in expanding the imagination of some of us who are thinking about utilization of the Geographic Base File System. After hearing about some of the uses and innovative approaches to Geographic Base Files that have been developed, it occurs to me that there is a multiplier effect operating here. In other words, the more that the Geographic Base Files are used, the more new possibilities occur to us for their effective use. Users gain so much in experience and familiarity with the files by using them that their capability for developing additional applications for the GBF's is considerably strengthened in the process. As for the role of the Councils of Governments in relation to the development, use and maintenance of the Geographic Base File System, there are at least two main areas in which they can be effective. In the first place, COG's can be direct users of Geographic Base Files. For example, the North Central Texas Council of Governments is already putting them to use in a substantive way in our Regional Transportation Study. Another function which can be performed by COG's is that of coordination. This involves the administration and organization of efforts at maintenance and updating and the dissemination of information about the potential uses and applications of the Geographic Base File System. Councils of Governments can also help to build political and financial support for their use and maintenance. In this connection, the North Central Texas Council of Governments is trying to bring together the cities of Dallas and Fort Worth for the purpose of organizing a maintenance effort in our region.
We believe that the DIME file is just one of the elements in a regional information system that will have to be funded on a continuing basis, whether at the Federal level or through some local means, if we are to build and maintain the kind of regional information system necessary to support the number and level of planning activities underway in this region.

Mr. Dunagan--I can see some uses of the Geographic Base File System of which we might take advantage. Population and migration studies would be of value to a health department. There was a little discussion on this subject. We had some discussion yesterday on the use of file data regarding births, deaths and immunizations. I do not know if such use would be practical at local levels. Also, I do not know just how the GBF's would be coordinated or how they would be made available at the local level. I think this is something that needs more study so that better ideas and concepts could be developed in their use from a health department's standpoint.

Dr. Moore--I might make a few general comments where I think I can speak for the Health Department on this issue. First of all, Mr. Dunagan hinted the Health Department is in a state of ignorance on the GBF system. I must say that this conference has been quite enlightening to Mr. Dunagan and myself. I can see some possible direct uses in our Health Department in such fields as vital statistics, communicable diseases, environmental health services and also in administration. Specifically, we would be interested in such things as the geographic distribution of disease incidence, immunization caseloads and other health problems for the purpose of relocating health centers, relocating boundaries of nursing districts and helping in the projection of where the greatest workload is for our current nursing force. I can see we may have an application in our planning and budget requirements to attain a certain performance level in community health services.
I personally would be interested in promoting research and analysis of our own along these lines using the GBF's, but doing this in coordination with those other agencies in the city of Dallas who are doing work of their own. I am a strong believer in coordination and I do not like to duplicate work. Also, I am not sure of the level of sophistication with which this research and analysis work can be done. This remains to be seen. I might close by saying that the Dallas City Health Department will maintain an interest in the progress of the GBF for Dallas and for the other major cities.

Mr. McCann--I think it might be good for us to look at where we are. Our agency helped to finance the State Highway Department with something like $70,000 to develop the DIME system in many of the Texas cities. Essentially, our job is a twofold job. We administer the Federal Aid to Highways, and we also try to encourage the use of good practices and new tools. I think with data processing we have an oversell to a certain extent. I think that this is also true in the case of DIME files for our purposes. I personally feel that the cities will have to be much more involved in this than the Federal government or State Highway Department. I know the State Highway Department has an excellent data processing shop and has an excellent capability. From what we have heard in this session, this data processing capability is mandatory to maintain the DIME system. We would like for them to use new tools for land use, population, or in any way it will help us in the highway program. The maintenance of local DIME files, however, does not fall within the work activities of the State Highway Department, and without this maintenance, the $70,000 spent thus far by the State will yield little benefit. The State Highway Department will be, at best, a peripheral user of local DIME files and should share the DIME cost accordingly.
I would hesitate to encourage the staff in further involvement with DIME until local areas show considerably more interest in using the DIME system. From what I understand, few cities in Texas have much interest in DIME or the data processing capability to handle DIME.

Mr. Crellin--There is one point which I think should be emphasized. We should separate the functions of correcting the files upon receipt and that of the continuous update and maintenance of the files which reflect such changes as new subdivisions, street extensions, and annexations. I am very concerned that local participating agencies will be very discouraged with the condition of the file when it is received. In part this might be due to an oversell of the capabilities of the files. I believe the local agencies will be overwhelmed by the clerical and computer processing effort still required to get the file to some acceptable level of correctness. I think this effort will require another sales job to local administrators. I do not believe, however, that the continuing update will be as expensive or time consuming as expected. The SCRIS project in Los Angeles was to design a system to update the Los Angeles DIME file. At this point in time, we have updated the MMS maps and keypunched the corrections. We are in the process of writing and testing the UPDIME programs. The update was a large effort; however, I think the effort required to correct those residual errors in the file will also be a larger effort. I think that local agencies must look at these two processes as separate problems. One problem is to bring the file up to some acceptable level of accuracy and the second is to update the file. As a separate point, I was very interested in the comments on health planning. At SCRIS we are very heavily involved in technical assistance to the health-related agencies.
Much of this involvement is related to the loss of a major facility, the Olive View Hospital, in the February earthquake in Los Angeles. The purpose of the study is to determine whether or not a new facility is required, and if so, what service should it provide. In conjunction with this project we have assembled data on drug abuse, probation, health, reportable disease, and census data both for 1960 and 1970. This data was summarized in fact sheets which were used for public hearings. The information will also be included in a final report to the Board of Supervisors. I mention this because it is an example of combining local and census files which will be used in decision making.

Mr. Treichel--I am glad to be sitting next to these two fellows. They admitted to not knowing much about it, and I do not either. There is an old saying in the Pentagon that goes, "When you are up to your hips in alligators, it is a little difficult to remember that the objective was to drain the swamp." I feel a little bit like that, listening to some of these comments on the local systems that are being developed. From the Federal viewpoint, to develop a program, a service program if you will, which will benefit the local community in terms of its emergency planning capability - that is, increasing its capability to deal with a disaster - we run across these many different size wheels that you have to fit to the same axle. I am glad in a sense to see that the programs in which most of you are using geographic base files are rather simple and do not involve determining what size axle to use to fit all these wheels on. I hope that OCD can continue to expand the use of the GBF's in other programs which might involve these local files, but I can foresee a horrendous problem with a national program in dealing with a few of the 200 entities and their various levels of sophistication. You may go into one city and it may cost you nothing to adapt your program, or next to nothing.
In the next city you may have to start from scratch and do the whole thing over again. As loud as the protestation, "We do not want to reinvent the wheel," has been, I have seen plenty of wheels. That is the only problem I see.

Mr. Carbaugh--In listening to the comments that have been made about the GBF's as well as other geographic reference systems, such as New York City's GIST or San Jose's Property Location Index, there appear to be at least two distinct user groups. One group, armed with a long list of potential uses for the file's coordinate information, awaits the release of the Bureau's DIME files; the other group, with similar expectations, has been gaining experience through the use of the Bureau's ACG or another locally prepared reference file, in dealing with the problems of address matching and geocoding. The major value of the GBF lies in its capability for relating previously unrelated data files by providing a common identifier, a geographic location. This is not to say that the coordinate information is merely icing on the cake; on the contrary, the coordinates can be used for area and centroid calculations or for input to graphics programs or allocation models. However, it is the street identification (name and address) that provides the key to the file's geographic information. It is for this reason that I think those users who have had experience with the various address matching and geocoding techniques and have assessed the limitations as well as the potential of these geographic reference files will be much better prepared, and I suspect, more willing to participate in a program for updating and maintaining these files.

Mr. Turnock--I have an observation that has been causing me some problems. It seems to me that we have a gap, as usual, between the state of the arts, the GBF's and its potential uses, and what is actually in existence in the various projects that have been implemented.
We are still in the demonstration phase and we need to take the step from the demonstration phase to an operational phase. This is going to involve a strong commitment to take the steps necessary to insure the integrity of GBF's applications. The steps must be taken in order that people, engineers in particular, can be satisfied with the GBF on a day to day operational basis. A pragmatic example that I do not think anybody has mentioned in the conference so far is the mapping problem. Engineering and planning functions are, in practice, very strongly tied to map work and maps. Appropriate quality standards, updating mechanisms and correction mechanisms must be established and implemented for the graphic aspects of the GBF before it will be operational, particularly the DIME files. In this area particularly, the Texas Highway Department and COG's public transportation study have been working in close coordination with one another in the use of the DIME file for the spatial allocation of socioeconomic data to selected areal units. Mapping problems, however, enter into the construction of the correspondence function between various jurisdictional units and unit analysis; the maps are substandard at this time. They are not the quality that is desired for extensive use for work in transportation planning. The Highway Department, whenever it does do a study, creates an individual DIME type of system with highway networks used for the transportation model. It would be good if we could get a file established in the area with current enough mapping standards and data that we could superimpose the Highway Department's DIME "network" system onto the system of a regional DIME network. At this time this is impossible. The maps are not of sufficient quality. I think that it is imperative that the various levels and jurisdictions get together to make the commitment of manpower and resources to move from the demonstration phase into the operational phase.

Mr.
Fairbanks--When listening to the comments of the representatives from FHWA and from HUD, I was somewhat disappointed by some of the statements regarding the need for a stronger look at the financing of the activities of local agencies when using the information-management tool provided by the GBF's. Many local planning agencies participated in the Address Coding Guide Improvement Program as a result of encouragement from either the FHWA or HUD, or both. Our activities include transportation planning, regional environmental planning, and metropolitan clearinghouse reviews, as well as several projects such as our Urban Mass Transit Study. These functions must be done within the guidelines established by the Federal agencies involved. Our applications for funding must outline what planning information we will acquire and how we will assemble and update it. In the past, we have indicated that the Address Coding Guides, now expanded to DIME files, would be the basis for controlling data acquired within our urbanized areas. We have also indicated our need to expand these files to include our entire planning area. Our proposed Operations Plan for Transportation Planning outlines our intended use of these files within our surveillance work elements. This use will require the maintenance, updating and expansion of a usable, reliable DIME file. In view of the adaptability of the Geographic Base File System to information from multiple planning functions, as well as the money already invested in generating them, I feel the GBF is now my agency's most practical method for interrelating data from our various planning efforts. Our planning activities must reflect the needs and goals of more than 175 units of local government within the nine counties that make up our Ohio-Kentucky-Indiana region.
For this reason alone, I would definitely agree that, at a local level, there has to be a central agency responsible for developing, maintaining and using the files. Since our GBF's should be available soon, we have been putting more effort into bringing local agencies to more completely utilize the capabilities of the files. This leads to some unique and serious funding problems, not only in using the Geographic Base Files, but in using any of our data files. Is the use intended for transportation planning, HUD planning, or strictly the benefit of the local agencies involved? Dr. Lundberg, in his presentation, outlined the development of several address-oriented systems such as the reporting of building-permit activity and the housing information system. However, his Institute's research in these areas has not had the acceptance that was anticipated. Both Hamilton County's and Cincinnati's Planning Commissions meet their data processing needs through the services of the Hamilton County Regional Computer Center. Another consideration is that the basic unit for data aggregation might not meet their planning needs. My agency, OKI, also uses the services of the Regional Computer Center. Our use of data files and information systems such as those developed by Dr. Lundberg and his staff is tempered by our funding limitations. Our use of information of this nature must be based on the requirements of our transportation and HUD planning programs. The establishment of reliable information systems requires both time and money. Since these systems can be used jointly for transportation, HUD, and local planning efforts, it is difficult to split the development cost and do one phase as a transportation planning element, another as a HUD activity, and finally, isolate still another portion as having only local benefit. This is a major problem area in building and using the Geographic Base Files: Where is the money coming from?
Without funds, we do not have people, and without people to do the necessary edits and coding, we are pretty well restricted in our efforts. I, personally, would like to see some kind of a joint "pooling" of funds so that agencies such as ours, when developing and utilizing information management systems, can use highway and HUD planning funds simultaneously.

Mr. Sylvestre--LEAA is not an operating agency with a direct interest in these types of programs, but we provide funding to agencies that do have a direct interest. In fact, I think we are the only growth industry in Washington with such funds. There is some work going on now in local police departments in using or introducing the use of the Geographic Base Files or similar geographic systems. You have heard talk about hard sell. There is some hard selling going on - not by the Census Bureau but by other groups that are promising great things. LEAA has also had a reorganization in which we have put more emphasis on the authority of regions in approving grants and programs and also in their ability to provide technical assistance to the States in these regions. We think that as our regions increase in staff, this problem of getting the word out that Mr. Meyer mentioned in his talk yesterday, is going to take place. We will see that our system analysts, who will now be out in the regional offices, know about this program, and that when people talk to them about their geographic file that they consider the application of DIME rather than other systems that might be used. When the representative of the Geography Division first talked to me about this problem some months ago, I had the impression that maybe police departments were the only problem and that except for police chiefs, just about everybody was using DIME. I thought our only problem was suggesting to the police chiefs that they conform to what other municipal agencies were doing, but we really cannot do that.
It seems that there are a lot of other files being developed, partly because of the slowness of getting DIME out. It is one thing to get the word out that LEAA funds, when applied to development of geographic systems for handling law enforcement data, should be applied to the use, expansion and improvement of the Census GBF; and that this should be done because it is the most efficient system and will provide the greatest utility to the local users. We will not be very persuasive, however, until the materials, that is, maps, printouts and instructions, are in the hands of local agencies. The success of some vendors in selling competing systems is certainly due, in part, to the uncertainty about the availability of the Census materials. I do hope that as the materials become available, as much of our effort as possible will go to this type of system, rather than competing systems, where we will not be able to use, compare or match with much other local data.

Dr. Stevens--LEAA works very closely and has strong ties to the State governments, the State Planning Agencies in each State. Do you see this as an avenue?

Mr. Sylvestre--The major part of our grants are to the States. Eighty-five percent of our grant money goes directly to the States as a simple mathematical function based on population. We do not have any say about that in terms of giving less to a State because it does not have a good information system. The State Planning Agency reviews the applications that are made and presumably, in a planning fashion, allocates funds for various purposes, including information systems or a regional information system that would include a GBF. I think, however, personnel in most criminal justice agencies have not heard very much about this. Mr. Hill, who was here yesterday, had heard about it, but not in much detail. I know he found this session very informative. I think the cities in Texas will benefit from what he has heard so far.
He knows enough about it that if someone were to walk in with a problem he could help them or know to refer them to the right places. I think, though, we have a problem with most of the States. I am not sure that most of the State Planning Agency staffs know about it. I do not think enough city police departments know about it. I think we have to face one other thing, a kind of jealousy of police departments that they might want to stick with or develop their own system. They are going to have to be shown that there are advantages to their participating in a cooperative system rather than just buying their own system from a vendor. It is a selling job that other people in city planning departments or regional planning departments are going to have to do to make sure that police departments do not take off in a separate direction. Some police administrations are not sophisticated in this type of thing, but some of them are. Some have been persuaded that they need locations accurate down to 2-1/2 feet. The people here know about the problems and the pitfalls that a program will suffer during file conversion, editing, correcting, etc., and that the GBF has gone through most of this already. In fact, on this basis the GBF's should be well ahead of other geographic systems.

Mr. Bohnsack--To put some perspective on the uses that I might suggest, I might say that we are from a department of Management Planning and Systems which has to do with management analysis, computer analysis and data processing. Our prime role, working from the office of the mayor, is to attempt to increase the performance and delivery of services of all departments within the city. In that respect, we have elected that the most probable way to get improvement is through the construction of information data bases for use by all agencies and all departments.
In that connection, we have contracts, exclusive contracts, with both the Metropolitan Area Planning Commission and with the Council of Governments so that we talk about information from both the operating departments' and the planning peoples' point of view. Commenting on the GBF in that framework, I have to say that first, the value of it is evidenced by the number of people who are impatient and frustrated in awaiting it and by the number of people who have gone down their own lines in attempting to develop something similar. Never have I heard a greater argument, even from USAC, for the institution of integrated data bases. Here we talked about many applications. I think it might be unfortunate that we have referred to it as a "file" in the GBF's. Rather, it is a listing of locator or geographic descriptor items which have little value except as applied to some specific data that can be accumulated. Looking at it from the viewpoint of the integration process, I think probably the Census Bureau did not know what box they were opening when they brought this thing up to begin with. I think they are to be commended for following down this line, and we should not kick either them or ourselves because we refer to so many things as "developmental processes" because that is exactly what we are doing. When we start automating something we are developing some new processes, and rather than really coming up with new problems, I hear evidence of the same old problems that have always existed. We have always had mapping problems. We have resolved some of them, in some ways manually; but as we put it on computers we have to resolve them in an automated fashion. So it is the same problems evidencing themselves in the automated role. We hear comments about who shall take the leadership, and we hear comments about some agency ought to oversee this. We hear many, many comments and long lists of possible applications for which the GBF might be used.
Obviously, in any information system, aside from the problem of pulling the data together from the very many functional areas, there are some organizational problems within the municipal, State and Federal governments in seeing these kinds of projects through. In the past we have all been organized along vertical, functional lines, such as the Department of Transportation to the State Highway Department, to the city and county engineers or traffic engineers. That is fine if you want to talk about using this thing simply as a tool in transportation. But to get across the lines from transportation, to health, to police and all of the other agencies involved, it takes a great deal more integration in terms of the data files from which we will draw information when we use this GBF, and that takes some coordination all across the municipal level. Likewise, I do not think States, as many have in the past, can look out the window and act like this thing does not exist or there is no problem. Funding is always a problem, of course, and those of us on the local government level have a tendency to look to the Federal government for a dollar sign. The Federal government simply cannot take the position that it is a purely local problem, primarily of local value in the use of the GBF. It was the Federal government that decided that one of the goals of the nation should be transportation planning and that it should be carried on with the State and local governments. It was the Federal government who brought up the housing program, the health programs, and many other programs that local governments must relate to. Much of the work that we do on the local government scene--and this includes information services--is in direct response to Federal programs. Many of us--and there are a great many working on the local level--are attempting to put together these integrated data bases and make some kind of improvement in terms of how we can perform and the cost involved.
That is why we stress the integrated nature--to cut the cost. It is difficult for us to relate to each Federal agency individually while we are trying to put something together in an integrated fashion. Yet, we find little on the Federal level to relate to in an integrated way. I think, too, that we need to look at the GBF from the following point of view. Most of us have been awaiting such a thing for a long period of time; therefore, we have a great impatience to put it into immediate use to show its value on some pressing problems that we have. We are frustrated when we find that we cannot use it that rapidly. Instead, we find that it is a development project. If it is to be done correctly without a lot of things to be redone at a later date, we are going to have to approach it that way--in a very systematic manner. We should not expect overnight success from the thing or overnight completion of it. It is going to take a great number of years as we put it all together. We at the local level get many funds from the Federal government. If we proceed in a way to use this GBF by using the funds from each unit of Federal government in trying to take care of the immediate problem and, at the same time, building in the on-going system of updating the data bases which the GBF can reference, then we might get dual mileage out of those funds. We had the question here earlier, "What is being done to help the mayor in the decision-making process through this file?" Of course, that is a consideration. It can be realized in the integrated approach. I think we all tend to be too impatient, and we kick each other and ourselves too much for the time it is taking. It is a tremendous undertaking. It is going to require a great amount of coordination. As was stated earlier, there are doubts that it could be justified from the viewpoint of the use of a single department.
That can be said of any integrated data base because it can be defended on cost only through wide use by a number of departments. That thought implies that someone on each level is putting together the over-all file structures across all departmental lines. It also implies that on the Federal level, if this thing is going to be successful, we are going to have to have an integration in information systems across functional lines to economically and correctly put this together so it can be a good operating tool for everyone. But, please, people on the Federal level, you do have a concern in it. You are asking the municipalities and the State people for information. You are even sending down funds for them to gather data on a one-time basis. Make it easy for them to use that data in an on-going program so it does not have to be done over and over and over again. Those are huge costs involved for very limited use, unless something is done along this line to structure an integrated on-going information system.

Mr. Pierce--We have explored a lot of things and I am pleased that Mr. Bohnsack and others have summarized them so well. I think we should go a little further and get down to charges or suggestions of charges that can be levied on particular agencies to make these things function. I am kind of up to my eyeballs in this total, comprehensive, areawide planning and management, development, decision-making process. Our area, for instance, extends from the most rural to urban areas with a wide variety of problems. Obviously, sophisticated planning and management techniques and tools, such as the GBF, are most appreciated and useful in the urban areas. I would like to focus on the related decision-making process because I think this is the business we are in. I would also like to focus on the problem that we have with selling the systems concept and the tools which can allow our systems to work.
If decision-making is to be truly effective, then we have to support our decision makers with sound information and strong management, not planning for planning's sake. We are dealing everyday with problems at the local level on how people make decisions regarding capital investments which amount to millions and billions of dollars, that include highways, water systems, sewer systems, etc. If we are not providing sound support for our elected official decision makers and local managers in these matters, then we are wasting resources. From this standpoint, then, I think what we are saying is we are trying to find and implement better management tools. The orientation here today and yesterday has been largely focused on the idea of planning. Well obviously, as I spouted off the other day, planning is a very essential part of this management and decision making process. Planning is vital to good management; but as a part of management, there is a lot more to it than just planning and putting things on a shelf for somebody to wipe off the dust in future years. Most management tools require a sound geographic base such as the Geographic Base File System. We are, however, finding it difficult to interest various agencies in the use of such a standard system. Just in this small group we have seen the attempt to define this same kind of basic system in different ways, to put different terminology on it. One of the reasons is that people are searching for individual recognition and want to do their "own thing" instead of using the systems that we have and improving and developing them. This is a problem which I find very frustrating. At the same time I see some things which are very encouraging. Without belaboring this, I think we ought to talk about the need to improve our channels of communication and the charges we collectively or individually might impose upon our Federal agencies.
The selling job to the local officials to encourage the use of these kinds of systems is heavily dependent upon good channels of communication, both forward and backward, upward and downward. The Bureau of the Census has a substantial investment in the Geographic Base File System, which is fundamentally sound. It is not being used, yet they have invested many, many dollars in publications, and here we get an indication of phone calls to the Bureau of the Census asking, "Can you tell us what is a Geographic Base File System?" These documents that you have here have been widely distributed all over the country. Why the ignorance? Is it that we are so enmeshed in a confusion and profusion of information that we do not have time to understand these fundamental things? I do not know.

The reason that I am encouraged is that I do see a lot of things going on at the comprehensive level, regional level, with associations of local governments working together and developing better management and decision-making processes. It seems that the present realignment of Federal regional groups (so that there is logic among Federal agencies on a geographical area basis) may improve our channels of communication, much the same way that the associations of local governments formed into sub-State areas and got the locally elected officials working together in a comprehensive way. These are, or can be, the channels of communication, but why aren't they effective?

This is a charge I would like us to impose on the Bureau of the Census and other Federal agencies, even though I know they have tried: that they introduce, as I said yesterday, into the groups at the Washington level, at the regional level, at the area level, at the sub-State area level, knowledge about these fundamental management tools which will make decision making more effective.
Again, I would like to impose a charge on these people to try to find the mechanism for communication so that we do not have to sit around and share each other's ignorance.

Mr. Meyer--It has been an excellent, interesting, informative, and worthwhile conference, and I would like to extend the appreciation of the Census Bureau to all those who made it possible: to our host, the North Central Texas Council of Governments, in particular, and to Dr. Stevens for his deft guidance of the proceedings.

I would like to re-emphasize two of the points made earlier. First, funding for the sole purpose of maintenance and updating the Geographic Base File System is highly unlikely. However, where the file is used as a base for a management and information system, funding for updating and maintenance activities can be included as a part of the cost of the system. The key phrase, as I see it, is "Use of the file."

Second, the Census Bureau recognizes that knowledge of the existence of the GBF and its potentialities needs to be disseminated more widely, and the Census Bureau will do more in this area. The Census Bureau, however, cannot accept responsibility to act as the GBF information clearing house. We can participate, but the primary role can only be carried out effectively by a professional organization, URISA for example.

URISA may also be the organization to communicate to the young people of today (our replacements) the knowledge that the GBF is the management tool of the future. Perhaps one way of accomplishing this would be through the universities. Every computer science course could very well incorporate some meaningful instruction on Geographic Base Files, as could the curricula of the schools of business administration, just to name a few areas where exposure to the GBF concepts could be beneficial but is not now likely to take place.
In conclusion, I would like to indulge in some philosophy, though perhaps poetry might be a more apt description. These are the developing years of the Geographic Base File System. In 1990, when all streets have but one name and all houses have an official street address; when spelling mistakes and punching errors are a thing of the past; when all maps are perfect and all coordinates exact, we will describe this period of excitement and development as the golden age of the Geographic Base File System.

SUMMARY OF PROCEEDINGS

Donald F. Cooke

First of all, I am sure I am speaking for all of us here in thanking Dr. James Stevens, the North Central Texas Council of Governments staff and all our hosts in the Dallas-Fort Worth-Arlington area who have worked to make this conference a great success; also, Messrs. Meyer and Silver of the Census Bureau's Geography Division for their organizational and inspirational work at this conference.

I am here as the chairman of the URISA-SIGGBF (Special Interest Group on Geographic Base Files), and I think it is appropriate to tell you something about this organization. First of all, URISA stands for the Urban and Regional Information Systems Association. It is an international professional group dedicated to the advancement of information systems related to urban areas. The SIGGBF was formed 14 months ago and currently has 472 members. You do not have to be a URISA member to join, although it helps since much of the Special Interest Group's communication is through the URISA newsletter. Also, membership in SIG is free.

The SIG is functioning primarily as a communication medium. We had two days of working sessions at the September annual URISA convention with an average of 45 people attending each session. Sixteen papers were presented, and I have had them published in a 130 page booklet called GEOCODING-71.
We have been talking about "Geographic Base File Systems - Uses, Maintenance, and Problem Solving," and I will attempt to summarize what I have heard about "Uses, Maintenance, and Problem Solving" during the formal sessions and in our after-hours discussions.

First of all, I am very impressed with the uses described here in the last two days. Geocoding has come a long way in a hurry in the last few years. The number of agencies routinely maintaining and using large files (250,000 to 1,000,000 records) is a real surprise to me. That is quite an accomplishment. I am sure anyone who has worked with a file like that knows that it is quite different from working with a few boxes of cards.

Software has come a long way too. I am very impressed with the GEOPLAN System, and not only for technical reasons. It has been around for several years now, and it looks as though it has established quite a track record in a large number of sophisticated applications, especially the health applications that were described here.

The Wichita Falls GBIS system shows us a different, but new and exciting way to look at a DIME file; that is, as a storage, retrieval and organizing tool for data in an integrated municipal information system. I think four years ago, in 1967, there were not many of us who would have believed that a DIME file would be used in such a sophisticated application in operation in 1971.

The NAPS system which Mr. Johnston described certainly exercises all of the topological and geographical muscle in DIME, using it for distance calculation, connectivity, adjacency, address coding, allocation of data to network segments and computer mapping.

There is a lot more to the use of DIME files than just technical sophistication. There is a lot of work to be done in educating potential information system users and developing a rich base of reliable local data. Dr. Lundberg's presentation brings these points out in his system.
Its technical simplicity underlines the importance of developing faith in the technical resources and the operators of the system. Reliable tract-level data is far more useful than unreliable block-face data. It is a matter of being able to walk before you run, and I strongly suggest for anyone going into a parcel system, for instance, to really master the street segment level systems first.

Incidentally, if you are using your ACG or DIME file for coding just to the tract level, there is a trick you can use to increase the effectiveness of the geocoding process. We were working on a project that involved coding several million individual addresses to the census tract level. To save on processing time, we decided to collapse the geocoding file to the minimum number of records possible. For a street that was wholly inside a tract, we compressed it down to one record, with an effective address range of zero to infinity (ADMATCH users would need two records, one for odd addresses and one for even). If a street crossed a tract boundary, we did as much compressing as we could to minimize the number of records for a given street and to maximize the range of allowable addresses where possible. We found that in addition to the expected decrease in processing time, we got a tremendous increase in the percentage of records matched, from the mid-70 percent level to over 90 percent. We now use this technique wherever possible.

To get back to Dr. Lundberg's presentation, I was struck by the tremendous emphasis on business and commercial data files. Similar emphasis was in Mr. Fox's paper, where utility companies supplied data on new addresses. We have to recognize and exploit the great interest which private enterprise has shown in geocoding technology and, as Dr. Lundberg points out, the wealth of data which they maintain which could be of use in an urban information system.
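
The record-collapsing trick described above can be illustrated with a short sketch. It is an illustration only, not the procedure actually used on that project: the record layout `(street, parity, low, high, tract)`, the function names `collapse_to_tract` and `tract_for`, and the rule for widening ranges at tract boundaries are all assumptions. A street that lies wholly in one tract collapses to one record per parity with a range of zero to infinity; a street that crosses a tract boundary keeps one record per run of same-tract segments, with each range stretched to meet the next.

```python
from itertools import groupby
from math import inf


def collapse_to_tract(records):
    """Collapse (street, parity, low, high, tract) records to tract level."""
    collapsed = []
    keyed = sorted(records, key=lambda r: (r[0], r[1], r[2]))
    for (street, parity), group in groupby(keyed, key=lambda r: (r[0], r[1])):
        # Merge consecutive segments that fall in the same tract.
        runs = []
        for _, _, low, high, tract in group:
            if runs and runs[-1][2] == tract:
                runs[-1][1] = max(runs[-1][1], high)
            else:
                runs.append([low, high, tract])
        # Assumed widening rule: open-ended at both ends of the street,
        # and each run extended up to the start of the next one.
        runs[0][0] = 0
        runs[-1][1] = inf
        for current, following in zip(runs, runs[1:]):
            current[1] = following[0] - 1
        collapsed.extend((street, parity, lo, hi, tract) for lo, hi, tract in runs)
    return collapsed


def tract_for(collapsed, street, number):
    """Return the tract for an address, or None if no record covers it."""
    parity = "E" if number % 2 == 0 else "O"
    for rec_street, rec_parity, low, high, tract in collapsed:
        if rec_street == street and rec_parity == parity and low <= number <= high:
            return tract
    return None


# Made-up sample data: MAIN lies in one tract, OAK crosses a tract boundary.
records = [
    ("MAIN", "E", 100, 198, "0016"),
    ("MAIN", "E", 200, 298, "0016"),
    ("MAIN", "O", 101, 199, "0016"),
    ("OAK", "E", 2, 98, "0021"),
    ("OAK", "E", 100, 198, "0022"),
]
collapsed = collapse_to_tract(records)
```

With ranges widened this way, an address that falls in a gap or beyond the highest coded number still receives a tract code, which is one plausible reason for the higher match rate reported above.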
I can personally back this up with the fact that our company is operating very successfully with largely commercial clients. Many very large banks are interested in these techniques for understanding the market they are reaching, and also in planning the location of new facilities, branch banks primarily.

On other aspects of uses, one factor disturbed me slightly, and has worried others here not so slightly, and this is the emphasis on geographic data processing for planning rather than for management. We have got to change this emphasis, although I am not exactly sure how. The thing to remember is that the managers are the ones who pay the bills. If we can influence them, I think the funding problems will become a lot easier.

Our second subject is maintenance, and the main presentation was by Mr. Meyer from the Census Bureau. He described the general procedures the Bureau will follow for maintenance, and I am glad to see these stated in concrete terms. However, there is still one fine point in the technical specifications which I have not seen mentioned in any of the literature. This point has to do with renumbering nodes which find themselves on tract boundaries when tracts are split, as they undoubtedly will be, before the next census. As you know, there are two main node (point) numbering sequences: one for nodes on tract boundaries and one for points wholly within tracts. The latter sequences of nodes must be qualified by tract number for uniqueness, and they will be affected by tract number changes. Aside from this point, the procedures for DIME file updating are pretty well defined.

On the subject of maintenance, there is another technical point worth bringing up. There is always talk about concurrent maintenance of the DIME file and its source map, usually the Metropolitan Map Series for the area.
I have heard that a DIME file is a map, in computer form, and that with the proper software, you can reconstruct the Metropolitan Map from the DIME file. The DIME file, in fact, has a higher information content than the Metropolitan Map because of the address information. If all this is true, then can't we just maintain the DIME file, and generate hard-copy maps by computer as they are needed? Someday, I am going to test this idea by burning the source maps for one of our proprietary DIME files. If this works, it could save us all a lot of time and money maintaining the identical information in two different forms.

To move on to problem solving, what I really want to talk about here is trying to define what I consider the most important problem we are facing right now in terms of "Uses, Maintenance and Problem Solving with Geographic Base Files."

In a very real sense, I am standing in for Dr. Edgar Horwood, Director of the Urban Data Center at the University of Washington, Seattle, who was to present the summary statement at this conference but was unable to make it at the last minute. Those of you who know Dr. Horwood know that he refers to these geocoding meetings as the floating crap game that bobs up here and there around the country. The cast of characters varies some, but not an awful lot. I have been dealt into a large number of these games in the past few months. For example, yesterday was the fourth presentation on GBIS by Mr. Hanssen that I have heard in the last nine weeks. I hope he will not take this personally, but the number of players in this geocoding game is part of the problem I am trying to define.

Another part of the problem can be defined by a once popular definition of programing expertise which sort of makes me grit my teeth when I hear it: "If you have read the manual you are an expert; when you have written your first program, you are an experienced expert."
Those of you who have read the ADMATCH manual know just how far you are from becoming an ADMATCH expert with that much experience. The fact of life is that you cannot learn data processing by reading books. That is another part of the problem I am defining.

Incidentally, I would like to comment on Mr. Renshaw's experience with ADMATCH, which I think bears out this point. He said that he did not find the manual all that useful. What he did find useful was a very simple controlled experiment to find out what ADMATCH did with a specific address. Again, experience is what really counts.

The key to the definition of this problem came to me at the URISA conference in New Orleans this year. There was a working session, concurrent with the one I was at, on the transferability of the USAC Integrated Municipal Information System results. In the summary of that session, which was chaired by Mr. Edward Hearle, he made the point that the most important transferable product that will come out of USAC will not be software or system designs, but experienced experts in urban management information systems. Mr. Hearle meant real experienced experts, not just the people who have read the manuals. He meant people who have pitched in, who have gotten their hands dirty, and who have their first 10,000 mistakes behind them; those that have solid experience.

This brings me to the definition of the problem facing us now. How many experienced experts in DIME file maintenance and use are there in this country? We can practically name them all, there are so few. Dr. Aangeenbrug and a number of his staff members would qualify as experienced experts working with DIME files. Messrs. Crellin, Farnsworth, and some of the other SCRIS and Census Use Study staff also have had the experience that would put them into this class. I think Mr. William Maxfield and I paid our DIME file dues at Urban Data Processing, Inc. and with the Census Use Study. Messrs.
Almendinger, Kevany and Totcheck, formerly of Systems Development Corporation, also got their hands dirty in DIME files and have moved on and produced and used files independently elsewhere. I am pretty sure that the Geography Division's efforts in these past few years have produced a number of grizzled DIME file veterans also.

There may be a few others, too, that I have not mentioned, but who can say: "I can take any one of the Census DIME files, edit it, correct it, check and update the coordinates, make a NICKLE file, preprocess and sort the NICKLE file with ADMATCH, then preprocess, sort and match a local data file, tabulate local data by census tract, make a GRIDS or SYMAP map, or produce a selective listing of the data by census block." I am afraid that there are only 10 or 15 people, maybe 20 at the most, who can make this statement with the authority of experience.

That is the problem I am defining. We need at least 196 people (one for each SMSA with a DIME file) who can say that. We need 196 professionals, each of whom can maintain and use, and I emphasize use, a DIME file.

Here is why we need such people, and not just a local coordinator who is going to run FIXDIME and UPDIME every six or twelve months or so. First, there is a very good technical reason why "use" and "maintenance" cannot be separated, which also came out in several of the presentations here. I will quote Mr. Prince's report, which gave the results of ADMATCH runs using local data and the ACG. They "found missing information, inconsistencies and errors in spelling, invalid census tract numbers, unnecessary abbreviations ..." He goes on to say that, where possible, these problems were corrected. The lesson here, which has been borne out in my experience with DIME files over the last four years, is that use of the files in actual operations with local data is the best and most pertinent indication of problems in the geographic reference file, be it an ACG or a DIME file.
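
The point that use exposes reference-file problems can be sketched in code. The example below is a hypothetical matcher, not ADMATCH or SAMS: the record layout and the names `match_addresses` and `maintenance_worklist` are assumptions. It codes local addresses against a small reference file and keeps every reject, with the reason, so that the rejects become a list of places where the file needs maintenance.

```python
from collections import Counter


def match_addresses(reference, addresses):
    """Code (street, number) addresses against (street, low, high, tract) records."""
    matched, rejects = [], []
    for street, number in addresses:
        hits = [
            tract
            for ref_street, low, high, tract in reference
            if ref_street == street
            and low <= number <= high
            and low % 2 == number % 2  # same odd/even side of the street
        ]
        if len(hits) == 1:
            matched.append((street, number, hits[0]))
        elif not hits:
            rejects.append((street, number, "no covering range"))
        else:
            rejects.append((street, number, "overlapping ranges"))
    return matched, rejects


def maintenance_worklist(rejects):
    """Count rejects per street and reason, most frequent first."""
    return Counter((street, reason) for street, _, reason in rejects).most_common()


# Made-up reference file with a gap (200-298) and an overlap (380-398).
reference = [
    ("ELM", 100, 198, "0016"),
    ("ELM", 300, 398, "0016"),
    ("ELM", 380, 448, "0021"),
]
addresses = [("ELM", 150), ("ELM", 250), ("ELM", 390), ("ELM", 252)]
matched, rejects = match_addresses(reference, addresses)
worklist = maintenance_worklist(rejects)
```

A gap that no local address ever falls into never appears in the worklist, so in this sketch effort goes only to the parts of the file that the data actually exercise.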
If there are "holes" in the address range information, they will not hurt you if there are no data in the missing address ranges. If there are data in the "holes," the missing ranges will be pointed out immediately the first time you try to geocode with the file. Furthermore, if you are using the SAMS address matching system, overlapping address ranges will be flagged as match rejects. To restate this lesson: Use of the GBF is the best way to pinpoint areas needing maintenance.

A second argument for combining "use" and "maintenance" is that the use of ADMATCH programs, mapping routines, cross-tabulation systems, and other tools is the best training that a potential "experienced expert" can get for DIME file maintenance. In fact, the person who does get his hands dirty working with these files will probably discover a very simple truth: a DIME file is, in fact, just another computer file, and the maintenance of a DIME file is the same as the maintenance of any other computer file; it can be accomplished by any standard file maintenance system. I am afraid that the Census Bureau, and SCRIS in particular, may be guilty of overselling the complexity of file maintenance for these files. If the operator has a general understanding of file maintenance, it becomes quite clear what he has to do to update a DIME file.

Finally, and most important, it is only through the use of these files that maintenance is going to be justified. It will be several years before any city can justify geographic data processing on a clear cost-benefit basis, if for no other reason than the tremendous job of education of users that lies before us. Let me state this truth once more. It is only through the use of these files that a maintenance program can be justified. Mr.
Meyer has said it another way, in a form which should be an inspiration to us, "Where the files are being used and found to be useful, the funds required for the update and maintenance will be forthcoming."

This is the problem that faces us: Development of a resource of about 196 people who are experienced in DIME maintenance, but mainly DIME file use. Each SMSA needs at least one person responsible for the DIME file, capable of training other staff members, willing to educate city officials in data use for management and all the time producing useful output with the geographic data processing tools. Of course, 196 people are not enough; the number is more like 1000.

I do not know how to attain this goal, but I am going to list a few activities that I feel are headed in the right direction. At the Federal level the efforts of SCRIS and the Census Use Study are very noteworthy. SCRIS's software also is a tremendous aid to people who want to use these files. Also at the Federal level, the USAC cities of Wichita Falls, Charlotte, Reading, Dayton, Long Beach, and St. Paul are a very good training ground for users of these files, but at a more sophisticated level than is needed right now.

There are some professional organizations that can help as well. The activities of the AIP Information Systems Department and URISA are noteworthy, especially since there are plans for orientation sessions for new members at the URISA meetings next year. These will include descriptions of geographic tools that are available. URISA is also doing some soul-searching about its function as a professional organization, and it is considering professional training as a possible function of the Association.

In private industry, I can report on several companies which are training local data users in data processing techniques, as well as supplying software for use of geographic tools.
Universities can do a lot for training, although I am afraid the classroom academic requirements usually prevent the total immersion which is really necessary for rapid development of the necessary skills.

The local government level, however, is where the bulk of the cost and effort must come from. As a minimum, one capable person must be allocated full time to the DIME geographic data processing job. He must be given adequate computer time, and a budget for buying data tapes and software and for acquiring some outside technical assistance. I am starting to get a fairly good feeling for what this means in dollars for a municipality, and I am encouraged to see that it looks as though a municipality can accomplish quite a lot with less money than you might suspect.

There are probably several other approaches, and we must try them all if we are to succeed in building this cadre of local experts. In closing, I hope that this meeting will start us all thinking and working toward this end.

A GENERAL SUMMARY OF ERRORS IN THE DIME FILES

Prepared by the Geography Division, Bureau of the Census

The error statements presented in this summary represent the condition of the DIME files as they existed at the close of the current series of processing cycles. They do not reflect changes in local conditions since the files were created in 1969 and 1970. The Geography Division is now beginning a correction, update and extension program on a continuing basis in cooperation with many of the local agencies which participated in the original DIME program and with new agencies currently wishing to participate. As this program develops, the "feedback" to the Census Bureau by the local areas of their corrections and additions will result in a more accurate, complete and up-to-date DIME file.

The DIME files (with or without x-y coordinate information) are now becoming available.
The files have gone through several computer edit and correction cycles; nevertheless, there still remain a sufficient number of residual errors that these problems should be brought to the attention of those agencies planning to obtain a copy of the file for their area. Due to limited funds available to the Bureau for clerical correction of the files and the demands made on the Bureau to release the files at the earliest possible date, many of the residual errors could not be corrected. In addition, there were many cases, such as in the street name, type, prefix or suffix direction fields, in which the Census Bureau did not have the necessary reference materials to make corrections. At other times, computer processing generated errors which, because of the time limitations, were not subsequently corrected.

To assist those who will be working with the files, and to make all users aware of the problems in the files, we have listed below several of the more common "error" situations.

1. High level codes (High level codes are considered to be the State, county, minor civil division (or census county division), congressional district, place, ward and annexation codes) - Tract and area codes for each DIME file record were matched to the Bureau's Master Reference File of geographic codes (MRF), from which the high level codes were assigned to the DIME file. If a segment record carried a combination of area code and tract that did not exist in the MRF, the high level codes could not be assigned to one or both sides of the record. Up to 1-1/2 percent of the records may have this type of error.

2. Enumeration Districts (ED's) (Non-mail census areas only) - ED numbers were assigned to each DIME file record by matching the area code, tract and block in the record with the MRF. If a record carried a combination of area code, tract and block number that did not exist in the MRF, ED numbers could not be assigned to one or both sides of the record.
Up to 1/2 of 1 percent of the records may be in error for this reason.

3. Block chaining edit program - Each census block was chained from node number to node number around the block to insure that all sides of the block were coded. If a portion of a block side was not coded, was coded incorrectly, or if extra records were coded to the node pair or block number, all records for that block were rejected even though not all the records for the block were in error. Up to 5 percent of the blocks in the file could not be completely chained for one reason or another. The number of error records involved will be well below this level, however, probably closer to 2 percent.

4. Segment identification - There are a variety of errors existing in the Segment Identification fields (Name, Type, Prefix or Suffix Direction).

a. Key punch errors - This type of error is self-explanatory. However, there are a few cases that should be specifically pointed out. These include inconsistent spacing between words or character groups within the name field; shifting of the name one or two characters to the right or left of the normal position; numerics mixed in with an alphabetic name or vice versa; erroneously repeated street "prefix direction," street "type" or street "suffix direction" abbreviations; nonstreet codes entered for street segments; and missing or erroneous nonstreet codes for nonstreet segments.

b. Truncation of street names (Mail census areas only) - During the initial processing of the file, an error in the program logic for shifting street "types" into a separate field from the "name" led to the truncation of the last six characters of some street names. For example, "BROADWAY" would be converted to "BR" with "WAY" in the street type field. This occurred when there was a combination of (1) no separate street type coded and (2) the last letters of the street name were recognized as a street type.
If this combination of circumstances occurred to a name with six characters or less, it resulted in the blanking of the name field. "Blank" street names, as a result of the file sequencing process, are always located at the beginning of the file.

c. Type and/or suffix direction field (Mail census areas only) - A flaw in the computer program logic resulted, at times, in the street "type" or street "suffix direction" not being shifted out of the "name" field and into separate fields. In some instances the program inserted incorrect codes (this was usually an E or W) in the "suffix direction" field in place of the legitimate entries.

d. Variant spellings - No major effort was made on the part of the Bureau to check out and correct the variant spellings or abbreviations (either coded locally or resulting from computer error) for the same street or nonstreet feature. For example, the name GREEN might also appear as GREENE or GREN, or the PENNSYLVANIA RAILROAD might appear as PA RR or PENNA RR. There is no practical way for the Census Bureau to measure or determine what percentage of the features have this problem.

5. One-sided segments - The rules established for constructing a DIME file required that all records appear as "segments," that is, between any given pair of nodes the record would have geographic codes for areas on both the left and right sides of the feature. The only legitimate exception to this rule was a situation in which the segment was at the outer limit of the coded area, in which case only the "inside" side of the segment should appear. For the mail census areas that participated in preparing DIME files, one-sided segments were corrected and matched to form segments until 1-1/2 percent or less of the total number of records remained unmatched. The differing procedures used for preparing the DIME files for nonmail census areas made it very unlikely that illegitimate one-sided records exist.
However, there is always the possibility that there may be a few.

6. Address range - Street address ranges in the DIME files for nonmail census areas were used at the tract level in coding responses to the "work trip question" in the 1970 Census of Population and Housing. Consequently, these files were edited and corrected until they contained a 5 percent or less residual error in the address range. The DIME files for mail census areas were not prepared until after the 1970 census, and because of limitations of both time and funds, there was no local review or subsequent edit by the Census Bureau of the street address ranges within these files. Unknown, therefore, is the level of error in the files due to the local addition of new records with unchecked address information and the keypunching of these files during the program.

The types of errors that exist in both the mail and nonmail files include overlapping address ranges between adjacent street segments, gaps between the address ranges of adjacent street segments, parity errors where the odd and even number ranges appear on the wrong side of the segment, or where the low and high addresses of the same segment are not both "odd" or both "even."

An address range of zero will also be found in the file for a number of street segments. This usually occurred when the local agency preparing the file could not find adequate reference materials to determine the correct range. A zero to zero address range was also used to code segments along interstate highways, which normally do not have addresses assigned to them. Also, in the process of resolving block chaining errors, when a segment or segment side was missing, the Census Bureau staff did not attempt to impute the address range. Therefore, these segments will also contain zero address ranges.

7. ZIP code - ZIP code errors were corrected in nonmail census areas until the file contained 5 percent or less residual error.
No corrections could be undertaken in the mail census areas, and the level of error in these DIME files (again because there was no local review and subsequent edit by the Census Bureau) is unknown. Records added during the preparation of DIME files in mail census areas will not contain ZIP codes. However, from time to time spurious codes, such as alphabetics, are found in this field. In addition, there are records where, because of processing failure, the ZIP code is not consistent with the general ZIP code of the area.

8. Coordinates - Each node point identified in a segment record should have x-y coordinate values assigned. If, however, a node point was overlooked, or if the information identifying the node point (map sheet number, tract number or node number) was recorded incorrectly during the original coding, or during the digitizing, the node point was not assigned coordinates. While most of the areas have less than 5 percent of the node points to which no coordinate values were assigned, this percentage may be as great as 10 percent in mail census areas.

It should also be pointed out that the accuracy of the coordinate readings in relation to the earth's surface is dependent upon (a) the accuracy of the drafting of the features on the map, (b) the placement of the node dot on the feature (whether it was "right on" the street intersection or slightly off), (c) the accuracy of the digitized reading, and (d) the digitizing clerk who may have incorrectly identified node points so that the coordinate values of a different part of the map sheet were assigned. In addition, there were random malfunctions of the electronic digitizing equipment. Those electronic failures that were undetected may have assigned incorrect coordinate values to some nodes.

AN INTRODUCTION TO GIST
NEW YORK CITY'S GEOGRAPHIC INFORMATION SYSTEM

by Robert Amsterdam

Under the Executive Direction of E. S.
Savas, First Deputy City Administrator, and Harry Lipton, Assistant City Administrator

City of New York
John V. Lindsay, Mayor

Office of the Mayor
Office of Administration
Timothy W. Costello, Deputy Mayor-City Administrator

May, 1971 (Revised November, 1971)

For purposes of these proceedings, Mr. Amsterdam has revised this publication and expanded Chapters III and IV from what was contained in the original report distributed in May 1971.

Introduction

The Office of Administration of the Office of the Mayor initiated the development of GIST (Geographic Information System) as part of its responsibility to improve the effectiveness and efficiency of New York City's government.

GIST is a computer-based system which makes information on the basic physical, social, and economic aspects of New York City more accessible to all city agencies. It is believed to be the first general-purpose information system to be used routinely in the municipal operations of a major city. GIST was developed through the joint efforts of the Office of Administration and the Department of City Planning, with the cooperation of many other agencies.

The present GIST capabilities consist of computer files and programs which can aid in manipulating and analyzing a wide variety of city data. Building addresses can be automatically reformatted (thus overcoming the normal variations in writing street names) to produce computerized files which can be sorted and matched by address against a master address file. With this done, useful geographic identifiers can be added to a data file. For example, any address can now be equated to its tax block number, census tract, census block, health area, community planning district, and ZIP code.

In addition, building addresses and familiar locations can be identified by fixed x-y coordinates in terms of a standard grid system.
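Once locations carry x-y coordinates on a common grid, simple spatial measures reduce to arithmetic. The following is a minimal sketch, not part of the GIST programs themselves; the function names and the sample coordinates are hypothetical, and straight-line distance in grid units is an assumption.

```python
import math

def distance(a, b):
    """Straight-line distance between two (x, y) grid points, in grid units."""
    return math.hypot(b[0] - a[0], b[1] - a[1])

def count_within(center, radius, events):
    """Number of event locations no farther than `radius` from `center`."""
    return sum(1 for e in events if distance(center, e) <= radius)

# Hypothetical coordinates, invented for illustration only.
clinic = (1000.0, 2000.0)
calls = [(1003.0, 2004.0), (1300.0, 2400.0), (1000.0, 2000.0)]

print(distance(clinic, calls[0]))          # 5.0
print(count_within(clinic, 10.0, calls))   # 2
```

The same two functions are enough to tabulate the frequency of an event around any chosen point; routing and districting would need considerably more than this.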
Given these coordinates, it is possible to calculate the distance between any two locations, to find the frequency of a given event within a selected area and to make other types of spatial analysis, such as creating compact routes or drawing feasible district boundaries.

Lastly, a system has been established to produce maps automatically, such as those shown in the attached figures. These maps can frequently give extra meaning and clarity to a set of statistical data.

To date, GIST has been installed and used at the computer installations of the Bureau of the Budget, Finance Administration, Department of Social Services, Health and Hospitals Corporation and Department of Traffic. Typical examples of the current uses of GIST are discussed in Section III.

GIST capabilities are available for use by all city agencies. If any agency has a continuing need for these files and programs, the GIST staff in the Office of Administration assists that agency in installing them and in learning to use and maintain them. For nonrepetitive or unusual uses, the GIST team assists the agency in performing the necessary data processing.

A principal support for the tools described here is a master file (GIST Geographic Base File) which must be correctly updated as the city changes. This file is maintained by the Department of City Planning and the Office of Administration. The Office of Administration distributes updated and improved versions as they are completed.

Among many other uses, these tools will be valuable for analyzing city data in conjunction with 1970 census results, particularly with census block summary statistics which will be available by late 1971. The GIST team is working closely with the city's census coordinator and with local universities to assure that adequate tools are available for interpreting this data.

II.
GIST Operating Capabilities

The following pages highlight the characteristics of GIST's operating elements:

- GIST Geographic Base File (GBF)
- GIST Address Preprocessor Program
- GIST Address Matcher Program
- Map Generator Program (SYMAP)

A. GIST Geographic Base File (GBF)

A Master Reference File which contains one record for each block-face (block-side) in New York City (240,000 block faces, 50,000 blocks) and 330 characters of data for each block face, including:

. Street name
. Range of permissible house numbers (highest and lowest numbers on each block-side)
. Intersecting street names
. Block centroid coordinates (x-y)
. 1960 census tract and block numbers
. Tax Assessor's block number and suffix
. Health Area
. Community Planning District (CPD)
. ZIP code

Used for summarizing and reorganizing agency data; for mapping, routing and districting; for address matching.

B. GIST Address Preprocessor Program

Restructures address components in card or tape data files:

. House number
. Street name
. Borough code
. Community name
. City
. County
. State
. ZIP code

Creates tape record output with 100-character prefix containing restructured address components:

. Used for matching or manipulating address records
. Required before using GIST Address Matcher Program
. Processes 100,000 records per hour (using IBM System 360/40)

C. GIST Address Matcher Program

Matches restructured data file against block-side records of the Address Dictionary (subset of Geographic Base File) on address. Will produce output tape file that contains the input data file plus any combination of the desired fields from the "matched" block-side record, e.g.:

. Health Area
. 1960 census tract and block number
. Tax Assessor's block number
. Block centroid coordinates (x-y)
. and others as available from the GIST Geographic Base File

Used to summarize, analyze and group data on the basis of address; assists mapping, districting and routing activities.
Processes 90,000 records per hour (using IBM System 360/40).

D. Map Generator Program (SYMAP)

Creates automatic maps of each borough or of entire city using available hardware (IBM S/360 line printer):

- District maps:
  . for Health Areas by borough
  . for CPDs by borough and city-wide
- Contour maps by borough and city-wide:
  . for air pollution stations
  . for census tract centroids
- Incident maps by borough and city-wide:
  . pinpoints locations of specific addresses or groups of incidents
- Network maps by borough or community:
  . uses lines of varying density to show movement from point to point

Produces a one-panel map in six minutes, two-panel map in 15 minutes (using IBM System 360/40).

Future Plans (as demand and resources dictate)

- Adding other geographic descriptors, e.g., school districts, precincts, 1970 census tract and block numbers.
- Developing a DIME-type file with coordinates for each street intersection.
- Creating a Building/Lot File with one record for each building (900,000 structures).

III. Applications of GIST¹

The following is a summary of all GIST projects either completed or in development.

A. Office for the Aging

A file of 525,000 senior citizens who have been issued half-fare transit passes has been address-matched against the GIST GBF. Ninety-three percent of the records were successfully matched and health area, census tract and block, ZIP code and block center coordinate numbers were added to these records. The resulting file is being used in an origin-destination study of the need for a dial-a-ride service to drive elderly people to health clinics and hospitals. The file will also be used in conjunction with 1970 census data to evaluate the reach of Office for the Aging programs.

B. Department of Air Resources

A series of contour maps was generated showing the distribution of various polluting substances throughout New York City.
Average measurements obtained from 27 monitoring stations were used as input data for the SYMAP program.

C. Real Property Assessment Department

In cooperation with the Assessor's office, a tape file is being developed with a record for each building in the city and a full street address for each record. There are approximately 820,000 records in the current version of this file.

¹ As of November 1971.

[Figure: TYPICAL USE OF GIST PROGRAMS. Flowchart with the following labels: User's Control Statements; GIST Address Preprocessor; Corrections; List of Rejected Records; Sort (Boro, House Number, Street Name); User's Control Statements; GIST Address Matcher; Corrections Report; Summarize and Tabulate; Summarized Data; GIST Map Generator; Density Distribution Map.]

For a study of assessment procedures, a random sample of 10,000 properties was selected and classified. Block center coordinates were identified for each parcel. These were used to print maps pinpointing each parcel for the purpose of more detailed study.

D. Department of Buildings

To assist the City Council and the Department of Buildings in drafting new fire regulations covering fully air-conditioned buildings, a listing of all office buildings over nine stories was printed from the current GIST building file.

Preliminary work has begun to capture Buildings Department data on new buildings and demolitions in order to maintain the GIST buildings file on a current basis.

E. Department of City Planning

The GIST GBF, which was developed in cooperation with the City Planning Department, was used to address match a file of business establishments and identify each record by tax block, census block, and Community Planning District. The resulting file has been used in several small area studies by the Planning Department to determine the effects of new developments on existing business activity.
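The address matching that recurs throughout these applications can be pictured with a small sketch. This is an illustration only and does not reproduce the GIST Address Preprocessor or Address Matcher: the abbreviation table, the record layout, the field names, and the sample block-side records are all hypothetical, and the rule that odd and even house numbers fall on opposite sides of a street is an assumption.

```python
import re

# Hypothetical abbreviation table. Real street-name rules are far larger,
# and "ST" can also mean "SAINT", which this sketch ignores.
ABBREVIATIONS = {"ST": "STREET", "AVE": "AVENUE", "AV": "AVENUE", "BLVD": "BOULEVARD"}

def standardize_street(name):
    """Upper-case a street name, drop punctuation, expand common abbreviations."""
    words = re.sub(r"[^A-Za-z0-9 ]", " ", name).upper().split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)

# Invented block-side records: borough, street, low/high house numbers,
# plus identifiers to be copied onto each matched input record.
BLOCK_SIDES = [
    {"boro": 1, "street": "LIBERTY STREET", "low": 1, "high": 99,
     "tax_block": "0062", "health_area": "77"},
    {"boro": 1, "street": "LIBERTY STREET", "low": 2, "high": 98,
     "tax_block": "0048", "health_area": "77"},
]

def match_address(boro, house_number, street, block_sides=BLOCK_SIDES):
    """Return the block-side record whose range contains the address, or None."""
    street = standardize_street(street)
    for rec in block_sides:
        if (rec["boro"] == boro and rec["street"] == street
                and rec["low"] <= house_number <= rec["high"]
                and rec["low"] % 2 == house_number % 2):  # same side of street
            return rec
    return None  # unmatched: would go to a list of rejected records

print(match_address(1, 45, "Liberty St.")["tax_block"])   # 0062
print(match_address(1, 150, "Liberty St."))               # None
```

A linear scan is used here for clarity; a file of 240,000 block faces would call for sorting or indexing on borough and street name before matching.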
A Census Bureau file of 70,000 retail establishments was address-matched under Census Bureau supervision to add census tract and block number to each matched record. The overall match rate was 88 percent. The resulting file will be used by City Planning in a variety of small area studies.

F. Office of Civil Defense

The Community Shelter Plan being constructed for New York City requires the allocation of surveyed shelter spaces to neighboring residential and nonresidential populations. Shelter addresses were address-matched to the GIST GBF to determine the census block and tax block numbers for each shelter. The GIST building file is being used to estimate the population of each structure for use in making specific building allocations.

G. Board of Education

The file of welfare recipients for July 1970 was address-matched to identify the census tract and block for each case. The resulting file was used by the Board to identify the number of school-age children receiving public assistance in each school district. This was used in allocating Federal funds to each community school board. A follow-up run is being made with the September 1971 recipients file.

H. Board of Elections

A pilot study was made, using approximately 100 city blocks, in which addresses of registered voters were located at their approximate positions on each block. This was used to redraw voting precincts so that the number of voters at each polling booth would be more uniform.

I. Environmental Protection Administration

A system is under development to estimate the power needs (electric, oil, gas, telephone) and refuse build-up for each block in the city. It is planned that GIST building data will be used as a basic component of this system.

J. Department of Finance

The GIST staff has produced a program to develop a building address for each property from data contained in two different record types in the Property Assessor's punched card file.
This program has been used by the Finance Department in providing addresses for tax lots involved in court proceedings.

The Tax Collector's file of retail cigarette vendors was address-matched to reformat and standardize the address of each dealer. The resulting file was used by the Finance Department to reorganize routes of cigarette tax collectors.

For a study of the real estate market, a Geospace Plotter map was prepared showing every tax block in Manhattan. Selected blocks were density-shaded to show the distribution and amount of high-valued property which had changed ownership during the prior year. The map was used as an indicator of market interest and a clue to changing neighborhood values.

K. Health Services Administration

A system is being developed to address match birth and death records weekly to indicate Health Area, Health District, census tract and census block. This will be used in compiling Department of Health reports which are presently summarized by Health Area and Health District. The addition of the census identifiers will permit other agencies to make greater use of these vital statistics data. A saving in clerical effort is anticipated when the new system is installed.

The Department of Mental Health is conducting a study of the travel routes and utilization by patients at the city's 500 mental health clinics. They used a printed listing of the GIST GBF to determine coordinate locations of each clinic.

To improve the dispersal and response times of ambulances, samples of ambulance calls have been address-matched to obtain the coordinates for each call address. The resulting data were used in simulations which justified establishing satellite ambulance stations which are providing improved service at no additional cost.

A map of Bronx Health Areas was generated for Montefiore Hospital showing distribution of their outpatients by Health Area. This was used in an analysis of hospital facilities and planning techniques.

L.
Housing and Development Administration

To assist the start-up of a new rent stabilization program, HDA's file of 150,000 apartment buildings was address-matched to identify the tax block for each building. This file is being used to obtain property tax data from financial records which are maintained by tax block number.

M. The Mayor's Complaint Office

A system is being installed to address-match all complaints for services received by the Mayor's office, coding them to Community Planning Districts. This will assist in following up on complaints which have been referred to operating departments for action.

N. Chief Medical Examiner

A series of maps is being prepared to pinpoint locations of certain categories of unnatural deaths over the last twelve months. These will include a map of addresses of drug-related deaths and a map of locations of murder victims. These maps are expected to assist law enforcement activities.

O. Police Department

A comprehensive review is being made of the Police Department's street code numbers to determine how these numbers can best be added to the GIST GBF. Through this effort GIST will be able to transmit data to the Police SPRINT system on new street openings.

P. Department of Social Services

The welfare recipients file for August 1971 was address-matched to identify Health Area and census tract for each record. Out of 428,000 records, 95 percent were successfully matched. The resulting file was used in creating summary reports for use in departmental operations.

A map was produced for the Social Services Department showing distribution of cases by Health Area. This was used in determining the location of a new field office.

IV. Illustrations

The following are a series of computer-drawn maps; they are shown here to illustrate the broad diversity and range of map types which can be prepared by GIST. Each of the computer-drawn maps was prepared by the New York City GIST project, Office of Administration, Office of the Mayor.
Figure 1 - One-Family Dwellings

This map shows New York City in white against surrounding shaded area. The map shows the locations of 1,000 randomly-selected one-family dwellings. The study was conducted to examine property tax policy. The various symbols represent 1, 2, 3, 4, 5 (or more) one-family dwellings on a block.

As this map indicates, private homes are scattered broadly through the outer boroughs. Concentrations are highest in central and eastern Queens, southern Brooklyn, north central Bronx and northern Richmond areas.

The map was prepared on a standard computer printer.

Figure 2 - Walk-Up Apartments

This map shows New York City in white against surrounding shaded area. The map shows the locations of 1,000 randomly-selected walk-up apartments. The study was conducted to examine property tax policy. The various symbols represent 1, 2, 3, 4, 5 (or more) walk-up apartments on a block.

As this map indicates, tenement buildings are more heavily concentrated in the older neighborhoods of the city. Densities are high throughout Manhattan, particularly in the lower East Side and Harlem. Densities are also high in central and eastern Brooklyn and in southern Bronx.

The map was prepared on a standard computer printer.

Figure 3 - Elevator Apartments

This map shows New York City in white against surrounding shaded area. The map shows the location of 1,000 randomly-selected elevator apartments. The study was conducted to examine property tax policy. The various symbols represent 1, 2, 3, 4, 5 (or more) elevator apartments on a block.

As this map indicates, the newer apartment buildings are shown lining the principal thoroughfares. In Manhattan, heavy lines are shown on the major east side and west side avenues. In the Bronx, the concentration is in the Jerome Ave.-Grand Concourse area. In Queens, apartment buildings are concentrated along Queens Boulevard. In Brooklyn, the density is shown heaviest along Ocean Avenue.
These heavy concentrations also parallel main subway lines.

The map was prepared on a standard computer printer.

Figure 4 - Bronx Health Areas

This map shows the Bronx segmented into Health Areas. The Health Area is one type of classification used by the city, State, and Federal government for the collection of social data. At the center of each segment is the number identifying that Health Area; it has no relationship to the data.

The shading indicates proportionately the health areas from which Montefiore Hospital draws its in-patients. Montefiore Hospital is located in Health Area 4.10, and it can be seen that most of its patients come from that Health Area.

The map is computed by first describing each of the Health Areas as separate polygons, and then applying to each a number representing the number of patients from that Health Area. The level in which the data falls determines the shading taken on by the Health Area.

The map was prepared on a standard computer printer.

Figure 5 - Manhattan Health Areas

This map shows Manhattan segmented into Health Areas. At the center of each segment is the number identifying that Health Area; it has no relationship to the data.

The shading indicates proportionately the number of people on welfare; Health Areas 16, 21, and 26 contain the greatest number of welfare cases.

The map is computed by first describing each of the Health Areas as separate polygons, and then applying to each a number representing the actual number of welfare cases from that Health Area. The level in which the data falls determines the shading taken on by the Health Area.

The map was prepared on a standard computer printer.

Figure 6 - Real Estate

This map was produced by a computer-driven cathode ray beam. The map shows individual blocks of lower Manhattan. The shading indicates proportionately the value of selected high-valued properties which changed ownership during a recent one-year period. The darker the shading, the higher the value of the property. This map was prepared for the New York City Finance Administration to aid in the study of sales trends.

The map was prepared by a Geospace Plotter. This is a cathode ray tube device which projects the image of the map onto photographic paper; the paper is then developed. This process shows excellent detail and can display numbers and letters in any size and position.

Figure 7 - Community Shelter Plan

This map shows varying degrees of shelter surplus and deficit in a study area within Queens. The map was prepared for the New York City Emergency Control Board - Office of Civil Defense. The characters (. - = 1, 2, 3, 4) each indicate increasing amounts of surplus shelters. The white lines are contour lines.

Figure 8 - Voter Registration

This map shows a small section of upper Manhattan. Each polygon represents a physical block in the neighborhood. The numbers around the edge of the block indicate the location of houses and the number of registered voters living in each. The map was prepared for the Board of Elections to assist a study of redistricting voters in order to equalize the number of voters at each polling station. This map was produced by a computer-driven pen plotter.

Figure 9 - New York City Air Pollution Map

This map shows the concentration of sulfur dioxide (SO2) within New York City, averaged over a recent month. The black lines are contour lines. The contour lines separate points falling into the different levels of pollution. The shading indicates proportionately the concentration of the pollutant. For example, lower Manhattan falls under the highest pollution level, and therefore, is the darkest area.

Each number on the map is an average reading of the air taken by a pollution measuring station located at that point. There are 27 measuring stations presently in use.
Based on these values, the computer program calculates (by interpolation) the value of the surrounding points, constructs the contour lines and fills in each contour area with the appropriate shading.

The map was prepared on a standard computer printer. In Nashville, Tennessee, similar maps are shown daily on television news programs.

[Figures 1 through 9 follow in the original; only their captions are reproduced here.]

Figure 1 - ONE FAMILY DWELLINGS
Figure 2 - WALK-UP APARTMENTS
Figure 3 - ELEVATOR APARTMENTS
Figure 4 - BRONX HEALTH AREAS
Figure 5 - MANHATTAN HEALTH AREAS
Figure 6 - REAL ESTATE
Figure 7 - COMMUNITY SHELTER PLAN
Figure 8 - VOTER REGISTRATION
Figure 9 - NEW YORK CITY AIR POLLUTION MAP
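The two computations described for the printer maps, estimating values between measuring stations by interpolation (Figure 9) and choosing a shading from the level into which a value falls (Figures 4, 5, and 9), can be sketched briefly. The text does not say which interpolation method the mapping program uses; inverse-distance weighting is assumed here as one common choice, and the function names, station readings, level breaks, and symbols are all hypothetical.

```python
import math

def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) readings."""
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value          # exactly on a station: use its reading
        w = 1.0 / d ** power      # nearer stations count for more
        num += w * value
        den += w
    return num / den

def shade(value, breaks, symbols):
    """Pick the print symbol for the level that `value` falls into."""
    level = sum(1 for b in breaks if value >= b)
    return symbols[level]         # needs len(symbols) == len(breaks) + 1

def render(stations, width, height, breaks, symbols):
    """One printer character per grid cell, as rows of text."""
    return ["".join(shade(idw(x, y, stations), breaks, symbols)
                    for x in range(width))
            for y in range(height)]

# Invented readings: (x, y, average concentration).
stations = [(0, 0, 0.10), (10, 0, 0.30), (5, 6, 0.20)]
for row in render(stations, width=11, height=7, breaks=[0.15, 0.25], symbols=".+#"):
    print(row)
```

Contour lines then fall wherever adjacent cells change symbol. For a district map such as the Health Area figures, `shade` would be applied once per polygon rather than once per grid cell.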