C /<3./C / £co - £S~ COMPUTER PERFORMANCE EVALUATION USERS GROUP CPEUG 16th Meeting ^ oe ^ NBS Special Publication 500-65 U.S. DEPARTMENT OF COMMERCE National Bureau of Standards NATIONAL E;j£*£AU Or STANDARDS The National Bureau of Standards' was established by an act of Congress on March 3, 190!. The Bureau's overall goal is to strengthen and advance the Nation's science and technology and facilitate their effective application for public benefit. To this end, the Bureau conducts research and provides: (1) a basis for the Nation's physical measurement system, (2) scientific and technological services tor industry and government, (3) a technical basis for equity in trade, and (4) technical services to promote public safety. The Bureau's technical work is per- formed by the National Measurement Laboratory, the National Engineering Laboratory, and the Institute for Computer Sciences and Technology. THE NATIONAL MEASUREMENT LABORATORY provides the national system of physical and chemical and materials measurement; coordinates the system with measurement systems of other nations and furnishes essential services leading to accurate and uniform physical and chemical measurement throughout the Nation's scientific community, industry, and commerce; conducts materials research leading to improved methods of measurement, standards, and data on the properties of materials needed by industry, commerce, educational institutions, and Government; provides advisory and research services to other Government agencies; develops, produces, and distributes Standard Reference Materials; and provides calibration services. The Laboratory consists of the following centers: Absolute Physical Quantities 2 — Radiation Research — Thermodynamics and Molecular Science — Analytical Chemistry — Materials Science. THE NATIONAL ENGINEERING LABORATORY provides technology and technical ser- vices to the public and private sectors to address national needs and to solve national problems; conducts research in engineering and applied science in support ol these efforts; builds and maintains competence in the necessary disciplines required to carry out this research and technical service; develops engineering data and measurement capabilities; provides engineering measurement traceability services; develops test methods and proposes engineering standards and code changes; develops and proposes new engineering practices; and develops and improves mechanisms to transfer results of its research to the ultimate user. The Laboratory consists of the following centers: Applied Mathematics — Electronics and Electrical Engineering 2 — Mechanical Engineering and Process Technology 2 — Building Technology — Fire Research — Consumer Product Technology — Field Methods. THE INSTITUTE FOR COMPUTER SCIENCES AND TECHNOLOGY conducts research and provides scientific and technical services to aid Federal agencies in the selection, acquisition, application, and use of computer technology to improve effectiveness and economy in Government operations in accordance with Public Law 89-306 (40 U.S.C. 759), relevant Executive Orders, and other directives; carries out this mission by managing the Federal Information Processing Standards Program, developing Federal ADP standards guidelines, and managing Federal participation in ADP voluntary standardization activities, provides scientific and technological advisory services and assistance to Federal agencies, and provides the technical foundation for computer-related policies of the Federal Government. The Institute consists of the following centers: Programming Science and Technology — Computer Systems Engineering. 1 Headquarters and Laboratories ai Gaithersburg, M D, unless otherwise noted; mailing address Washington, DC 20234. 'Some divisions within the center are located at Boulder, CO 80303. COMPUTER SCIENCE & TECHNOLOGY: Computer Performance Evaluation Users Group (CPEUG) Proceedings of the Sixteenth Meeting Held at Orlando, Florida October 20-23, 1980 Editor: Dr. Harold Joseph Highland Conference Host: Navy Data Automation Facility Naval Training Center Orlando, FL 32813 Sponsored by Institute for Computer Sciences and Technology National Bureau of Standards Washington, DC 20234 J *** T ° F % Q %. % 1* a J ^fAU OV U.S. DEPARTMENT OF COMMERCE, Philip M. Klutznick, Secretary Luther H. Hodges, Jr., Deputy Secretary Jordan J. Baruch, Assistant Secretary for Productivity, Technology and Innovation NATIONAL BUREAU OF STANDARDS, Ernest Ambler, Director o Q. 01 Issued October 1980 Reports on Computer Science and Technology The National Bureau of Standards has a special responsibility within the Federal Government for computer science and technology activities. The programs of the NBS Institute for Computer Sciences and Technology are designed to provide ADP standards, guidelines, and technical advisory services to improve the effectiveness of computer utilization in the Federal sector, and to perform appropriate research and development efforts as foundation for such activities and programs. This publication series will report these NBS efforts to the Federal computer community as well as to interested specialists in the academic and private sectors. Those wishing to receive notices of publications in the series should complete and return the form at the end of this publication. National Bureau of Standards Special Publication 500-65 Nat. Bur. Stand. (U.S.), Spec. Publ. 500-65, 3 16 pages (Oct. 1980) CODEN: XNBSAV Library of Congress Catalog Card Number: 80-600155 U.S. GOVERNMENT PRINTING OFFICE WASHINGTON: 1980 For sale by the Superintendent of Documents. U.S. Government Printing Office. Washington, D.C. 2040i FOREWORD In view of the theme of CPEUG 80 — CPE Trends in the 80 's — it seems appropriate to reflect upon changes observed and progress made during the 70 's as a guide to what we may expect in the next decade. As the 70' s began, the trend of spectacular growth in the number of large-scale computer systems was ending. The computer community began looking toward more efficient use of existing systems in lieu of the seemingly endless proliferation of large-scale systems that had characterized the late 60' s. Economic recession in the private sector and the impact of the Brooks Act in the Federal community were prime factors contributing to this change. As a result, increased emphasis was placed on (then) new techniques such as hardware and software monitoring that have since become standard computer performance evaluation tools. Benchmarking became a viable method to test different vendors' systems during procurement of new systems. System simulation and analytical modeling found new applications in computer performance prediction. Throughout the 70's, these techniques were refined and enhanced, new concepts were introduced, and new terminology adopted — system tuning, installation management, remote terminal emulation, system sizing, the ADP life cycle, capacity planning all became parts of the computer performance jargon of the 70' s. The need for a forum to promote the exchange of ideas and information in this newly developing discipline was fulfilled with the foundation of CPEUG by the United States Air Force in 1971. Later, sponsorship of CPEUG shifted to the National Bureau of Standards because of its responsibilities to explore the applicability of standards to computer performance. CPEUG quickly became the major annual event for those interested in the efficient use of computers within the Federal establishment. As CPEUG grew, its audience diversified so that in recent years roughly half of CPEUG meeting attendees were employees of the Federal Government. The remainder represented companies, universities, and consulting firms who either had an interest in the Federal ADP usage or who merely wanted to keep abreast of the state-of-the-art of CPE. Throughout the 70 's, CPEUG programs addressed techniques and issues that were in the forefront of CPE activity. Thus, in the early 70's, CPEUG concentrated on topics related primarily to hardware performance. However, as the uses of computers became more sophisticated and CPE matured as a technical discipline, CPEUG programs became more diversified — computer networks, distributed processing, data base management, computer security, and user satisfaction are topics that have been introduced to the CPEUG program within the past two years and are prominent topics for CPEUG 80. We hope that by emphasizing new CPE topics in the CPEUG 80 program, we may stimulate discussion that will help us all try to visualize the state of CPE (and CPEUG) ten years hence. We welcome all of you to CPEUG 80 and hope your experience here will be one that is remembered throughout the 80's. My sincere thanks go to the many people whose efforts have brought you this fine conference, especially those who have dedicated their time, talent, and effort to serve on the CPEUG 80 Committee. Their names appear elsewhere in these proceedings. In addition, I would like to thank Carole Zerr for her continued excellent assistance. Dr. James E. Weatherbee CPEUG Conference Chairman October 1980 Preface Last year's Conference theme, "The Expanding Scope of CPE," focused attention upon the roles of CPE in a rapidly expanding and changing technology. This year's theme, "CPE Trends in the 80 's," builds upon last year's Conference by emphasizing new roles for the CPE practitioner. Several new and interesting techniques in the traditional CPE areas are presented. In addition, some relatively new areas and their relationship to CPE are addressed. Debate on the relevancy of CPE in these new areas will likely mold the future of this Conference in the 80' s. The central issue to be debated is best stated through a question posed by last year's Program Chairman: "How will CPE, traditionally associated with large central computers, change in an era of smaller and cheaper hardware and improved digital communications?" The technical program for this year's Conference emphasizes new technology areas, procurement issues in the 80' s, and new approaches in traditional CPE areas. Six relatively new areas are presented in this year's program: Data Base Management System Performance, Data Communications, Computer Network Performance, Software Improvements, Human Interfaces, and Security, Fraud and Audit. Some papers have been presented in these areas at past Conferences, but this year's program significantly expands the scope of these sessions. The interest shown in these areas will, to a large extent, mold the structure of the future Conferences. CPEUG is continuing to provide a forum for debating the ADP procurement process within the Federal Government. The keynote address, "Toward a More Efficient ADP Procurement Process in the 80's," highlights the objectives of the program's procurement sessions, panels, and tutorials. Past interest in this phase of the program suggests that CPEUG will continue to explore improved ADP acquisition strategies. The traditional sessions of CPEUG are again technically stimulating and stress better approaches in CPE. New measurement and prediction techniques are presented to expand the CPE practitioner's set of available tools. Actual experiences are also presented to show each of us what approaches should — and should not — be used. The CPEUG 80 program has been prepared by many people. The Conference Committee, session chairpersons, authors, tutors, and referees all deserve recognition for their time and patience. The excellent administrative support of Cathy Casey and Terri Schroeder merit special recognition. All of these individuals have contributed to the success of this year's program. James 0. Mulford CPEUG Program Chairman October 1980 Abstract The Proceedings record the papers that were presented at the Sixteenth Meeting of the Computer Performance Evaluation Users Group (CPEUG 80) held October 20-23, 1980, in Orlando, Florida. With the theme "CPE Trends in the 80's," CPEUG 80 focused on new applications that are expected to grow in the 80 's and changes that may occur in traditional areas during the 80's. The program was divided into two parallel sessions and included technical papers on previously unpublished work, case studies, tutorials, and panels. Technical papers are presented in the Proceedings in their entirety. Key words: Benchmarking; capacity planning; computer performance evaluation; computer performance measurement; computer performance prediction; computer system acquisition; CPE in auditing; installation management; on-line system evaluation; queuing models; simulation; workload definition. The material contained herein is the viewpoint of the authors of specific papers. Publication of their papers in this volume does not necessarily constitute an endorsement by the Computer Performance Evaluation Users Group (CPEUG), or the National Bureau of Standards. The material has been published in an effort to disseminate information and to promote the state-of-the-art of computer performance measurement, simulation, and evaluation. vii CPEUG Advisory Board Richard F. Dunlavey National Bureau of Standards Washington t DC Dennis M. Conti National Bureau of Standards Washington, DC Dennis M. Gilbert Federal Computer Performance Evaluation and Simulation Center Washington, DC Harry J. Mason, Jr. U.S. General Accounting Office Washington, DC viii Conference Committee CONFERENCE CHAIRMAN Dr. James E. Weatherbee FEDSIM/MV Washington, DC PROGRAM CHAIRMAN PROGRAM VICE-CHAIRMAN PROCEEDINGS EDITOR PUBLICATION CHAIRMAN ARRANGEMENTS CHAIRMAN James 0. Mulford International Computing Company Dayton, OH Judith G. Abilock Price Waterhouse & Company Washington, DC Dr Harold J. Highland State University of New York Farmingdale, NY Peter J. Calomeris National Bureau of Standards Washington DC Arthur F. Chantker Federal Aviation Adminstration Washington, DC REGISTRATION CHAIRMAN James G. Sprung The MITRE Corporation McLean, VA FINANCE CHAIRMAN Carol B. Wilson Fiscal Associates, Inc. Alexandria VA AWARDS CHAIRMAN Jeffrey M. Mohr Arthur Young & Company Washington, DC PUBLICITY CHAIRMAN LOCAL ARRANGEMENTS CHAIRMAN VENDOR COORDINATOR Theodore F. Gonter U.S. General Accounting Office Washington, DC Hadley G. Nelson Navy Data Automation Facility Orlando, FL Wayne D. Bennett National Bureau of Standards Washington, DC xx Referees Major Robert Feingold USAF Phase IV PMO/PGY Gunter AFS AL Howard S. White Lawrence Berkeley Laboratory Berkeley CA Larry Ritchard NCR Corp. Dayton, OH Dennis M. Conti National Bureau of Standards Washington, DC Duane Ball FEDSIM/MV Washington, DC Mickey Sutton AFDSDC/DM Gunter AFS, AL Captain Brett Berlin FEDSIM/NA Washington, DC Larry Davis McDonnell Douglas Automation Co, St. Louis, MO Bruce Herbert U.S. General Accounting Office Washington, DC Captain Steve Cristiani USAF/ASD/ENA WPAFB, OH Joel Molyneaux Engineered Systems Omaha, NE Todd Ramsey IBM Corp Bethesda, MD Arnold Johnson GSA/ADTS/CFT Falls Church, VA Peter Petrusch USAF HQ AFLC/ACT WPAFB, OH Charles Davidson U.S. General Accounting Office Washington, DC William Buckles General Research Corp, Huntsville, AL Wayne D. Bennett National Bureau of Standards Washington, DC James 0. Mulford International Computing Co, Dayton, OH Richard F. Dunlavey National Bureau of Standards Washington , DC Major David Schafer USAF HQ SAC/AD Offutt AFB, NE Michael Morris Independent Consultant Washington, DC Geoffrey Goodman International Computing Co, Dayton, OH Byron Griffith AFDSDC/DM Gunter AFS, AL Lt. Linda S. Mehalko USMC SERV. CO. H&S DN ASC Camp Pendleton, CA Lt. Ken Herbert USAF Phase IV PMO/PGY Gunter AFS, AL Karen Gordon University of Maryland College Park, MD Table of Contents FOREWORD iii PREFACE v ABSTRACT vii CPEUG ADVISORY BOARD viii CONFERENCE COMMITTEE ix CPEUG 80 REFEREES x DBMS PERFORMANCE ON THE SIMULATION MODELING OF NETWORK DBMS Jan Aitken / Harry T. Hsu FEDSIM/MV 3 AN APPROACH TO BENCHMARKING DBMS Barbara Anderson Satellite Business Systems 11 SECURITY, FRAUD, & AUDIT EDP AUDITING IN THE 1980'S OR THE VANISHING PAPER TRAIL Richard E. Anderson Performance Systems, Inc . 23 TRACKING POTENTIAL SECURITY VIOLATIONS Robert L. Lehmann Union Carbide Corporation 33 xi SOFTWARE IMPROVEMENTS MEASURING PROGRAMMING PRODUCTIVITY Peter F. Zoll Octopus Enterprises 49 COMPARATIVE PERFORMANCE OF COBOL VS. PL/I PROGRAMS Dr. Paul J. Jalics Cleveland State University 53 DATA COMMUNICATIONS NBS NETWORK MEASUREMENT METHODOLOGY APPLIED TO SYNCHRONOUS COMMUNICATIONS Dr. Marshall D. Abrams National Bureau of Standards Dorothy C. Neiman Commtex, Inc 63 INTRODUCTION TO DATA COMMUNICATIONS SYSTEM PERFORMANCE PARAMETERS Dana S. Grubb National Bureau of Standards 71 COMPUTER NETWORK PERFORMANCE USER-ORIENTED CARRIER SENSE MULTIPLE ACCESS BUS SIMULATOR Marjan Krajewski The MITRE Corporation 79 A COMPARATIVE EVALUATION OF LOCAL AREA COMMUNICATION TECHNOLOGY Ronald L. Lars en NASA Goddard Space Flight Center Jonathan R. Agre Ashok K. Agrawala University of Maryland 87 PERFORMANCE PREDICTION TECHNIQUES - I SOME PROPERTIES OF A SIMPLE DETERMINISTIC QUEUING MODEL Rollins Turner Digital Equipment Corporation 101 A HIGHLY PARAMETERIZED TOOL FOR STUDYING PERFORMANCE OF COMPUTER SYSTEMS Dr. Herman Hughes Michigan State University Ill xii OPTIMAL SELECTION OF CPU SPEED, DEVICE CAPACITIES, AND ALLOCATION OF FILES WITH VARIABLE RECORD SIZE Kishor S. Trivedi / R. A. Wagner Duke University 129 PERFORMANCE PREDICTION TECHNIQUES - II MVS PERFORMANCE PREDICTION USING MECHANICALLY - GENERATED QUEUING MODELS B. A. Ketchledge AT & T Corporation R. J. Feil New York Telephone 139 AN I/O SYSTEM MODEL FOR 303X PROCESSORS Sushil Bhatia / Phillip Carroll IBM Corporation 157 CONFIGURATION AND CAPACITY PLANNING IN A DISTRIBUTED PROCESSING SYSTEM Dr. Kenneth C. Sevcik / G. S. Graham / J. Zahorjan University of Toronto 165 CAPACITY PLANNING FILE ALLOCATION METHODOLOGY FOR PERFORMANCE ENHANCEMENT Sujit R. Kumar Digital Equipment Corporation Robin B. Lake Case Western Reserve University C. Tom Nute General Dynamics Corporation 175 CAPACITY ANALYSIS OF SHARED DASD CONTROL UNITS Floyd L. Pedriana County of Los Angeles 189 A NOTE ON COMPUTER SYSTEM CAPACITY PLANNING THROUGH MATERIAL REQUIREMENTS PLANNING Kasumu Salawu Bell Laboratories 199 STATISTICAL METHODS ADAPTIVE LOAD CONTROL IN BATCH-INTERACTIVE COMPUTER SYSTEMS Samuel T. Chanson / Prem S. Sinha University of British Columbia 207 xm SENSITIVITY ANALYSIS AND FORECASTING FOR LARGE SCALE IBM COMPUTER SYSTEMS: A METHODOLOGICAL APPROACH AND CASE STUDY Carl Steidtmann Mountain Bell 215 MEASURING SYSTEM PERFORMANCE A PERFORMANCE EVALUATION STUDY OF UNIX Luis F. Cabrera University of California 233 I/O PERFORMANCE MEASUREMENT ON CRAY- 1 AND CDC 7600 COMPUTERS Ingrid Y. Bucher Ann H. Hayes Los Alamos Scientific Laboratory 245 FORECASTING COMPUTER PROCESSING REQUIREMENTS: A CASE STUDY Ronald D. Tomberlin Environmental Protection Agency 255 HUMAN INTERFACES DATA PROCESSING USER SERVICE MANAGEMENT Peter S. Eisenhut IBM Corporation 265 INSTALLATION MANAGEMENT PERFORMANCE EVALUATION OF COMPUTER OPERATION PROCEDURES Patrick Drayton Southwestern Bell Telephone Company 279 EVALUATING TOTAL COMPUTER RESOURCES FOR TOP MANAGEMENT Richard L. Fidler U. S. Department of Commerce 289 THE AIR FORCE BASE LEVEL COMPUTER PERFORMANCE MANAGEMENT PROGRAM John Graham, Jr. Air Force Data Systems Design Center 295 xiv PROTOTYPING/BENCHMARKING RTE'S - THE PAST IS PROLOGUE Mitchell G. Spiegel International Computing Company 303 APPLICATION PROTOTYPING: A CASE STUDY C. Wesley Jenkins Congressional Budget Office 311 OPERATING SYSTEM PERFORMANCE METERS PERFORMANCE EVOLUTION IN A LARGE SCALE SYSTEM Richard Brice J. Wayne Anderson Los Alamos Scientific Laboratory 319 xv DBMS Performance On the Simulation Modeling of Network DBMS Jan A. Aitken Harry T. Hsu Directorate of Design Assessment Federal Computer Performance Evaluation and Simulation Center Washington, DC 20330 This paper describes an approach to the simulation modeling of Data Base Management Systems (DBMS) - specifically, those DBMS which are based on the CODASYL DBMS concepts and specifications [1-4] . The modeling approach provides the system designer (or data base administrator) with a powerful tool for the prediction and evaluation of DBMS performance. Modeling language statements corresponding directly to actual data description language (DDL) and data manipula- tion language (DML) statements allow the straightforward definition of data base structures (schemas) and representation of application program behavior. A pre-defined model structure which includes repre- sentations of DBMS control functions (e.g., data base access control, space management, buffer management) and DML operations, facilitates DBMS modeling. The use of a simulation-time "replica" of a modeled data base enables detailed modeling of data base navigation and accessing. The DBMS model supports the investigation of many factors affecting DBMS performance, including: workload composition, logical data base structure, physical data base organization, data base size, data base page size, number of buffers and buffer management strategies, data base loading factors, and secondary storage device characteristics. DBMS performance prediction and evaluation is supported by the automatic collection and reporting of a variety of performance statis- tics pertaining to (1) data base area queuing, utilization, and accessing, (2) run-unit execution time and data base I/O, (3) buffer utilization and queuing, and (4) run-unit DML request queuing. Key words: Data base management systems; simulation modeling; per- formance evaluation; data base; schema; data description language; data manipulation language; CODASYL. 1. Introduction in contemporary computer systems. A DBMS is typically a complex software system with The use of Data Base Management Systems substantial computer system resource require- (DBMS) is becoming increasingly widespread ments. DBMS performance can be a significant factor in overall computer system performance. figures in brackets indicate the litera- Consequently, the prediction and analysis of ture references at the end of this paper. DBMS Performance merits a prominent place within the field of Computer Performance Evaluation (CPE) . The prediction and analysis of DBMS performance is a complex issue since DBMS performance can be influenced by many diverse factors related to logical data base design, physical data base design, secondary storage device characteristics, workload composition and characteristics, and DBMS operation/ tuning parameter settings. There exists a definite need for tools and techniques for DBMS performance evaluation. The CODASYL data model [1-4] represents a popular approach to data base management . Many commercially available data base manage- ment systems are based on the CODASYL speci- fications . In this paper, we present an approach to DBMS simulation modeling for the perform- ance prediction and evaluation of CODASYL- type DBMS. The DBMS model provides the system designer (or data base administrator) with a powerful tool for DBMS performance evaluation and performance prediction. It can be used to: (1) Predict the performance of a DBMS for a given data base design and DBMS workload Section 2 describes the salient features of the DBMS modeling approach. Section 3 describes the DBMS and data base performance- related outputs produced by the model. 2. Description of the Modeling Approach This section presents the salient features of the modeling approach. A detailed description of the DBMS model is provided in a separate document [18] . 2.1 Overview The structure of the DBMS model closely resembles that of an actual DBMS. A simulated Data Base Control System (DBCS) handles the user interface and manages access to the data base. A simulated data base replicates the content and structural aspects of a real data base. A Data Description Language (DDL), similar to the CODASYL DDL, is used to describe a data base. A set of Data Manipulation Language (DML) commands, encompassing all CODASYL DML operations, is used to charac- terize application program data base activity. By using the DML functions provided, simulated application programs can store and retrieve data base records, as well as navigate through the simulated data base. 2.2 Data Base (2) Investigate DBMS performance rela- tive to changing workload, and perform DBMS stress-test studies (3) Evaluate alternative logical and physical data base designs to achieve a reasonable performance-efficient data base design (4) Support DBMS performance tuning and investigate DBMS performance problems. The modeling approach results in a DBMS model which closely simulates the operation of a real DBMS. The fundamental concepts of the model design are essentially the same as those of the CODASYL approach to data base management. Throughout this paper, it is assumed that the reader is already familiar with those concepts. For information on the CODASYL specifications and DBMS based on those specifications, the reader is referred to references 1—11. The DBMS model has been implemented using the Extendable Computer System Simula- tor II (ECSS II) [17]. ECSS II, a special- purpose language for constructing discrete- event simulation models of computer systems, is a super-set of SIM"CRIPT II. 5 [15, 16]. The simulated data base is a replica of the data base being modeled. It is virtually identical to the real data base in both content and structure, except for the fact that it contains no real data. An area (or realm) is a subdivision of a logical data base. For each area in the real data base, there is a corresponding area in the simulated data base. An area in the simulated data base has the same number of pages as the real area it represents. Thus, for each page in the real data base being modeled, there is a corresponding page in the simulated data base. A page in the simulated data base is used to store simulated record occurrences. The structure of a simulated data base page resembles that of a real data base page . Record types are mapped to areas. All occurrences of a record type can be stored in only one area, however more than one record type can be mapped to the same area. Record occurrence placement within an area is deter- mined by record type location mode. The location mode specifies how record type occurrences are to be stored within an area, as well as how they are to be retrieved. A stored record occurrence in an actual data base consists of two parts: PREFIX and DATA. The PREFIX contains all the pointers in the record, representing the relationships between the record and other stored records. The DATA is the useful data which is supplied by the user. For records in the simulated data base, only the PREFIX is stored. Each stored record contains the same number of pointers as the real data base record it represents. The data portion is not stored, and is represented only by an attribute called "data length". All elements of a set occurrence (in- cluding owner and members) are linked to- gether using pointers. The same number of pointers for each set occurrence are stored in the simulated data base as in the actual. For example, if a set member record has a "prior" pointer, then the corresponding simulated member record will also have a "prior" pointer. The size of a simulated page should in general be much smaller than that of a real data base page because the simulated page contains no real data, only pointers. If the ratio of data length to prefix length is very large, then the ratio of real page size to simulated page size will also be very large. The simulated page size is chosen so that a simulated page can accomodate the same number of records as a real data base page. The physical data base consists of one or more files. A file is an addressable segment of secondary storage known to the computer operating system. Each data base area is mapped to a file. A simulated data base can contain any number of areas and files. A file in the simulated data base can contain a single area, a portion of an area, or several areas . For each page in the simulated data base, there is a corresponding block in a file. (3) Determine the physical storage device address (device, cylinder) of the block (4) Simulate a seek operation to posi- tion the device read/write mechanism at the required cylinder (5) Simulate a data transfer operation to read/write the block. A feature of the data base modeling capabilities is the automatic generation in zero simulated time of a simulated data base. Using this feature, a simulated data base is populated with records according to the information supplied in the model's data base schema (i.e., the identification of record types and, for each record type, a count of the number of record occurrences to be initially stored). The resulting data base will have the specified number of occurrences of each record type. Furthermore, set rela- tionships will be established for the stored records. Since a simulated data base is generated in zero simulated time, this is the most efficient way to produce a simulated data base. This can be used to advantage since, generally, the data base being studied will not be initially empty. In addition, once a data base is generated in this fashion, it can be saved, and copies of it used in future simulation runs, thus avoiding the cost of re-creating the same simulated data base . 2.3 Data Description Language The model provides a DDL to describe a complete data base schema. No sub-schema description facilities are currently provided. A schema definition in the model is almost identical to a real schema definition. It provides for the description of files, areas, record types, and set types. The principal differences are: Each simulated file is mapped to a simulated direct— access secondary storage device. A secondary storage device is char- acterized in terms of its capacity, transfer rate, and access time. The simulation of a data base page access involves the following steps: (1) Determine the data base area which contains the desired page (2) Determine the file and the relative block within the file containing the desired page a. In a record type description, indi- vidual data items are not described. Data items are treated as a unit and represented for simulation modeling purposes only by a total data length. b. In the location mode specification for a record type, the DUPLICATES NOT ALLOWED option, specifying whether duplicates are permitted for a defined key, is not supported. It is assumed that duplicates are always allowed. c. In a set description, the ORDER IS SORTED option is not supported. Figure 1 presents an example of the model's DDL. DATABASE MUSIC. DP PAGESIZE = 10S4 BYTES LOADING FACTOR = 0.5 DBFILE MUSIC. FILE AREA MUSIC. AREA ID = 10 PAGES = 100 DBFILE = MUSIC. FILE DATABASE. ADDRESS = 1 RECORDTYPE PERIOD ID = 801 DATASIZE = 33 AREA = MUSIC. AREA LOCATIONMODE = CALL RECORDTYPE COMPOSER ID = 803 DATASIZE = 4 AREA = MUSIC. AREA LOCATIONMDDE = CALC JET STYLE ID = 701 OWNER. RECORDTYPE = PERIOD ORDER = NEXT NEXT. DBKEY. POSITION = 1 PRIOR. DBKEY. POSIT ION = £ SET. MEMBER RECORDTYPE = COMPOSER SET = STYLE NEXT. DBKEY. POSITION = 1 PRIOR. DBKEY. POSITION = £ OWNER. DBKEY. POSIT ION = 3 MEMBERSHIP = MANDATORY. AUTOMAT I C Figure 1. Example of Model's DDL (c) owner, first, last, next, prior of set (d) first, last, next, prior of area (5) MODIFY. RECORD (6) DELETE. RECORD (7) INSERT. RECORD (8) REMOVE. RECORD The approach to simulating each of the above DML operations is described below. The CODASYL specifications use the concept of "currency" to define implied operands for most DML commands. Thus, each run-unit has a current record of the run-unit and a current record of each set type, area, and record type accessed by the run-unit. A real CODASYL DBMS automatically updates appropriate currency indicators whenever a record is accessed. The DBMS model also automatically maintains currency indicators in a similar manner. 2.4.1 OPEN. AREA The OPEN. AREA DML initializes an area for access by a run-unit in one of six usage modes: retrieval, update, protected retrieval, protected update, exclusive retrieval, and exclusive update. The simulated DBCS ' Area Manager (reference Section 2.5.1) is invoked to process the OPEN. AREA request. Depending on the current mode of usage of the area and the requested usage mode, the OPEN. AREA request will either be granted or queued. If the request is queued, further execution of the run-unit is not permitted until such time as the request is granted. 2.4.2 CLOSE. AREAS The CLOSE. AREAS DML terminates a run- unit's access to all areas it has opened. The Area Manager is invoked to process any queued area usage requests which may be granted due to the release of the run-unit's areas. 2.4 Data Manipulation Language 2.4.3 STORE. RECORD The modeling approach provides a set of Data Manipulation Language (DML) operations for use in characterizing run-unit data base accessing. The DML operations provided encompass most of the CODASYL-specif ied DML functions, and include the following: (1) OPEN. AREA (2) CLOSE. AREAS (3) STORE. RECORD for location modes DIRECT, CALC, and via a set (VIASET) (4) FIND. RECORD (a) direct (b) CALC Record type occurrences are stored in the simulated data base according to the location mode (DIRECT, CALC, or VIASET) specified in the schema definition for the record type . If the location mode is DIRECT, then the run-unit must supply a data base key when issuing the STORE. RECORD command. The data base key specifies the data base page on which the record is to be stored. The simu- lated DBCS will try to store the record in the specified page. If insufficient space is available in the specified page, the simu- lated DBCS will store the record in a nearby page determined by the DBCS's Space Manager. After a record is stored in this mode, its correct data base key is returned to the run-unit. For a location mode of CALC, the real DBCS uses the hashing value of a record's data item or data items to determine the data base page on which to store the record. In the DBMS model, since no actual data is stored, a random number is generated for the address of the page on which to store the record. If the distribution of the hashed data item values is close to random, the two approaches should have roughly identical distributions of record occurrences over a specified page range. If there is insuffi- cient space in the target page for the record, then the record is stored in an overflow page. The simulated DBCS' Space Manager (reference Section 2.5.3) is invoked to determine the overflow page. Record storage statistics are automatically collected for CALC records. If a CALC record is stored in its target page, it is considered as a "CALC- fit"; otherwise, it is considered to be a "CALC-overflow". After a CALC record is stored in a page, it is inserted into the CALC set for the page. In the real data base, the position of the new record within the CALC set is determined by the value of the CALC key data item, as well as by the specification in the schema as to whether duplicates are first or last. In the simu- lation model, a random number is generated to determine where within the CALC set the new record should be inserted. If the location mode is VIASET, a record is stored by the simulated DBCS in the same way as by the real DBCS. The simulated DBCS uses the current of the specified set to determine where to store the new record. If the current of the set is a member of the set, then the new record is stored into the page containing the current of set (or as close as possible). If the current of the set is the owner of the set, the new record is stored into the same page as the owner (or as close as possible) if the new record is to be stored in the same area as the owner. If the new record is to be stored in a different area than that of the owner, the new record is stored at a distance into its area propor- tional to the distance of the owner record into its area. Record storage statistics are automatically collected for VIASET records. If a VIASET record is stored in its target page, it is considered as a "VIASET-f it" ; otherwise it is considered to be a "VIASET- overf low". 2.4.4 FIND. RECORD The location mode of a record type, in addition to determining how a record occur- rence is to be stored, also determines how a record may be found. Most forms of the FIND DML described in the CODASYL specifications are supported in the model. If a record is stored using location mode DIRECT, then it can be found (directly) only by using the direct access form of the FIND. RECORD DML command, in which the record's data base key, specifying on which page the record is stored, is supplied. If a record is stored using a location mode of CALC, then it can be found using the CALC key access form of the FIND. RECORD DML command. A random number is generated for the record's page number (analogous to hashing a record's CALC key to yield a page number) . The set of random numbers so gen- erated is constrained to correspond to page numbers on which occurrences of the record type have already been stored. If there is more than one occurrence of the specified record type in a CALC set, a random number is used to decide which occurrence should be selected and thus how much of the CALC set need be searched. If the location mode of a record type is VIASET, then occurrences of the record type can only be found through set navigation. Since set relationships are maintained among records in the simulated data base just as they are among records in a real data base, set navigation can be simulated just as it would be done in the real data base. The set access forms of the FIND. RECORD DML command (find owner, find first, find last, find next, find prior) are used to perform the set navigation. The simulated data base structure supports the simulation of area navigation. The simulated data base and the real CODASYL data base are essentially identical in struc- ture. They have the same number of areas, and each area in the simulated data base has the same number of pages as the corresponding area in the real data base. The area access forms of the FIND. RECORD DML commands (find first, find last, find next, find prior) are used to perform area navigation. 2.4.5 MODIFY. RECORD A record in the simulated data base is modified using the MODIFY. RECORD DML command. Two types of record modification are supported; one involves modifying the CALC key data item, the other involves modifying other data items. If the CALC key data item is not modified, then the buffer page containing the record is marked as "modified". The simu- lated DBCS ' Buffer Manager (reference Section 2.5.2) will, just like the real DBCS, cause the modified page to be re-written to sec- ondary storage when the buffer is to be reused. If the CALC key data item is modi- fied, then, in addition to marking the buffer page containing the record as "modified", the simulated DBCS also simulates the reposi- tioning of the record within the CALC set. 2.4.6 DELETE. RECORD A record occurrence is logically deleted from the simulated data base using the DELETE. RECORD DML command. The deletion of a record might result in the logical deletion of many record occurrences in the data base. Since a simulated data base replicates the structural aspects of a real data base, the simulated DBCS performs the DELETE. RECORD DML operation in the same way as the real DBCS. 2.4.7 INSERT. RECORD A record occurrence is inserted as a member of a set using the INSERT. RECORD DML command . 2.4.8 REMOVE . RECORD A record occurrence is removed from a set using the REMOVE . RECORD DML command. 2.5 DBCS Operations The Data Base Control System (DBCS) is the set of DBMS functions for the management of data base access. The DBCS functions having a significant impact on DBMS perform- ance are assumed to be: area management, buffer management, space management, data base access, journaling, and data dictionary access. The modeling approach provides for the representation of each identified DBCS function. 2.5.1 Area Management The DBCS Area Manager is responsible for coordinating the use of the areas of a data base. An area may be used concurrently by multiple run-units, but certain usage modes restrict or even preclude concurrent usage. The DBCS Area Manager processes a run-unit's area usage request, and, depending on the current usage of the requested area, either grants the request or causes the delay of further execution of the run-unit until the requested usage can be permitted. Area queuing and utilization statistics are auto- matically collected and reported. 2.5.2 Buffer Management The modeled DBCS includes a simulated buffer pool and an associated buffer manage- ment strategy. The number of buffers in the buffer pool is a DBMS model parameter. Each buffer can contain one data base page. The DBCS Buffer Manager manages the buffers in the buffer pool using a Least Recently Used (LRU) technique. If all buffers are full and the DBCS needs to read a new page, the buffer with the page least recently accessed is reused. If the buffer to be reused currently contains a modified page, that modified page is re-written to the simulated data base and a page I/O is simulated prior to reusing the buffer. An alternative buffer management strategy may be used simply by replacing the model's Buffer Manager module. Buffer queuing and page I/O statistics are auto- matically collected and reported. 2.5.3 Space Management Data base space management , in particular overflow handling, is simulated by the DBCS Space Manager. When the simulated DBCS attempts to store a record occurrence in a page and finds that there is insufficient space available in that page, the Space Manager is invoked to identify an overflow page to be used to store the record. The default space management technique provided in the model uses special space inventory pages distributed at intervals throughout the data base to keep track of data base free space. An alternative space management technique can be modeled by simply replacing the model's Space Manager module. 2.5.4 Journaling Journaling involves creating a journal file, i.e., a file containing before-images and after-images of all data base pages which have been modified. The simulation of journaling activity is supported by the modeled DBCS. Journaling is enabled via the specification of a DBMS model parameter. If journaling is to be simulated, a simulated file is designated as the journal file. Each time a data base page is modified, two journal file I/O operations are simulated, one to represent writing a before-modif ication image of the page and one to represent writing an after-modification image of the page. 2.5.5 Data Dictionary Access DBMS operation involves a certain amount of overhead due to Data Dictionary accessing (e.g., to access schema, sub-schema, and Device Media Control Language (DMCL) defini- tions) . A future enhancement of the DBMS model will be to provide for the simulation of DBMS data dictionary accessing. 2.6 Run-Unit Application Programs For each physical file: (a) // reads and writes (b) # seeks and zero-time seeks (c) seek time (d) data transfer time (e) total access time A run-unit application program is repre- sented procedurally. The procedure consists primarily of a sequence of the DML statements describe above. Since the DML operations available in the model correspond in a straight-forward, almost one-to-one, fashion with those of the CODASYL specifications, the modeling of an application program's data base activity can be relatively easily accomp- lished . The use of the model's DML statements is demonstrated in the example application program segment shown in Figure 2. □PEN HRER "1 j : i c hPEh U AGE = EXCLUSIVE. UPDATE FOR = 1 TO N. Da 3TDPE .PEC □PD PE lOPDT' ■PE = PERIOD LOOP FOR [ = 1 TO M. DO FIND. RECORD JEC 3RDTYPE = PERIOD 2 raPE .PEL □ PD PE :ordt •PE = = COMPOSER LOOP FIND .PEC DPD PEC DPDTYPE = PERIOD F I ND .PEC □ PD SET = STYLE REI □PD = FIP'T FIND .PEL □ PD set = STYLE REI .□PD = NEXT F I ND .PEC opd :et = STYLE REI :OPD = OWNER clo: e.hpefi: Figure 2. Example of Model's DML 3. DBMS Performance-Related Model Outputs The DBMS model automatically collects and reports a variety of DBMS and data base related performance statistics, which are summarized below. For each area of a modeled data base: (a) queuing and utilization (b) # pages read and written (c) page access time For each simulated run-unit: (a) total execution time (b) // pages requested (c) // pages read and written (d) CALC records stored (CALC-fit, CALC-overf low) (e) VIASET records stored (VIASET-fit, VIASET-overflow) For the DBMS Buffers: (a) queuing and utilization (b) # pages requested (c) # pages read (d) # modified pages written For the DBMS: (a) run-unit DML request queuing (b) run-unit concurrency level 4. Concluding Remarks The DBMS model has been found to be easy to use. Since the DDL and DML functions available in the model correspond in a straight-forward, almost one-to-one, fashion with those of the CODASYL specifications, the modeling of a particular data base and asso- ciated application programs can be easily accomplished. Furthermore, since the model includes pre-defined representations of the basic DBMS operations (e.g., data base space management, area allocation management, buffer management, etc.), no modeler effort is required in these areas as long as the default algorithms are accurate for the particular CODASYL-based DBMS being modeled. Validation of the DBMS model is in progress. The results of the validation will be reported in a future document. References [1] CODASYL Data Base Task Group, April 1971 Report, ACM, New York, NY. [2] CODASYL Data Description Language Com- mittee, Data Description Language, Journal of Development, Document C 13.6/2:13, U.S. Government Printing Office, Washington, DC. [3] CODASYL Programming Language Committee, CODASYL COBOL Journal of Development, Department of Supply and Services, Government of Canada, Technical Service Branch, Ottawa, Canada. [4] CODASYL Data Base Language Task Group, CODASYL COBOL Data Base Facility Pro- posal, 1973, Department of Supply and Technical Services, Government of Canada, Ottawa, Canada. [5] Tsichritzie, D. C. and Lochovsky, F. H. , Data Base Management Systems, Academic Press, New York, NY, 1977. [6] Ullman, J. D., Principles of Data Base Systems, Computer Science Press, Potomac, MD, 1980. [7] Date, C. J., An Introduction to Data Base Systems, Addison-Wesley, Reading, MA, 1975. [8] Taylor, R. W. and Frank, R. L., CODASYL Data Base Systems, ACM Computing Surveys, Vol. 8, No. 1, March 1976, pp. 67-103. [9] Cullinane Corp., IDMS Concepts and Facilities, IDMS Data Base Design and Definition Guide, IDMS Programmer's Reference Guide, IDMS Utilities, Wellesley, MA, 1979. [10] Digital Equipment Corp., DBMS-11 Data Base Administrator's Guide, DBMS-11 COBOL Data Manipulation Language Ref- erence Manual, Maynard, MA, 1977. [11] Digital Equipment Corp., DBMS-10 Data Base Administrator's Guide, DBMS-10 Programmer's Manual - COBOL, Maynard, MA. [12] Bachman, C. W. , "Data Structure Diagrams", Data Base 1, 2, Summer 1969. [13] Bachman, C. W. , "The Evolution of Storage Structures", CACM 15, No. 7, July 1972, pp. 628-634. [14] Bachman, C. W. , "The Programmer as Navigator", CACM 16, No. 11, November 1973, pp. 663-6581. [15] Kiviat, P. J., Villaneuva, R. , and Markowitz, H. M. , SIMSCRIPT II. 5 Pro- gramming Language, CACI, Inc., Los Angeles, CA, 1976. [16] CACI, SIMSCRIPT II. 5 Reference Handbook, CACI, Inc., Los Angeles, CA, 1976. [17] Kosy, D. W. , The ECSS II Language for Simulating Computer Systems, Rand Report R-1895-GSA, December 1975. [18] Aitken, J. A. and Hsu, H. T. , "An ECSS- Based CODASYL DBMS Simulation Model", (In Preparation). 10 An Approach to Benchmarking DBMS BARBARA N. ANDERSON* Satellite Business Systems 8003 Westpark Drive McLean, Virginia 22102 This paper presents a data base management system (DBMS) bench- marking methodology developed by FEDSIM to support an acquisition of both a computer mainframe and a DBMS. The DBMS benchmarking metho- dology provides procedures for: 1. defining a logical data base structure, 2. constructing programs to generate files incorporating the logical relationships, and 3. constructing batch and on-line benchmark programs which in- corporate the logical DBMS relationships through the use of vendor-independent interface functions. In order to execute the benchmark programs, the vendor must im- plement the logical data base structure in the proposed physical data base structure, by replacing the interface functions with appropriate DBMS commands. The vendor may implement any physical data base struc- ture, as long as the functional requirements specified by the benchmark programs are met. The benchmark data base and programs were implemented and tested by an agency team of programmer/analysts, with the agency team perfor- ming vendor functions. This test of the DBMS benchmarking methodology is described in this paper. Potential problems and benefits of this approach to benchmarking data base management systems are also dis- cussed. Finally, a comparison of the similarities and differences between the U.S. Department of Agriculture's DBMS benchmark methodol- ogy and the benchmark methodology presented in this paper is presented. * The author developed this benchmark methodology with Mr. Kenneth C. Rieck and Mr. Charles A. Self while employed at the Federal Computer Performance Evaluation and Simulation Center (FEDSIM). Mr. Rieck currently works for FEDSIM. Mr. Self works for the Office of Data Management and Telecommunications, Veterans Administration. 1. Introduction The selection of a Data Base Management System (DBMS) to support corporate data base functions is commonly a "hit or miss" decision based on a review of literature from the vendors and discussions with salesmen. This method of selection may or may not provide the organization with the DBMS that has the capability to provide the required support. 11 This paper discusses an approach to benchmarking Data Base Management Systems (DMBS) which was developed by the Federal Computer Performance Evaluation and Simu- lation Center (FEDSIM) . This DBMS bench- marking methodology was developed to support an acquisition of both a computer mainframe and a DBMS by a Federal agency. The agency currently relies on' a DBMS for a large portion of its information processing. Each computer vendor must not only bid the re- quired hardware, but must also propose a DBMS to meet the agency's data management requirements. The acquisition follows federal procurement regulations for competi- tive procurements, which specify that no vendor is to be given undue advantage over another. The data base portion of the benchmark includes both batch and on-line processing. The proposed DBMS must be able to support concurrent access and update by multiple users. 2. Benchmark Data Base The benchmark documentation provided to the vendor includes a definition of the logical structure of the data base. The logical structure is the basis for the functional description of the data base portion of the benchmark. The vendor may implement any physical data base structure as long as the functional requirements specified by the benchmark programs are met. 2.1 Logical Structure of the Data Base The logical structure of the data base was determined through a workload study of the procuring agency's current data base processing. This workload study consisted of processing accounting and terminal log tapes to determine the frequency of execu- tion of application modules which process against the data base. The most frequently executed on-line and batch data base appli- cation modules were examined as to their access paths through the data base and their mode of access, i.e., read, update, or add. The resulting logical structure is a model of a current agency data base. The model contains all variations of the file rela- tionships found in the application data base. The logical structure consists of eight files, named A through H, with the relation- ships shown in Figure 1. Each file consists of a group of related logical records. Figure 1 . Data Base Logical Structure (Throughout this paper, the terms set, owner, member, and occurrence are defined as follows. A set is defined as a two- level tree of files in which there is one owner file and one or more member files that belong to the owner file. Any file in a particular set may be either an owner or a member file in another set. A set occur- rence is a particular related group of records in a set based on the owner record.) In the agency data base, files A, B, C, and E are key-retrieval type files. A particular record from one of these files can be retrieved directly, based on the provided key value. The other files must be accessed through their relationships with the key-retrieval type files. Files B and E can be accessed either directly, based on a key value, or through their relationships with their owner files. Figure 2 contains the data base character- istics. 12 NUMBER OF NUMBER OF NUMBER OF CHARACTERS PER RECORDS PER RETRIEVAL FILE-NAME DATA FIELDS RECORD FILE TYPE A 6 100 6,100 KEY B 8 100 7,500 KEY & RELATIONSHIP C 18 379 32,000 KEY D 9 37 32,000 RELATIONSHIP E 18 268 95,000 KEY & RELATIONSHIP F 6 85 620,000 RELATIONSHIP G 12 120 400,000 RELATIONSHIP H 12 66 1,200,000 RELATIONSHIP Figure 2. Data Base File : Characteristics All files are generated by programs which automatically incorporate the appro- priate logical relationships. The first 24 characters of each record contain the information necessary to determine file and record relationships. The first field contains the file name to which the record belongs. The second field is the unique key number designating the position of the record in the file. All records contain this field regardless of the type of file access. The "top node files," A and C, contain valid information in the first and second fields only. Records for the rest FILES A AND C ONLY of the files contain information about their specific owner record within a set occurrence. The third field contains the file name of the owner record, while the fourth field is the record key of the particular owner of the set occurrence to which this particular record belongs. File D is the only file that has two owners: the information for the second owner is contained in the fifth and sixth fields. Figure 3 shows the schematic of the 24- character information data fields at the beginning of each record. FILE D ONLY OWNER FILE NAME OWNER KEY OWNER 2 FILE NAME _> 16 OWNER 2 KEY T7 "T4 FILES B,E,F,G, AND H Figure 3. Logical Relationship Information Fields Example As an example of the use of the logical relationship information fields, suppose that a record contains the follow- ing values in the 24 leading characters: 6,995 of file B is the owner record. This record also belongs to a set with owner file C. Record 14,495 of file C is the owner record of the second set occurrence. 2.2 Data Base Generation D0014495B0006995C0014495 This would indicate that the record is record number 14,495 of file D. The record belongs to a set in which file B is the owner file. The record belongs to the set occurrence of that set in which record The size of the DBMS benchmark data base is large enough to fill several disk packs, in order to represent more nearly accurately the actual workload. The data base contains approximately 234 million characters. Instead of providing several reels of magnetic tape to the vendors 13 (along with the attendant problems of read checks, etc.)t the vendors are provided with eight file-generation programs; one for each of the eight files. These programs, which are executed at the vendor's site, generate the contents of the data base, including all necessary information concerning logical relationships among records and files. The data fields of each record contain randomly- generated alphabetic and numeric data. 2.3 Physical Structure of the Data Base The vendor subsequently loads the generated files into the physical data base structure using the proposed DBMS. The logical relationship information fields are data fields, and must be loaded along with the other data fields. The DBMS benchmark programs must be able to manipulate the logical relationship information fields in the same manner as the other data fields. All accesses to the data base are performed on a full record basis. The logical relationship information provided in the generated records of each file is not intended to specify a physical file structure. The vendor may load the data base in any manner, provided the functional requirements specified by the benchmark programs are met and the benchmark can be run within the allowed time frame. 3. DBMS Interface Functions A description of the required data base functions within the benchmark programs, such as "adds," "reads," and "replaces," are provided through a vendor-independent DBMS language. 3.1 Purpose and General Description The DBMS benchmark programs were written in ANSI COBOL and do not contain any architecture-dependent code, except for statements such as SELECT statements, etc. Vendor-independent DBMS functions were developed and placed in the programs to represent DBMS access.. The interface functions were inserted into the code in the form of COBOL comment statements. The vendor must replace the interface functions with DBMS commands that perform the re- quired functions. 3.2 Implementation Access to key-retrieval type files is requested on the basis of a key value, which is provided as a parameter in the interface function. When access to a record based on its relationships with other records and files is required, then the desired relationship path is indicated (owner or member record) and the specific set occurrence is also indicated by using the concept of "the current record of a file." The current record is defined as the last record accessed for that file. A current record must be separately maintained for all files accessed by each particular program. Figure 4 is an example of the provided DBMS interface function for a read which depends on a relationship path. Figure 5 shows the translation of the DBMS interface function into IDS-II commands to perform the specified function. The defi- nition of the interface function and "end- flag" processing requirements is described in the next section. Communication of the data content of records is provided through an area in the benchmark programs called the "user-working- area." The vendor's code must deliver the record content to this area when a record is being retrieved from the data base, and accept data from this area when a record is being sent to the data base. 3.3 Interface Function Definitions Variations of three basic DBMS inter- face functions, DB-READ, DB-REPLACE, and D81200 5800-FILE-F-DB-READ-FORMAT-II. 081300*********************************************************** 081400* 081500* DB-READ RECORD OF FILE (F) WHOSE OWNER FILE IS (E), 0816C0* RETURN END-FLAG (END-FLAG-F) 081700* 081800*********************************************************** 08190 58 90-EXIT-FILE-F-DB-READ-II. 082000 EXIT Figure 4. DBMS Interface Function 14 083600 5800-FILE-F-DB-READ-FORMAT-IJ. 084100 FIND NEXT F WITHIN E-F. 084200 IF DB-STATUS EQUALS STAT-OK 084 300 MOVE ZERO TO END-FLAG-F. 084400 IF DB-STATUS EQUALS STAT-FIND-END 084 500 MOVE 1 TO END-FLAG-F. 084600 GET F. 084700 IF DB-STATUS NOT EQUAL TO STAT-OK 084800 PERFORM 9000-ERROR. 084900 MOVE F TO FILE-F. 085200 5890-EXIT-FILE-F-DB-READ-II. 085300 EXIT. Figure 5. Example Translation DB-ADD, are used. No delete function has been specified. There are three DB-READ formats, one DB-REPLACE format, and two DB- ADD formats specified in the programs, each of which provides a certain data base function. The DBMS interface functions are described in the rest of this section. previously read, then the value of End-Flag being passed back to the program will be set to 1 . DB-READ RECORD FROM OWNER FILE (Owner- File-Name) OF MEMBER FILE (Member-File- Name) DB-READ DB-READ RECORD OF FILE (Key-File-Name) USING KEY (Key-Name) This function retrieves a record from a key-retrieval type file by using its key value and places it in the user-working- area. The Key-File-Name parameter is the name of the file which contains the speci- fied record. The Key-Name parameter is the name of the variable which contains the numeric value of the key of the specified record. DB-READ RECORD OF FILE (Member-File-Name) WHOSE OWNER FILE IS (Owner-File-Name), RETURN END-FLAG (End-Flag-Name) This function retrieves a record from a specific member file based on its rela- tionship with the specified owner file and places it in the user-working-area. The current record of the specified owner file determines the set occurrence. The current record of the owner file has been established previously by another function. The speci- fied record of the member file is either (1) the next member record of the specified set occurrence following the current record of the named file, or (2) if there is no current record of the named member file, then the specified record is the first record of the member file of the specified set occurrence, or if there are no existing member records of the specified set occurrence or the last record in that file that is in the set occurrence has been This function" retrieves a record from the specified owner file based on its relationship with the specified member file and places it in the user-working-area. The current record of the specified member file determines the set occurrence. The current record of the specified member file has been previously established by another function. DP-REPLACE DB-REPLACE RECORD OF FILE (File-Name) This function performs an update function by replacing the current record of the specified file with the contents of the user-working-area. The current record of the specified file has been previously established by another function. DB-ADD DB-ADD RECORD TO FILE (Key-File-Name) USING KEY (Key-Name) This function adds a new record to a top-node file by using its key value. The contents of the user-working-area are added to the data base as a new record in the named file with the key that is specified by the Key-Name variable. DB-ADD RECORD TO FILE (Member-File-Name) WHOSE OWNER FILE IS (Owner-File-Name) This function adds a new record to a non-key-retrieval type file by using its 15 relationship with the specified owner file. The contents of the user-working-area are added to the data base as a new record in the specified member file, in particular as the last member of the set occurrence speci- fied by the current owner file record. The current record of the owner file has been previously established by another function. 4. DBMS Benchmark Programs Both benchmark programs and test programs are provided to allow the vendor to test the proposed implementation of the physical structure and DBMS interface function translation. 4.1 Test Programs The vendor is provided with two test programs. The first program generates eight files, with a total of 607 logical records. This data may be loaded into a test data base accessible through the proposed DBMS. The vendor can test the validity of the proposed physical structure with this test data base. The procedure used to load the test data base can be used to load the benchmark data base with necessary minor changes for physical size. The second test program provided to the vendor tests the translation of the DBMS interface functions into the vendor DBMS commands. The program exercises the DBMS commands, using the test data base. A report is generated which shows the actions completed and indicates any incorrect results encountered during the execution. Possible error messages are: XXX BAD ADD - CAN NOT READ XXX BAD REPLACE - CAN NOT READ XXX BAD READ - CAN NOT READ XXX BAD REPLACE - WRONG RECORD XXX BAD ADD - WRONG RECORD 4.2 Benchmark Programs The benchmark data base is accessed by both batch and on-line DBMS benchmark programs. These programs must have all DBMS interface functions translated into vendor DBMS commands before execution of the pro- grams is possible. The vendor must add any code necessary to handle errors that may occur due to system failure or implementa- tion errors, by causing the executing bench- mark program to abort with an appropriate error message. Any Open or Close functions that may be required by the DBMS must be added by the vendor to the benchmark pro- grams. The vendor's source listings will be reviewed by the Government benchmark team to ensure that no changes to program logic have been made. The specific set occurrences that are accessed by the benchmark programs are determined by an input seed number. The input seed number determines the first record accessed by the program and , through the use of an algorithm, all subsequent record accesses. The input seed numbers . used for test purposes are supplied so that the vendor can verify the output produced by the test runs against the sample output listings provided. New input seed numbers are provided by the Government benchmark team for the timed portion of the benchmark to ensure that the vendor does not optimize certain paths within the data base. There are three DBMS benchmark programs: the batch data base retrieval program, the batch data base adds program, and the on- line data base access program. The access paths of these programs are such that the areas of the data base that are to have records added or replaced by any one pro- gram are never read by other programs. This is to ensure that, regardless of the order of program executions, the results of a series of reads through a set occurrence will not change from run to run either in content (changed due to a replace function) or number (increased due to an add func- tion). The effect of simultaneous users on benchmark results is handled through the use of a separate input seed number for each execution of a program that adds or updates a record. This ensures that each execution is adding or updating a different set occurrence. The number of times each of these benchmark programs is to be executed during the one hour benchmark depends on the year of system life of the proposed system configuration that the vendor is benchmarking. This particular computer acquisition involves an eight year system life span. The number of executions of the benchmark programs increases each year of the system life span by the percentage of expected increase in the batch and on-line workloads to reflect the predicted increase in data base workload over time. 4.2.1 Batch Data Base Retrieval Program The batch data base retrieval program retrieves a selected record from file E of 16 the data base based on the value of the input seed number. By following relation- ship paths from the original E record, data elements are extracted from seven of the eight files of the data base and used to produce an output report. Although the exact record retrieved during any execution of the program will be the same for any given value of the input seed number, a different record will be retrieved if the seed number is changed. Figure 6 shows how this program accesses the data base. The following conventions hold for this and all subsequent figures: (1) The arrow indicates the access path direction. (2) The note beside the arrow line indicates the accessing function. (3) A shaded box indicates a file accessed by the program. (4) The unshaded boxes are files not accessed by the program. (5) The files that have the file name underlined indicate key-retrieval type files. 9//V, 7vV READ OWNER \ \ READ- OWNER c_ \ m READ OWNE R * FNTRY ^F.AIV KEY / \ pr an / \ _ firstD> v?w R s ST v 4 READ ALL 5^2 Ws w, I 4.2.2 Batch Data Base Adds Program The batch data base adds program required a file-generation program that is executed prior to the timed portion of the benchmark. This program generates a tape file of 5,000 transactions required as input to the batch adds program. An input seed number is used to vary the data that is produced. Immediately preceding the timed portion of the benchmark, the Govern- ment benchmark team gives a new seed number to the vendor, so that a new set of transactions is generated. The batch adds program adds the transactions read from the tape to file G, based on a program-calculated key value. This key value is used to determine a set occurrence in the set made up of files E, F, G, and H. A printed report of all the input records and their related file E key value is generated by the program. The number of records to be added is determined by the value of an input card read in at execution time. This effectively imposes the required workload that has been deter- mined prior to the benchmark run. An input seed number determines the data base paths accessed during that particular execution of the program. Figure 7 shows the data base access of this benchmark program. ENTRY o READ p KEY Figure 6. DB Batch Retrieves Program Figure 7. DB Batch Adds Program 17 4.2.3 On-line Data Base Access Program The on-line data base access program is executed in an interactive mode through a Remote Terminal Emulator (RTE) , i.e., the program is activated by an emulated terminal user. The program accepts a record ID from the terminal and uses the ID to control accesses to the data base key-retrieval type files. Several data values are accepted from the terminal and records are subsequent- ly updated, added, and retrieved through the DBMS. The on-line data access program interfaces with the vendor's DBMS by the same DBMS interface functions used by the batch data base programs. The interface for the transfer of information between the terminal and the executing program is through COBOL ACCEPT and DISPLAY commands. These commands may be replaced with the vendor's code if necessary. The vendor's code must transfer "information between the terminal that activated the program, and the program's data buffers in the user-working-area of the program. This terminal interface may be either direct or through a teleprocessing monitor or trans- action processor. There are four portions of the on-line data base access program: (1) key record update, (2) logical relationship update, (3) add, and (4) retrieval. Figures 8, 9, 10 and 11 show the data base access for each portion of the program. ENTRY ENTRY READ OWNER AND UPDATE Figure 9. DB On-line Logical Relationship Update Transaction Figure 8. DB On-line Key Record Update Transaction Figure 10. DB On-line Add Transaction 18 Figure 11. DB On-line Retrieve Transaction A. 3 Testing This approach to DBMS benchmarking was tested by a team of one programmer and one analyst, from the Government agency acquir- ing the DBMS. The team was provided with the data base generation programs, the data base benchmark programs, and the benchmark instructions. The team successfully gener- ated the data base and loaded it into a structure which is supported by their current DBMS package. The team success- fully translated the data base interface functions in the benchmark programs into the required DBMS commands and executed each program to produce the desired results. This test cast the agency team in the role of the offering vendor. The analyst of this team had a good background in IDS- I, but no experience with IDS-II, which was the test DBMS package. The programmer had no prior data base experience. The test was accomplished with a minimum of communi- cation between the agency team and the benchmark developers. The team spent approximately three man-months on this project. Taking into account the experi- ence level of the agency team, it is probable that a vendor would be able to accomplish this in a shorter time frame. A. A Benchmark Test Evaluation Criteria The vendor's test results will be evaluated based on the results obtained by the agency's test. The input seed numbers to be used at the Live Test Demonstrations (LTDs) were input to the benchmark programs during the agency's test. The reports generated during the test and samples of the updated files were saved for benchmark test result comparison purposes. The reports generated during the benchmark tests will be compared for accuracy against those generated during the agency's test. At the conclusion of each benchmark test, the vendor will be required to retrieve records from the data base based on a list of record keys provided by the Government benchmark team. This list identifies the file name and unique key of each requested record and includes records that should have been added, updated, read, or had no access made to it during the benchmark test. The contents of these requested records will be compared against the agency's test results. The* vendor must successfully match the results of the agency's test, as well as not exceed the one hour benchmark test time constraint, to meet the evaluation criteria. 5. Potential Problems/Benefits At this time, the agency trying to procure the computer hardware and the DBMS has not released the Request For Proposal (RFP) ; therefore, there has not yet been any vendor reaction to this DBMS bench- marking approach. The vendor reaction will be crucial in determining whether or not this approach becomes an accepted way to benchmark Data Base Management Systems. This approach is very expensive in terms of the in-house costs to develop the benchmark and the costs to the vendor to prepare for and run the benchmark. This approach is not appropriate for organi- zations acquiring only a DBMS, because the relative cost of a DBMS is low compared to the cost to the vendor to prepare for and run a benchmark that uses this approach. The vendor should be willing to spend more in this area if it is a requirement within a large-scale computer system acquisition, where the cost-benefit ratio becomes more attractive to the vendor. The organization should do a risk analysis study to determine the tradeoffs between the cost to develop and test the benchmark and the cost associated with 19 acquiring a DBMS which does not meet the organization's needs. The organization should currently have a DBMS, so that there is an available DBMS to test the benchmark programs or have access to another machine for test purposes. The risks associated with sending out untested benchmark programs are too high. This approach does not take into account either the cost of conversion to the new DBMS or the ease of use to the program- mer. The capabilities of a DBMS query language or an on-line update language are also not considered. An additional problem associated with benchmarking DBMS' s is that DBMS performance is affected by the rest of the workload mix that it must compete with for computer resources. Also, the DBMS workload may severely impact the performance of unrelated on-line activity. This problem points out the importance of accurately representing all portions of the workload so that the benchmark test will test the ability of the proposed hardware and software to meet the organization's expected workload require- ments. This approach to DBMS benchmarking appears to have some promise for organiza- tions that currently have a DBMS and need to acquire computer system hardware along with a new DBMS. The appropriateness of this approach must be determined by the indivi- dual organization through risk analysis. 6. A Comparison of the FEDSIM and USDA DBMS Benchmark Approaches The FEDSIM benchmark approach is an expansion of the Department of Agriculture (USDA) approach which has been used suc- cessfully in several procurements. The DBMS interface functions of the FEDSIM and the USDA approaches are quite similar in con- cept. The USDA logical data base structure is a hierarchical structure of three files, each at a different level within the hier- archical structure. Thus the USDA logical data base structure has a depth of three files and a width of one file. All file relationships are one to one. In contrast, the FEDSIM logical data base structure represents a CODASYL network structure, which is necessary tp accurately represent the procuring agency's data base processing requirements. The FEDSIM logical data base structure has a depth of five files and a width of three files. There are one to one file relationships as well as one to many and many to one file relationships repre- sented within the logical structure. Another major difference between the FEDSIM and the USDA approaches is the difference in the manner in which the vendors can access the data base file structures. The USDA approach allows for three separate sets of files: one set for query activities, a second set for update activities, and a third set for addition activities. The FEDSIM approach stipulates that all data base processing must occur against one set of files. This requirement ensures that there will be concurrent on- line and batch queries, updates, and addi- tions against the data base. The con- current processing requirement is essential to effectively model the procuring agency's actual processing environment. Reference (1) United States Department of Agri- culture, Automated Data Systems, Kansas City Computer Center, "Live Test Demonstration Manual", 1977. 20 Security, Fraud, and Audit 21 EDP Auditing in the 1980s or The Vanishing Paper Trail Richard E. Andersen Performance Systems Incorporated 30 Courthouse Square Rockville, Maryland 20850 This paper is designed to provide an overview of (1) traditional auditing methodology and (2) the reasons why these methods must be modified and adapted in order to cope with the "vanishing paper trail" inherent in today's (and tomorrow's) increasingly on-line, data base-oriented, distributed processing, EDP environment. The three basic types of audits (general, administrative, and applications) will be discussed. Examples of flagrant, and not-so- flagrant, computer crimes will be presented. Practical and proposed solutions for the reduction of these types of DP crimes in the 1980 's will be examined. The major thrust of the paper, however, will address the impor- tance of the applications-oriented audit. The paper will discuss the concept of Computer Management Technology (CMT), and show that this is in reality a type of on-going auditing task. Alternatives will be examined for measuring, managing and controlling key functional areas that have a definite interactive effect on the availability of re- sources and the performance of system components (or both) in a data processing installation. Key words: Auditing; audit-in-depth; internal controls; security; auditing aids. This paper discusses some of the aspects of data processing that, while not in the realm of the data processing exotic, will, I believe, attain ever increasing impor- tance for us in the years of the coming decade. No level of technology has found itself above the ingenuity of a clever, albeit dubious mind.... not even the computer. One positive, although dubious, distinction in the evolution of man has been his ability to commit criminal acts, no matter how difficult the circumstances. He escapes from escape-proof prisons, he tampers with tamper-proof devices, he burglarizes burgular-proof establishments. These examples merely serve to prove the validity of "Danziger's Law", which holds that as soon as something is invented, someone, somewhere, immediately begins trying to figure out a way to beat the invention. I'm sure that all of us have heard of myriad examples of computer abuse and/or crime in our own experience. But I think that very few, if any, have valid statistics as to just how much of this subversion goes on. From all indica- tions, a lot more goes on than is ever detected. A lot of computer crime that 23 does go on, moreover, is never publicly announced. Most security experts have collections of incidents that they have investigated, but which have never been reported to the authorities. Further- more, some banks and other companies, candidly admit that when an incident is discovered, the corporate victims usually try to avoid the embarrassment and loss of confidence that the publicity might bring. DP security specialists generally tend to agree that about 85% of detected frauds are never brought to the attention - of law enforcement officials. What often happens is that the offender, once de- tected, is required to make restitution and then is sped on his way... of ten times with severance pay, and sometimes with letters of reference. One consequence of this is a circulating population of un- punished, unrepentant, and unrecognized perpetrators going from company to company. Probably a more serious consequence, how- ever, is the suppressing of general re- cognition of the extent of computer crime; and that in turn can lull both makers and users of computers into minimizing it as a threat. Since the basic topic of this paper is EDP auditing in the 1980 's, it might be well to get back to basics and examine some basic terminology: AUDITING Source data, methodology, and report con- clusions and sums are checked for accuracy and validity through the use of 'studied techniques' and information sources. AUDIT IN DEPTH A procedure wherein a detailed examination of all manipulations on a transaction or a piece of information is carried out. AUDIT TRAIL The 'trail' left by a transaction when it is processed... the trail begins with the original documents (the transaction entries, the posting of records) and is complete with the report. Validity tests of records are achieved by this method. An audit trail must be incorporated into every procedure. Provision for it must be made early so that it becomes an integral part. In creating an audit trail, it is necessary to provide (a) transaction documentation that is detailed enough to permit the association of any one record with its original source document, (b) a system of account- ing controls which provides that all transactions have been processed and that accounting records are in balance, and (c) documentation from which any trans- action can be re-created and its pro- cessing continued should that transaction be lost or destroyed at some point in the processing procedure. The sub-title for this presentation is "The Vanishing Paper Trail". Speci- fically, the inference to be drawn here is that we will arrive at a consideration and discussion of on-line accounting systems. To cite an example of computer crime to make a point: Outside of the world of EDP, most of the present concern among laymen about the latent problems of computer security seems to have emerged since the widely publicized Equity Funding Insurance swindle. While this was really more a case of old fashioned fraud than an instance of computer manipulation, the Equity Funding ripoff could hardly have reached the magnitude it did without the computer's adroitness in fooling auditors from four major accounting firms. The case pretty well demonstrated that conventional auditing procedures were all but helpless when confronting intent and knowhow involving deceit of computers. Auditors are losing their paper trail... that plethora of documents, indeliby inscribed orders, invoices, bills and receipts that the men in green eyeshades used to plow through on the trail of auditing irregularities. There appears to be at least one group that has perhaps derived benefit from revelations such as those of the Equity Funding swindle. This is the relatively small group of specialists that appear to be able to write programs to make the computer do the auditing. Auditing through the computer rather than around it; that is, to perform various types of cross checks within the com- puter, and to throw up warnings when certain suspicious transactions occur. This sort of auditing, however, is only as dependable as (like everything else that goes on inside the computer) the computer itself. Unfortunately, com- puters can be programmed to lie or conceal as easily as they can be programmed for truth. 24 Rather than throw up our hands in despair and exclaim, "A lock only keeps as honest man honest!", we should continue to search for viable ways to audit through the computer.. .to continue to modify tradi- tionally ingrained techniques to cope with this vanishing paper trail. A nominal amount of research in the preparation of this paper revealed, among other things, a promotional brochure distributed by a company which is quite famous in our industry for the fine refer- ence material it publishes. Herewith, a few pertinent quotes: "Auditing through the computer-rather than around it-will only become a reality if you fully understand the confusing maze called 'data processing'. EDP AUDITING , the only bimonthly updated reference service, provides you with timely, compre- hensive information on the DP function, plus the latest methods and techniques enabling you to work with the DP department to improve your performance. Designed for internal and external auditors, and DP managers, this in-depth 'how to' reference contains the most current information and approaches to the EDP auditing function on the market. And because it is regularly updated to provide an ongoing collection of reference ma- terial, its benefits far outweigh those of a bound book. Initially with EDP AUDITING , you'll receive a rugged loose leaf binder containing more than 40 information-filled portfolios. Then, every other month your service will be supplemented with 4 or more portfolios, guaranteeing that you have the most timely facts, methods, and advice at your command. Let EDP AUDITING help you close the gap between the EDP function and your auditing tasks. Let it enable you to audit through the computer - plus much, much more . " It is in no way my intent to demean this promotional offering regarding EDP Auditing. Quite the contrary... because in the great deal of research and reading undertaken in the preparation of this paper, it is one of the very few practical approaches and suggestions I encountered to help the EDP Auditor with the tasks that are going to be facing him or her... and the more involved and difficult auditing tasks facing him or her in the coming years of the 1980 's. This approach suggests, perhaps, the linking of two separate specialities as a result of our 'technological progress'. This technological progress is usually guaranteed, for a time at least, to create a demand for specialized skills that will be in very short supply. I believe that such a field will be com- puter auditing, where for all the growth that has occurred over the last few years, there is still a huge gap between what is, and what is needed. I believe the reasons for this are relatively simple. Computers happened too quickly for the accountancy profession, which is accustomed to a more leisurely rate of change. New "marriages", however, tend not to be traumatic for the computer profession. Data processing people are used to seeing their speciality branch out into new fields, but for the account- ancy profession, the arrival of the computer has, I believe, produced a bit of a trauma, and in some cases, a nasty shock. We have all witnessed use of on-line systems and Base Management Systems in applications, both in the 1 sophisticated companies, as applications in the smaller While these systems are wel they still present many new challenges to the computer the increasing real time Data accounting arger and more well as many companies. 1 established, and different auditor. Again, in the old batch systems it was still possible for the auditor to pretend that the computer didn't exist, and audit "around" the computer. He or she simply checked or matched what came out with what went in. Some of these "batch" tools will continue to be used, and with efficiency... the old "hash total" springs to mind. However, we see a move toward concentrating efforts on reviewing data flows within computer systems, in order to establish the degree of reliability that auditors can place on their company's internal controls. These controls are merely a set of procedures and checks designed to ensure that valid transactions, and only valid trans- actions, are processed, accepted, and recorded completely. It is often more economical, and better auditing practice, to use the computer to extract exceptions, reports, and records of all abnormal entries (and samples of transactions and balances). 25 All incorporated companies must have their accounts audited. In order to express their opinion on the accounts (in order to report that they are "true and fair"), the auditors must achieve some reasonable assurance. If the internal controls are sloppy, the auditor has to spend more time checking what lies behind the figures. And the more time he or she spends, the more it costs the clients. Extraction of samples by computer program, even very large samples, needn't be expensive. It is the verification of the data in the sample that costs money, since someone has to go and see that the source document exists, or that the stock did in fact arrive, etc. In reviewing real time systems, we see that they have re- vealed a wide range of additional problems relating to weaknesses of internal con- trols, and weaknesses in the development of interrogation software. An obvious example is inadequately supervised terminals. If anyone, not just authorized people, can get to a terminal, risks of that abuse are increased. Although many installations use passwords, it often happens that the password stays in use for so long that it loses its confidentiality. Even the most elaborate password systems will come crashing to the ground if the operator wanders away from an unattended terminal without signoff. This is often compounded by the fact that terminals tend to be located in large open areas and offices where there is little or no physical security. Again, in reference to on-line systems, accounting data tends to be keyed in without control totals being first generated. This increases the danger of mis-keys and mis-reads going into main files without detection, at least until they turn up as meaningless stock orders, or astonishing paychecks. Guarding against these eventualities means stringent program validation checks on data... more so than in the batch environment. Usually, only those systems that require keyed data to be input by two separate individuals will provide a uniformly high level of input accuracy. Controlling the input and resulting accur- acy of master data files is much more difficult than in the old batch systems because the main files are seldom read through in their sequential entirety. This control can often only be achieved through the use of control totals of movements in all key fields. Fairly frequent sequen- tial passes of the main files will have to be built into the systems, and checks made that the sum of the brought forward control totals plus the movement control totals equals the sum of the individual values of the files. Additionally, sufficient transaction history obviously must be kept, so that if the two sets of totals fail to agree, rapid recovery is possible from a sequential pass. In practice, there is a tendency to reduce the amount of transaction history held on on-line files, especially compared to the amount held on most batch systems. The most important difference introduced by on-line, and Data Base Management sys- tems, from an audit point of view, is that the functions of main data file security are being absorbed by the Data Base software itself. This concentration of information means that the Data Base Administrator is becoming a very impor- tant person. I don't mean to imply that DBA's are natural fraudsmen, but they are in a unique position of having an excel- lent knowledge of a broad range of information in the company, and of the relationship of the pieces of informa- tion... one with another. Since a great deal of audit thinking on control centers on a division of duties, classically, traditionally, historically... so that no one person has too large an area of influence, the auditors' nervousness about Data Base Administrators is under- standable. An old audit maxum is to get as many involved in each process as you can so as to make collusion more diffi- cult. While it is rare that a Data Base Administrator has actual custody of a company's assets, if he is allowed to run software, he has the ideal opportunity of making un-authorized amendments of one sort or another. Therefore, specifically and parti- cularly in view of the considerations that have to be taken into account as a result of on-line systems, the auditor has a key role to play in the design of intended new systems at the earliest possible stage of the planning. This will enable him or her to assure that the proposed system will incorporate suffi- cient audit (and good) basic controls. Early involvement of the audit function in the design of new systems cannot be stressed too strongly, because the internal and external auditor are con- cerned that these control standards and procedures in effect in a DP installation are functional. In addition, the auditor of the 1980 's is going to need to test the reliability of the system in pro- cessing various types of transactions. 26 The auditor's evaluation of the sufficiency of internal control procedures will also affect his or her determination of the extent necessary in order to satisfy him or herself regarding the reliability of those systems. DP has had a major effect upon audit- ing, due primarily to the loss of this visible paper trail (the vanishing paper trail), and this increases the importance of internal controls. ..which in turn underscores the need for early involvement of the audit function. While good system design controls do not in themselves eliminate the need for auditing, they do significantly reduce the need for testing necessary to the audit function. Again, to re- iterate the concept of auditing around the computer, the first reaction to DP by early auditors was to attempt to perform their audits with the printed records and output provided by the system, the theory being that if a sample of the system output was correctly obtained from the system using the system's input, then the processing itself must be reli- able. This was a reasonable approach perhaps ten or fifteen years ago when the knowledge of DP auditors was extremely limited. However, the increasing diffi- culty of applying this approach to the vanishing paper trail, and the development (and continuing need for development) of better auditing methods, have combined to discredit this old approach of auditing around the computer. The only alternative to this is auditing through the computer. Some examples of this might be (1) the use of test decks using sample trans- actions, either correct or intentionally incorrect, or (2) the use of specially written computer programs for interrogation and control. What about the aspects of data security? The growing reliance of many companies on DP to process their transactions and store their business data has created, and continues to create, new vulnerabilities and concerns. This is especially true in the on-line environment. A tremendous amount of computerized data, including business forecasts, employment data, payroll records, etc., has to be kept confidential; but maintaining that confi- dentiality has become increasingly more difficult with today's DP methods. Advances in direct access and mass storage systems and technology, coupled again with improved and simplified terminal usage, have given numerous people throughout the company access to sensitive business data. Information that was once limited to viewing by top management might now be subject to the prying eyes of employees, or even com- petitors. Even more of a threat than intentional security violations, however, is the damage to sensitive data through well intentioned error. The great majority of data losses is due to the fact that, simply, people make mistakes. The same technology that makes DP such a boon to business can also greatly amplify the effects of human error. We all have probably experienced examples of that. Irreparable damage to critical company records can be caused by honest employees who inadvertantly misuse data. Addi- tionally, because of the dramatic rise of, and projections for increase in, computer crime, as we will touch upon later, companies and Federal Agencies are demanding increased security, privacy, and control in DP operations. To meet these demands, EDP auditors of the 1980 's must develop a working understanding of DP. As our brochure indicated, they must be able to work with DP personnel, to secure and control this valuable information stored in DP files. The DP auditor of the 1980' s will have to know how to audit through the computer rather than around it. And this implies that he or she will have to be able to do numerous things. A partial list of those things follows: . He or she must be able to deal with the important issues such as the reconstruction of data files, physical security, and the theft of services or data in the system. . Also, he or she must be able to take an active role in developing policy on auditability, testing, standards, and general controls. . He or she must be able to provide management with independent assessments of DP decisions, and their impact on the company in general . 27 He or she must be able to make certain that the alternatives or risks for a project have been carefully considered, and that the technical hardware and software solutions are correct and reason- able, and that the costs are reasonable. Sheer volume alone makes controlling sensitive data more difficult and all the more critical, because as the amount of sensitive data increases, so does the need to safeguard it. Hundreds and even thou- sands of jobs are run on computer systems each day. The task of monitoring which jobs are using what data and which re- sources is beyond the capability of any human operator. The only feasible solution to the problem of protecting data is to let the computer itself monitor and control data access. This in turn will require data software security systems that will combine data access control with computer access control. We might well ask at this point what should be the goals for the 1980' s for a data security system that takes into account auditing considerations. As I see it, there are three fundamental goals: (1) To determine and detect improper access attempts (2) To prevent such attempts from succeeding (3) To record successful accesses and unsuccessful accesses to data, on an audit file To these ends, a security and auditing system has to be able to (1) prevent all unauthorized access to all forms of busi- ness data, including such things as system libraries, application libraries, and business forecasts, (2) provide authorized personnel easy access to their data... ideally without them even being aware of or concerned about the security and auditing mechanism, (3) immediately notify an operator that a security violation attempt is in progress, (4) identify and verify every system job and user, (5) provide an audit trail of every successful and unsuc- cessful access attempt, being all the while transparent to the user, and finally of course (6) add minimum overhead to the system, be easy to install and use. Such security systems and auditing aids are on the market now, in varying degrees of completion and efficiency. But in this area, much more needs to be done... and much more education needs to be brought forth to the DP community in general. This is not a one way street... a street where only the auditor needs to learn new techniques. I believe that there is good news and there is bad news when one considers computer security and auditing for the 1980' s. I think that the good news is that computer security systems will become better than ever. There will be help for the EDP Auditor. However, I fear that the bad news revolves around the fact that probably organized crime will make its true data processing debut in the 1980's. It is conceivable that the mob will make use of career criminals trained in DP by prison rehabilitation facilities (perhaps along with top DP technicians) recruited, sometimes un- willingly, from the ranks of otherwise honest DP personnel. I believe that the incidence of large computer crimes will go down, but that the size of the indi- vidual crimes will rise. Recently published figures indicate that the national average of $450,000 per computer crime will skyrocket. Professional criminals will be much better at their craft than the DP hobbyist, who is now still the major perpetrator of computer crime. Financial institutions, which have been working for years toward safety of their data, will probably be better off than most DP users. Many so-called "friendly" information systems and interactive word processing systems provide little if any security provision. Additionally, they actually help the computer crime perpetrator find what he is looking for by putting the information in an easy- to-read format. In the case of information management systems on mini -computers, these offer little at the present time in the way of protection. I believe that security precautions for smaller systems will be one of the greatest needs, and should be one of the highest priorities, of the 1980's. Companies will also be faced with the dilemma of deciding when their computer systems are safe enough. If a company goes too far in the area of security control, no one will be able to use the computer, and production will fall off. 28 Mr. Donn Parker, an internationally recognized leading world expert on computer crime, abuse and fraud, suggests that the only completely safe computer is one that is not used. He maintains, and I believe correctly, that any security system can be breached with the proper combination of know-how and intent. He suggests that by coupling more frequent auditing with more sophisticated security systems, the inci- dence of the small DP crimes could be held to a minimum. However, because laws vary from state to state, it is conceivable that a computer thief can even avoid prosecution by committing his crime in a state without computer crime laws. This is an aspect of computer crime that will have to be addres- sed in the 1980' s also. In our business at Performance Systems Incorporated, we repeatedly come into contact with top management personnel who express their frustration in not being able to establish or bring about specific management controls on their DP depart- ments. We find usually that one of the main reasons that DP is undermanaged, undercontro 1 1 ed , underdocumented , ... and misunderstood... is that historically the DP manager has usually reported to a comptroller or financial officer of some type. Additionally, he has been only loosely responsible to some type of user steering committee for DP. But now we are moving into an era where "corporate direc- tors of DP and Telecommunications acti- vities" (for example) are being appointed who have a DP background as well as formal indoctrination and training in management principles and philosophy. This is an encouraging trend, hopefully to be main- tained and even accelerated in the 1980' s. I also believe that these individual companies that have been founded and revolve around DP security concerns are in the vanguard of a major contribution to our DP way of life in the 1980' s. Finally there is an element of seman- tics involved in the term EDP Auditing. Sometimes there is a conflict between the purely financial aspects of auditing and the computer aspects of the work of (for lack of better term) "the computer systems audit staff". Among the services that we provide at PSI is something that we call the Data Center Review. It certainly is not a new or revolutionary concept, but we think that our approach is a bit innova- tive. Initially it was suggested that this activity be referred to as an EDP Audit. I believe that wiser heads prevailed, and in discussing the conceptualization and image provided by the term EDP Audit, it was generally agreed that while this was meant to be a systems audit as opposed to a financial audit, we should provide a descriptive phrase that was more meaning- ful. Therefore, the term EDP Audit was replaced with the term Data Center Review. The mechanics of the Data Center Review merely embrace another type of audit; and this is a facet (and only one facet) of Computer Management Technology (CMT). This Data Center Review takes into account that in a sophisticated DP environment there exists a necessity to develop highly specialized skills. Departments or branches are formed having well defined missions that are essential but narrow in scope and responsibility. Perhaps this is due in part to the traditional and historical splitting of functional involvement. ..again, in line with our old audit maxim. Even the elements of vendor support (e.g., systems engineering and customer engineering) tend to align themselves with these various groups according to their indi- vidual expertise and experience. In many cases, management decisions are made on the basis of the recommendations of specialized technicians whose performance is measured in narrowly defined responsi- bilities. Such decisions could have an adverse effect on other equally important functions in the organization. In the process of conducting Data Center Reviews in the past, our experi- ence has shown that several key areas within an installation have a definite interactive effect on overall system performance. The methods and procedures used to manage and control these areas are vital in meeting and maintaining performance objectives. We have develop- ed and documented techniques for review- ing these areas within an installation. I believe you need an organized and documented method of determining how efficiently and effectively your data processing systems are functioning and operating. The Data Center Review provides you with this capability by identifying the data available from each of the Data Center's functional areas using simply a set of questions, or checklist, to analyze procedures in each of these areas. All data sources are presented as a "shopping list", and the checklists are modular. With this format, an experienced analyst can easily 29 tailor a data source and checklist package to an installation under examination. However, too often we see many in- stances of similar "guidelines and check- lists" available, used, filled out, ratings applied,... and yet no analysis... no follow- up... no qualitative reporting is done. Many times, we feel, that is where this approach falls down and is lacking, because all the collected data in the world is of no use unless it is subjected to rigorous analysis directed toward determining the implications of all this collected data. Perhaps the most effective means of em- phasizing this point is to quote Johann Wolfgang von Goethe who wrote in 1810: "The modern age has a false sense of super- iority, because of the great mass of data at its disposal; but the valid criterion of distinction is rather the extent to which man knows how to form and master the material at his command". So we see, after all, that there is an element of sameness in the world of rapid change in which we live. We have covered a wide range of topics under the general heading of EDP Auditing, Security, and Computer Fraud. However, no discussion of EDP Auditing, Data and Systems Security, Controls, Checks, and Balances, would be complete, I believe without pondering some questions, even if only briefly: What of OVERcontrol, 0VER- auditing, OVERsecurity paranoia, and other potential abuses? I would therefore like to leave you with a little food for thought, in the form of my only two overhead projector trans- parencies. I am indebted to Professor Dennie L. Van Tassel of the University of California for his inspiration and (in part) format: 30 CASHLESS SOCIETY NATIONAL DATA BANK DAILY SURVEILANCE SHEET CONFIDENTIAL 3ULY 9, 1997 SUBJECT: NAME EMPLOYER ADDRESS SEX AGE MARTIAL STATUS OCCUPATION/TITLE PURCHASES: WALL STREET JOURNAL .25 BREAKFAST 2.50 GASOLINE 22.00 PHONE (328-1826) .15 PHONE (308-7928) .15 PHONE (421-1942) .15 BANK (CASH WITHDRAWAL) 120.00 LUNCH 3.50 COCKTAIL 1.50 LINGERIE 26.95 PHONE (369-2436) .35 SCOTCH (1/2 GAL) 12.85 NEWSPAPER .25 **** COMPUTER ANALYSIS FOLLOWS 31 COMPUTER ANALYSIS *** OWNS STOCK (99% PROBABILITY). HEAVY STARCH BREAKFAST - PROBABLY OVERWEIGHT. BOUGHT $22.00 GAS. SO FAR THIS WEEK HAS BOUGHT $38.00 GAS. OBVIOUSLY DOING SOMETHING BESIDES JUST DRIVING TO WORK. BOUGHT GAS AT 0857. SAFE TO ASSUME HE WAS LATE TO WORK. PHONE NUMBER 308-1826 BELONGS TO 'FRANKIE THE LAYOFF. FRANKIE ARRESTED 1978 FOR BOOKMAKING. PHONE NUMBER 308-7928 - EXPENSIVE MENS' BARBER. SPECIALIZES IN BALD MEN OR HAIR STYLING. PHONE NUMBER 421-1931 RESERVATIONS FOR LAS VEGAS (WITHOUT WIFE.) THIRD TRIP THIS YEAR TO LAS VEGAS (WITHOUT WIFE). WILL SCAN FILE TO SEE IF ANYONE ELSE HAS GONE TO LAS VEGAS AT SAME TIME, AND WILL COMPARE TO HIS PHONE NUMBERS CALLED. WITHDREW $120.00 CASH. VERY UNUSUAL SINCE ALL LEGAL PURCHASES CAN BE MADE USING THE NATIONAL SOCIAL SECURITY CREDIT CARD. CASH USUALLY USED ONLY FOR ILLEGAL PURPOSES. IT WAS PREVIOUSLY RECOMMENDED THAT ALL CASH BE OUTLAWED AS SOON AS IT BECOMES POLITICALLY POSSIBLE. DRINKS DURING LUNCH. BOUGHT VERY EXPENSIVE LINGERIE NOT HIS WIFE'S SIZE. PHONE NUMBER 369-2346 - MISS SWEET LOCKS. PURCHASED EXPENSIVE 1/2 GALLON SCOTCH. HAS PURCHASED 3-1/2 GALLONS SCOTCH IN LAST 30 DAYS. EITHER HEAVY DRINKER OR MUCH ENTERTAINING. *** OVERALL ANALYSIS *** LEFT WORK EARLY AT 1700, SINCE HE PURCHASED SCOTCH 1 MILE FROM HIS OFFICE AT 1710 (OPPOSITE DIRECTION FROM HIS HOUSE). BOUGHT NEWSPAPER AT 1930 NEAR HIS HOME. UNACCOUNTABLE 2-1/2 HOURS. MADE THREE PURCHASES TODAY FROM YOUNG BLONDES. (STATISTICAL 1 CHANCE IN 78). THEREFORE PROBABLY HAS WEAKNESS FOR YOUNG BLONDES. 32 Tracking Potential Security Violations R. L. Lehmann Union Carbide Corporation South Charleston, WV 25303 Abstract: Security concerns not only involve providing restricted access to computer resources, but frequently require investigative studies to track suspicious utilization for potential company violations. For such studies it is important to have pertinent information for the entire job in order to trace what is being done. Sometimes very little information is available to use as a key for isolating the jobs in question. This paper describes a tool developed at Union Carbide Corp. to enhance such investigations on IBM MVS systems. Keywords: Audit trails; computer fraud; computer security; job tracking; security violations, SMF retrieval. 1. Introduction Frequently security violations involve those who are authorized or have access to the sensitive data of concern. Current security measures are of little help in pre- venting such unprincipled people from obtain- ing their objectives. Indeed RACF, SECURE and ACF2 minimize access to such areas but to the authorized or knowledgable even they must yield to the master's will. Acknow- ledging that a completely fool proof system is neither practical nor possible this paper seeks to address the problem once violation i3 detected or suspected. Present products may aid in warning of possible violations but what does one do then? Accusing one falsely could have serious con- sequences. Ability to track jobs based on varying bits of information would be of great help in gathering needed data to determine the nature of the suspected infringement. Since thieves and saboteurs tend to cover up their tracks, sufficient flexibility must be provided to track their use of computer resources. Of course the supersleuth may eradicate all traces of his presence and leave you helpless, but even that condition restricts your concern to only those few 'sharp' individuals capable of such system internal manipulation. Various questions immediately surface when considering one suspected of violation. What information was accessed by the indi- vidual? What computer resources did he or she use? What did the individual do with the information accessed? When was the in- fringement made and how could it have been avoided? In order to aid in answering these ques- tions UCC developed a means of accessing computer generated data to piece together in- formation needed In resolving these concerns. Five basic design steps are of importance in describing the resultant facility; [l] What form of requests want to be made? [2] What information is desired? [3] What data is available to supply this information? [4] How can it be retrieved economically? [5] How is it to be reported? Detective work traditionally obtains bits and pieces of information, with which it must collect all relevant data, such that all con- ditions of fact are met. Tracking security 33 violations has many similarities with respect to the meager amount of information one may have at the start. In order to accommodate such tracking, basic clues have been classified into nine major areas. Any combination of these nine variables may be requested to define retrie- val conditions. Multiple requests are also permitted in any given run. This is accom- plished by ANDing the various conditions de- fining each request and ORing the number of requests. 2. Form of Requests Nine basic parameters were isolated to define the primative set of request identi- fiers. These may be grouped into four basic categories: 1 Job Identification a. PGMR - Programmer Name b. PGM - Program Name c. JOB - Job 2 Data Set Identification a. DSN - Data Set Name b. VOLSER - Volume Serial 3 Time Period a. DATE - Date b. TIME - Time 4 Accounting Concerns a. CC - Charge Number b. RIN - Remote Number Any combination of these request identi- fiers may be made, depending upon the amount of information known and restriction of out- put desired. The syntax related to each is shown below. 2.1 SYNTAX Description Programmer Name : Perhaps all that is known is that programmer X appears to be violating company standards. It would be desirable to obtain all the information avail- able regarding that programmer. But program- mers invariably spell their names differently from one job to another. At times they may use their initials whereas other times simply the last name. Some may separate their ini- tials with blanks, others with periods and a few even use the underscore character. At times abbreviations are even witnessed, esp- ecially when the name is long. Consequently it is desirable to be able to simply request a character string that if found anywhere within the name field a retrieval will be accomplished. Such a request is made by simply stating: PGMR=' character string 1 Program Name : In addition to the pro- grammer name, frequently one may need to determine the users of a particular program. When checks run against output from highly confidential programs reveal problems, a report of all individuals using that program may be obtained by specifying: PGM='name' Job Name : Jobs may be traced by simply specifying the job name: JOB='name* Data Set Name : Suppose suddenly one day a critical and sensitive data set has been destroyed. Immediately you wish to know who had been using that data set. A request of the form below will produce a report showing every user for the period of interest acces- sing the data set having the exact name as specified: DSN= ' name ' Sometimes, however, the exact name with all its qualifiers may not be readily avail- able. The unique identifiers on the other hand are known and it is desired to just specify these to track the users. Similarly, it may be desired to identify those who used any of a family of datasets all having a common string of characters somewhere within its name. This may be accomplished by: DSNX=' character string' Volume : Disk or tape volumes containing sensitive data may be accessed in ways other than through standard data set name requests. Circumstances may arise where it is necessary to know who used such a volume and whether it was used for input or output. This re- quest is specified by the statement: VOLSER='name* Date; Frequently one's concern is re- stricted to a specific period of time. To eliminate data outside the period of interest the date option provides the ability of set- ting day boundaries. This also illustrates the facility of specifying multiple condi- tions, which will be described in more detail later. The date is specified in Julian form as the following examples show: DATE=80123 DATE>80119 & DATE<80120 DATE>80112 & DATE<80116 & Date-»=80114 34 Time : But the period of interest may be concerned with only a portion of the day. Thus a time option is also available to set the boundaries of time within each day of interest. This condition is in the form of hours and portion of hours as shown below: TIME>20.0 TIME>8.0 & TIME<17.5 the SAS code necessary to retrieve the conditions required. These macros are written to a data set which is then concat- enated with the SAS routine. Each error message describes the problem encountered as well as the actual request that was made. An example of the diagnostic messages are illustrated below. The selection types MAJ and MIN will be discussed later. Charge Number : At times infringement problems may involve users who manage to direct personal computer use to charge num- bers other than their own. Ability to track specified charge numbers is also available by requesting: CC=' charge no 1 Input Requests: MNR OSN='K.CACI.SIMLIBB< & VOLSER* • KDD071 • I PGMR»»CANDY« MAJ MIN DATE««80091< & VOLSER*'KRR002» & RIN««R7.RD1» MIN PGMR*LEHMANN i. TIME>«8." t. TIME<12. MIN PGM=«A9500TFST' fc CC»'95000U' MIN J0B='ARLLPM1TT' fc PGMR*«R.L.LEHMANN» I VOLSER«'KDO0l»» MIN RIN=»RD10_Rni2' & CC='9000S' Input Remote Number : One last request type was designed to enable one of monitor users at a given remote station. Suppose violation is suspected at a given site during the weekend. Activity at that remote station could be retrieved by: RIN= ' r emo t e numb er ' 2 . 2 Summary Nine different condition types are pro- vided such that any combination can be used to uniquely define the conditions represent- ing the information known. All except DATE and TIME have character values. In addition, DATE, CC and TIME may utilize the operators =,> or <. The remaining types may only use =. Exception reporting may be accomplished by using the "i operator in conjunction with the >, < or = operators. For example the following request: DATE=80246&TIME<6 . &RIN= ' R99 . RD1 ' &CC= ' 9500U ' &PGMR =' JONES* will retrieve all jobs submitted from Remote 99 from midnight to 6:00 a.m. on day 246 us- ing charge number 9500U except those of Jones. 3. Implementation To provide flexibility and ease of main- tenance it was decided to use a PL/I program to process the input conditions and SAS to do the data retrieval. PL/I was chosen because at the time SAS did not have the string mainpulation capabilities that are now available. The PL/ I program processes the retrieval parameters, checks for syntax and other error conditions producing either dia- gnostic messages, a condition code which terminates the run, or SAS macros generating Diagnostics: erroii invalid selection type. must be h»j or min conditioni eboo-ll missing operator in selection condition. condition! errobi invalid syntax- date non-numeric conditioni err07i invalid syntax- missing ouotes c0n0it10ni err09i invalid syntax- time non-decimal conditioni errioi invalid syntax- program > 8 characters conditioni err16i invalio syntax- charge > s characters conditioni err06i invalio syntax- job name > 8 characters. conditioni err13i invalio syntax- pgmr name non-alphameric conditioni err1si invalio syntax- volser non-alphameric conditioni err22i invalid syntax- remote « » 8 characters conditioni crr2tl invalio syntax- remote « has non-ren char.scondi t ioni erriti invalid syntax- charge non-alphaheric conditioni OATE"'B00»Ii PGHRblEHMANN TIME>i».i PGMXAVS0OTESP CC^VSOOOU' j08«>aRLLPM1TT' PGMRaiR.L.LFHMANN* VOLSER«'KOO0H' RINziROlo R01J« RIN.>R01o>01J> CC"'90U II MACRO GP1C0N0 • MINOR CONDITIONS! % MACRO GP2C0ND • MINOR CONOITIONSI IF(DSN='SYS1.LIB01M & ROST= 1 THEN OUTPUT GP2TABLEI IF( POS -.= 0&VOLSER='K19762') & RQsT* THEN OUTPUT GP2TABLEI MACRO MINALL • MINOR DATE TIME AND JOB CONOITIONSI % MACRO % MACRO * MACRO MINTYP*. MINOR PROGRAM CONDITION) MINTYP5 MINOR PROGRAMMER AND CHARGE CONOITIONSI MINTYP1* • MINOR INPUT VOLSER AND DSN CONDITIONS! IF DSNr'SYSl.LIBOl' THEN DO»RQST= 1 IOUTPUT OUTlfct OUTPUT J0BTA8LEIENDI PTR= INDEX (DSN, 'TOPS') I IF PTR-=0 THEN DO! RQST* 2» OUTPUT JOBTABLEl POS=PTRlOuTPUT 0UT14IEN0I IF V0LSER=»K19762' THEN OUTPUT JOBTABLEl RQST= 21 % MACRO MINTYP15 • MINOR OUTPUT VOLSER AND DSN CONDITIONS! IF OSN.'SYSl.LIBOl' THEN D0IRQST= 1 IOUTPUT 0UT15I OUTPUT JOBTABLE lENDl PTR=INDEX (DSN, 'TOPS') I IF PTR-<=0 THEN 001 RQST= 21 OUTPUT JOBTABLEl P0S=PTR|0UTPUT 0UT15IENDI IF V0LSER='K19762' THEN OUTPUT JOBTABLEl RQST= 21 % MACRO MINTYP26 • MINOR INPUT REMOTE NO, CONDITIONI * MACRO « MACRO MINTYP35 MINOR TSO CHARGE NO. CONDITIONI SORTYP SORT PRINTOUT CONDITIONI BY J0B_0AY RECTIMI PROC PRINTI BY JOB DATE NOTSORTEDl ID PGMRI TITLE SECURITY RETRIEVAL REPORTI 7. Implementation As previously mentioned, to customize the SAS routine to enable flexibility for request- ing any combination of the nine conditional parameters, the input requests are processed to generate SAS macros. In this way, only that code necessary to fulfill the given requests is generated. This further reduces processing costs from that of a generalized routine that would have to be concerned with all possible combinations. If any errors were detected in the input requests, these macros are not created and the run is aborted. This request will retrieve all jobs run on day 91 which used a data set SYS1.LIB01 or had the character string TOPS somewhere in the data set name for data sets residing on volume K19762. User Requests: MIN PGMstUHPROGM* MIN JOB=»ACBPEMt & TIMEM0.& TIME<12, SRT »J0B» ID »DATE» A few samples of the SAS macros generated from the user requests are shown below: User Requests : MIN DSN=«SYS1.LIB01» MIN DSNX=»T0PS» & V0LSER=»K19762« MAJ DATE=80091 37 Macros: MACRO MAJSEL • MAJOR CONDITIONS! I* MACRO GP1CONO • MINOR CONOITIONSI IF(PGM«»UHPROGMM THEN OOl RQST« II OUTPUT GPlTABLEl ENOI IF(JOB««ACBPEM«4TIME>10.4TIME<12.0> THEN 001 ROST* 21 OUTPUT GPlTABLEl ENOI i MACRO GP2C0N0 • MINOR CONOITIONSI % MACRO MINALL • MINOR DATE TIME AND JOB CONOITIONSI IF JOB»»ACBPEMi THEN OUTPUT JOBTABLEI IF TIMEMO. THEN OUTPUT JOBTABLEI IF TIME<12.0 THEN OUTPUT JOBTABLEI % MACRO MINTYP4 • MINOR PROGRAM CONDITION! IF PGM«»UHPROGM« THEN OUTPUT JOBTABLEI ft MACRO MINTYPS • MINOR PROGRAMMER AND CHARGE CONDITIONS! ft MACRO MINTYP14 • MINOR INPUT VOLSER AND OSN CONDITIONS! « MACRO MINTYP15 • MINOR OUTPUT VOLSER AND OSN CONOITIONSI % MACRO MINTYP26 • MINOR INPUT REMOTE NO. CONOITIONI « MACRO H INTYP3S • MINOR TSO CHARGE NO. CONDITION! % MACRO SORTYP • SORT PRINTOUT CONOITIONI BY JOB TIME RECTIMI PROC PRINT! BY JOB NOTSORTEOl 10 DATE! TITLE SECURITY RETRIEVAL REPORTl ft Includes the SRT option which provides SORT- ing and report options, as discussed in the next section. User Requests : MIN RIN*»R108.RD1» i PGMR -**»SYSTEMS» MIN CC«»1233P» 8, DATE*80092 SRT »JOB» ID »RrN» Macros: MACRO i* MACRO MAJSEL MAJOR CONOITIONSI GP1C0ND MINOR CONOITIONSI IF(RINa»R10fi.R01«iPOS» & POS-=.) THEN 001 RQSTi II OUTPUT GPlTABLEl ENOI IF(CC"<1233P«IDATE=80092) 21 OUTPUT GPlTABLEl ENOI THEN DOl RQST« MACRO GP2C0ND • MINOR CONOITIONSI MINALL MINOR DATE TIME AND JOB CONOITIONSI IF DATE"80092 THEN OUTPUT JOBTABLEI MINTYP* MINOR PROGRAM CONDITION! ft MACRO ft MACRO MACRO MINTYP5 • MINOR PROGRAMMER ANO CHARGE CONOITIONSI POS-INOEX(PGMR.iSYSTEMS»> I IF POS" THEN OUTPUT JOBTABLEI OUTPUT 0UT5I IF CC»«1233P« THEN OUTPUT JOBTABLEI ft MACRO % MACRO ft MACRO ft MACRO MINTYP14 MINOR INPUT VOLSER ANO DSN CONOITIONSI MINTYP15 MINOR OUTPUT VOLSER AND DSN CONOITIONSI MINTYP26 MINOR INPUT REMOTE NO. CONOITIONI IF RIN=«R108.RD1« THEN OUTPUT JOBTABLEI MINTYP3S MINOR TSO CHARGE NO, CONDITION) IF CC»U233P« THEN OUTPUT JOBTABLEI « MACRO SORTYP • SORT PRINTOUT CONOITIONI BY JOB TIME RECTIMI PROC PRINTI BY JOB NOTSORTEOl ID RINI TITLE SECURITY RETRIEVAL REPORTl « 8. Reporting Facility Of major concern to the user, of course, is the end report. Here again, significant flexibility is desired to provide ease of analysis and ready observation. Though gen- erally, job integrity is desired for report- ing, sometimes it is desired to sort the resultant information on one or more of the conditional parameters. Furthermore, it would be helpful to readily separate major sorting breaks and be able to specify what variable to position in the left most column of the report. All of these functions are provided by the input selection type SRT. Along with the MAJ and MIN types, the user may specify a sort request indicating what conditions he desires as sorting parameters and the output variable desired in the left most column. For example: SRT 'JOB.PGMR' ID 'DATE' This request will cause the report to be sorted by job and programmer and provide major reporting breaks by these parameters. In addition the date will occur in the left most column. If no sort condition is specified the default is set to: SRT 'JOB, DATE' ID 'PGMR* No JCL or control cards are necessary to accomplish this as it is accomplished in the SAS routine by generating the proper macro resulting from the input specified. Samples of the output produced from the previous in- put requests follows: 38 Example 1 . User Requests: MIN DSN='SYS1.LIB01' MIN DSNX=' TOPS' &VOLSER='K19762' MAJ DATE=80091 ~ — oo o o o — MM oo o -o c\j o o> a cr h» iom o o o * r- — — — — » o>o o o o ►- — o o o o o o > o * > •o — ■c o o in o — • o _j (J. 0> o oo o ~— o o o > IfK KKK > > o a o o t- a o > o • 2 O a ►.-. « • o < a. «r n (\j • ooo -» •» o ■» •» HE < < a ooo oo u ooo -< CD flD CD < < < • • • OOO OOO oc a a a a a mm oo oo a a o »-»- a >> o z 2 o i/) in in a. a a. a a a. O-O o o o o oo > * * » # • ] o o I o o : a a . a. a u> V) ■/> V) clo. a. a. a. oo ooo O ►.<-! -. a. a. MAI o o ■» .# ■» ■» ciociooooo nonooooo oooooooo "O ^ "O "C "O >fi ci ci ci o o o o o o o o 39 Example 2. a User Requests: MIN PGM='UHPROGM' MIN JOB='ACBPEM'&TIME>10.& TIME<12, SRT 'JOB' ID 'DATE' •-• mOOm •-• ►-«-■ Nrt«K ^ « ■* ftjru M O tflM 1/1 O o o o o -J o o •* o o o S-inia a ruin WW) a a —a a — a a a *: x: »c * 3£ sc *: * * u o o M •— u o o _1 -J a • 1/1 1/1 3 a. o. > >" O Z z 1/1 1/1 in 3 3 • O O oo in • z z 3 1/1 MO J-t *~ _l 3 3 < X X a _i JO o .o. o a. -j < < b. M M b. 1/1 • oo Z O. 0.O Z a •- Ml/) ►-.o o < oin 3 3 • K- t- o >-u _) _l o • . _1 o • >- a. a. OWIflU o in • Or C t> • a a- in i/ii/i a ao m i* • 6(51 a. a>a a a • ISO oo icxieii: * x »- ►- 1- O M>-i-.o mOm m m •- m O O m —• •c t~u\ a> -a o « —. »-• cr> O O CD — • >0 t/1 oc Ifi o m or a — — ~ o«fy com vi »-or o x K K no4« -. — r- o O vO o a o o a in »* m — i\j ac a or «-or •-•■-.ac > CO O > O O 004 (0 O *> O O > > m ►" > > in > > ■» 0: M 1/1 in 0- 4IHO »-o> O»o m o- ■* OO O "(9 -* —• O • •-•o Z O • m OOl- t-o OOI- 00 • »3 a • «I • 3 n no •n en in r> •* UUI- _IO »- uu a a a •— • MCK a a. < < -•a: »- ac « « 1- *- i/i < >- to *~ in ►- •- ►- • • • • 0000 OO CD 000 000 b. -* r\J~* no: »-i a -■ IMMU. z < j ** < © O Z i-iri so ■ _l CD CD ■ O CD CD CO CD CD CO * CD CD SC CD CO U -* o — <-* ~ o Q. M M O • O • O -^ • (\J • «*> o> o> o> o> o> «\J <\J *\J CM CVI 0000c rvj ou oj (\i oj O 0"" .* 000000000000000 Oj(Mrvj(\jfMOJfM<\Jf\jOjOj(Mf\IOjnj cinnnnnnnnnnnnn f*i coooooooooooooo rvr\ir\joj(vic\jr\jnjf\jAjiM(\inj<\)(v O* C7^ C^ C^ CT* O* O* C^ C^ 0^ O* C^ ^ fr ff 1 000000000000000 000000000000000 trtriniri/iirtririrLriririr)u a iiriririirii/iinir CT 1 C^ C^ C^ C^ 0^ C^ C^ C^ C^ O* 0^ C^ O* C7 1 " C^ C7* 0^ C^ 0^ (t* C^ 00000 .0000000000000000 (\jr\j{\ir\jc\jr\i<\jt\jt\j r>jnj(\JOJt\jr\jr\jiMtMt\irut\jr\j 0000000000000000000000 oooooooocoooooocoooooc CC o — MO) o M in i/i eo ru in in a o — — a c il£ St * * St r- o- in in o co r- n fflo o -J r- — ~ r- — m r- .* in i/i — — m — a oemoj a — — ,-a— a ocmm it st sc it st * it st st st oai Mn-4 - in - »in uj -o k r- ■£> aj in ,» r» m — ■ o ffffo(Mpaoo« oftjN — — o o> ruin a ~ ru <\j — c\j a ~ »- vi co a m ^ m mo a—— mimm m m* a cc a or ru m sex it it st stststscstsststst *: * * e o o o > > -» in Ukl _l_l uu >■ > uu uu • • as o o is M fU • > M in n © © M © © «*nj © o CC M ■a < ac — • • *— «n m co • > cd a> st in ) m o M • < < O in • • 4T sD © 1/1 U O O U U ~> o a o o a. a. < H- < 0C CK < < o © o o o > o o © © o © © tr o > > > > © © > m © »-r» > > ■> o > o o on c*oo © n o o o * o>in ■» o o o o OMMlM M • ru (9 o O O mm ID X M • • o o • U 19 1- u u i/s • • in z • 3 1- Kki^iniii 3rs o a a or uu a a u u >- »- u a a. u co a. i- cc tr or < < a Z3 < a i/i no.t-1-o. a i- • • • • • go o o o © © © © © o o © © w ri » o o -i zoo © © © © © u o z u © © o M* * Z CD 2 © © ux o © © o © © © © a jo a. © © a. • o — — ^.^ ^^^ ^0 CO es as mm «* — ct < <« <<<< • CO GD CO CO CO CO CO "3 >■ • CD CD It V) <_> OD CD CO CO CO CO CO h- St in u St CD CD *■ o _l cc o a. a.' 00 CD z z < < a a 3 3 in o> < < > a. a. x o z>u X u o z o u oe oi a. m x u tfiifiiru^i^ii~>^ir>iri^uitf>u^ifii/iijniX-tritf^i^ O^O^O^O^O^O^O^O^O^O^O^O^O^O^C^O^O^C^O^O^O^O^O^O^O^O^C^O^O^O^ intjftinLnuiu^i^ijninu^irii/iu^u^u^u^unuiijnuiinLninunifiiriu^^ oooooooooooooooooooooooooooooooooooooooo ojruryroftjf\jr\jrvjf\jfvjnjPjrur\jrururu<^r\j^ O* O^ O^ O^ 0^ O* C? 1 O* C^ 0? 1 O* 1 O^ 1 O* O^ 0^ O* 0^ CT 1 O^ O^ C^ 0^ 0^ O^ 1 0^ 0^ O^ O 1 0^" o^ 0^ Q* ^ O* 1 0^ 0^ 0^ O 1 CF* 0^ oooooooeoooooooooooooooooooooooooooooooo ooooooooooooooooooooooooocoooooooocooooo CO CO CO CO CO CO CD CD CO CO - in (vi O * O — o © C © © © © o © © © © © ©no O l\J O •* © o © © -I a n _i tuvntaj (VI _l a -— a— i a— — xxxxxxxxxxxx — — <\J © 1(3 © CO© a: o _i a o a. XXX — (VI in r-0> n (X or •C -1 •o o a. m^ in a a out ■ uj a X 3 < or ui • mo • o • »o « •< < «t o o < oino oo — — o O-O _l o _l a o. _i r-n a o l» o r- no •c I- o L^O •c -c o m< at z in a in ma xua — a: a. cc cc a •« x • • < • a a • . . o o • <_> • «o nno o no m no » o- u 8 0> O o- o- o co ao ui o. co in 00 © l/l oo • • o • OO • XXX XXX XXX t-o or r- 0<0 a in UJ CD a cd • «o O < < — o. o © O _i nno ^ * o minor oo oo a. a. < • . . o n no o> o> o oo od in oo • XXX — • o • o < < o o O _l O I- o u. o a ooi/i xxx o t-t- «•- a o. i- * a or a •in _j-> a ao < ui aD i- >or o . . . . o < in in in < o o o o o O O O © _J r» n r- n o >D (CiO io o in in in in et c coca. < a. a a • . . . . o n n n in o 0> fl> c o> o 00 0D 03 00 l/l oooo • X X X X X • o < < o o U _l no •o o in or o o> u OD l/l • O ■a < a o U -J O i- o U vO o z in or — a. ft. • < • O •<_> o no oc ao 0L CD © o > oo oood o o r» oo >>a> > > co >> in *» n cc n n *c & o> ■ OU.li. - n < < > *o — — i in © © > 00 ~J v» : J« ■ • < < r < • • i o oo ) o u. u. - n < < i vo — — 1 IT. o © :•«..*.» [ «i j o o o : x t- t- © © o <» •» • © © -> n n < n r- < -£• *C — in in © a a. 4 < < *» • • < nn » o> o* 1 l/l oo oo a. o o o x x *- o o o o • »-»•• 1 o © b. "5 < r~ r- < < — ^> ^> — — © in in © © 4 (0ID4 ^ JUHD J J 4 ♦ ♦ < < • 4 < • » O OO O O ") OO u. "> < f^ p- < < — *« — —< © in in © © .j ac a, j st J < < sf J < • • - o» ■o ■» in (-■ co in x 3 • to o < • < o in o k n o «r- o in < or a>o a. « x • • »o n n o o*- o o co au/i o o • xxx -u « O or or a a r- r- 3 O O OO r .» o o o i^ cc -o a in x a> Da < -o o: in o 03 in a. z J o ON or •£ o. in 1 co 3 a ooooooooooooooooooooo^oooooooooo©© o©oooo©ooooooooooooooooooooooooooo O 1 o> 9 O^ 0^0 k h h C^O > 0^O x k O x C^C/ , CI*0^0 x h 0^0* &" O^ O^ CN O* O* CN 0^ CN O 1 o^ o^ rti*'. ni r )nnr)neyt*itinP)nt r i{ r tiir)t r if*)nf r )P)i*)r}f r )t*)rtf r )i r )i*yf*)t*it*) Pint^t^inp)i r }ptPit r inf^p\i^t^)t r )i^tt*\f r tr^Pit r if r iPiPit r int r it^t^i r int^i r i 0000000©000000000000©0000©000©©000 iviV'rufvjftjftjivjrxjrvjfyrvjAjoofVj^jcvjf^iV'ivjrurvjfXJiVfvf^rvjfvjfvjfvjfNjfviviVfv ^CI fc C^0 x CNC^O x CI K CI*CI*0*C!'O > C^O^CNCNCN0*o > G*Cl s 0'Ct , O l x C*C^ O0000000000©00000©00000©0©©©00©©00 O©00000000000000000000000000©©©0©0 CD It CO CO C OD CO CC 00 Cfi CC CO CO CO CO CO S S CC 03 CC ffi S3 CO CO CD CO 00 0C CO tc ff JOOOOOOOOOOOO© O O O- O O O I I © O © o © o © o o o o o ru rvi cvj (vi fy rvj (Vi OJ fy ry (vj fy evi (Vi nj ry (v» cvt r« ^O*O'0 x «>O x h 0>0 x fc O x l>C*'O N c*c* c* c* c* o©o©©©©©o©©©©o 22SSS oo©©©©©©©©©©©© 2S2SS cocraocococoaoaooocococococc a: eg m uj b 42 Exani r, le 3. a User Requests: MIN RIN=*R108.RD1'&PGMR-'= I SYSTEMS' MIN CC='1233P'&DATE=80092 SRT 'JOB' ID 'RIN' ooooooooo ooooooowor oooooooocor ■a ct o o o o o o o • is • ••••• «in • OOOOOCCUUO ffi CO ff> A CO CD 0_ CC O"C^(>ffC*0'C^-O k uouuuoozu • ••••••UJ* *0'0 ruru x: x: UJ UJ X X in in UJ UJ z z — w- o « o »-0 -* o Cr- l\J 0> l\J CO st r- « r» .0 m «o r» » ru ru so in co in a ■* in o- ruco <0 i» in in — o © r» o r> aru o f- cr- n e ru O (VJ O fu o- ©> m — r-l\J -» ru ru ru CU — ru --. ru — i- •* ►- — SC X. x x * x X St * * v x X X x x o o > © > © > © > o > o > © •* IS o -C IS r- is r~is r- is — IS mo n • O 19 n • CD • CD • m • m • r- • o o *- i\j _i ru j — o Oct IS o m uj e o is a. iso. IS«J IS < a. => -*r> l- _l _■_!_■ ■ O J or _i •-_! o _J 19 _l o a. a a a a on. ■a a. m =5 00 rioo 3 CC UJ oc _l ►■• _li-i_li- _j>-i _!►" j»i _J •-• m »- a -i 1J1J a -i a -i CL_I 0. -J in _i • o • © • o • o • o • O • o 3 o 00 -o CO U3 CD U> CD -ltt •- in a. m -J o _J o -J © JO JO -JO J o • o o < O < © •* o < o < o< o < -J < o • •o • •D • U3 • • in in in m mm mm mm mm m m z in o cr- © a- o o» © a- o cr © ct> o o- or o- 1 to < CD < CD < CD < CD < CD < CD 19 CD • IS • IS • o • o • O .19 • o • 19 in x in x m x m x m it in x: m x © X o> • cr- • o> • o» • o> • CO X OX CDU CD X cd x: odx: cd x: co x: O rU is ru is isi is ru is ru is ru is ru o ru X M XNXN *: ru x: ru x: ru x: ru x: ru i\jry(\j(\if\)(\j(Mj0 vD , *0 ^ "VC "cO *C ^>D 'VC '•C ««C OOOOOOOOOOO in in o o in tn UJ UJ a a < < >oj(Vjc\j(\jNr\jnjr\jojnjnjf\j ooooooooooooooooooooooooo ooooooooooooooooooooooooo 00 QD 00 (D CO CO 00 CD CO CO CO CO CC CO CO CO CO CO 00 CO CC CO CD OC 00 CO CO (D CO CO CO CO CO CO CO CO CD CO CO CD QD 00 GO CO (E 00 CO CO 00 CO CO CD CO CO CO CO 00 CO 00 00 CO CO CO CO CD CD 00 CO CD CO CO CO CO CO CD OOOOOOOOOOOOOOOOOOOOOOOOO 0000000004 ooooooooooooooo 43 Example 3.b oooooooooooooooooo— OOOOOOOOv* -Jf -. _ ~ j ,-. o .-•-*.-< —> cicimfimcjcimoooooonooc o ojojojojojojojojo: orooooojoooor ^^^^^^^ rtQC oroooo^ooaor i; « i: y. ?: ^ ^ t y: •. ^ z i it Jt ii it ^ ii i- se CD let a uj o a a o »- I »- t- )- b_ I O • 2 >»o en a> a * > > i »-«ujmor .J *-■ MHWh oj a -3 3 a w st r> < o ve in < o ^m< . • CD (TO * O > X CD 2 > O «~ 2 * ¥ •■* a o a. 3 k: => v o KU< CJ < CO 4(D t CD • St o o o O X »- >uw 2 in a •- * -> a a o * SC ** UUN < < 4 CD CO i/) • • it co a • o> CO X CO o o o *• *: *: f^ O O. own ** ** st OJ (\J CJ < < 4 in in to * * * • • x: in in a o t> > CO CD O. O O UJ st se •-• OMMO -* CO CO CO CD o o a _j ic st \c it •-« ooooooooooooooooooooo»-i •-«-■ OOOJO*D*C<£OOOOt4)^)tDOO'D^^)%DnjO oot-ooooooooooooooooot— o a a or m o o Ciaaaacoow^ccccQQ aoroorociCJoraorttoooctaciooooQ: ^^unstscsrseisrstsrststsrstsrKScsr^vri/i*: r- oj — o o o a in a a it: * -XL CD SjC t- ("- * CD * o-22vincDX a_iorse5es£:sc-i_ior • o OUJ X ") U UJ in or • a * UJ »- 1- a a x o < • > > X X 2 2 UJ X *-*•-. or h- «-« —i o sor«-o~*oo<-»%or^o • O -* ~- -« O _l oj oj oj oon O* 1 s O* 1 Q_ 0? 1 O* 1 (VI O 1 o^ o x o^ COCCCDXCDCCCDCOCDCOCCCDCCCOXCCCOr-CCCCCOCD OOOUJQOOCiCQOClOOUJOOoOOO-J OJ o -* U. OJ 2 < —. en * st o • o in or o a cd • o a a o o CO CD 4 4 UJ UJ 2 2 a a st ir o o UJ -* -* OJOJOJOJOJOJOJOJOJOjOJOJOjOJOJOJOJOJOJOJOJOJ o^o x o>o x o^o >l O N o^o , o*o x o x o*'0 > '0*•^7 , o > •o>o > 'C^o^o , * OOOOOOOOOOOOOOOOOOOOOO ooooooocooooooocoocooo cocdcocdscocoqocococdcdcoqococococococococd *C •£ *C \0 *£.' -i> ■€ >£• *C v£ X ^ X \£ *£ -i) * >£ ^ *£> -C >C 0*'C3 N C^0 x x > 'O ,l 'O v 0^0 >, O > 'Cy'O s O*'0*'O^O*'O x C^0 ,v 0*'O*' inmininininininintnininininmininininininin OOOOOOOOOOOOOOOOOOOOOO OJOJOJ OJOJOJOJOJOJOJOJOJOJOJOJOJOJOJOJOJOJOJOJOJOJOJ Iji o^ s O^ O* Ch O^ O* O* 1 O* 1 O* 1 O 1 O 1 O 1 * Of 1 O* O* O* 1 O^ O* 1 O* O* O*' O* O* 1 o^ oooooooooooooooooooooooooo ooocoooooooooooooooooooooo cococococoscococdcocdcocococococococo®cococoooco& nnnnnnnnn nonoofio^nnpor^enndnn cc g. a oL' a: cc a a. a oooooooooooooooooo* 44 Example 3.c o o •- >-oo i • o o o o o o >-• ■o~ r- cm *> * -* f- M - *- a a O bJ "1 o O UJ -> o O UJ *■ UJ • a • a • or o »- or o • O O o * o a o • i- x ■ I X I I I X X X X X in > x U I o u U I u u <_> X or z x w *- W W in h- in in in i- ->»~«i- O.J o o o _J c o O _J *> ~*_l CM O cm cm CM O CM CM CM O C CMO ■* o o -» ^ •* O o .* -* ■* o -» •* O Ma b. cm (\i cm a u. CM CM cm a CM CM 0. 4 4 2 4 4 < < z < 4 < 4 4 4 4 in in <-tinv) in in in in in in in in in 1£ S£ • ¥ it: ic it * k: it it: x *: 5c: • • o in in o in in in in o in in in in in m in o a- a oo cr- o> or o* o> o o cc inO'<£>CMco«~eDr>»*~-iCM o o * 0> ^* CO o o o o ^ ^ • • o o u. u. r- r^ 4 4 >c ^ — — in in o o CO co -» *» x _j >» >» • •44 4 4 • • o o o o o o u. r- h- 4 JD vO — in in o o> in co cd a. ooo X X »- O O ID h- f^ < ,0 ^ -^ in in o co CD -* < < ^r < r- r- < P4 ^> iO ^ o in in o ^ CC CD ^ ■j n • in o> a* m O- CD co a. o oo o i- •*. X t- o > O f- > CD ^ n co « a o • •* ** "1 o o < h- r^ ^^ * o in in <* co co u > tn DO- O o co r> o o> f*- ^-«-» in OOO • • t4 CD u_ ~5 o *n < < ^ f> « — »o -* o o in r- » it: r> < < • wo oo o mo b_ -> U *> _l < < r^ do o o in < or 44ICUfL ^* * < * • 4 4 • •<_) • • m m o in in o o> o a a co co in o o o o • »- t- ic ic: it: a. a in in CO CO K X 4 4 CD CD O O OO a o < ^ or in O CD in a. x ~t O o O r- QT sO a tn X CO 3 CL fMCMCMfMCMfMfMrMCMrMCMfMCMIMCMfMCMCMfMfMCMCMCMCMCMCM C^C^C^C^C^C>0^C>C^C^C^0>O*C^Cf'C^0 x v x C7*0 > C^0^CT'C^0 N oooooooooooooooooooooooooo oooooooooooooooooooooooooo CDCDCDCCCOCDCOCOCOCDCOCOCOCDCOCOCDCOCDCOCOCDCDCDCDCD (DCOCO(DGDQ0CO©COCDCOC0©VCOC0 COCOCOCDCOCOCOCOCOCO oooooooooooooooooooooooooo ininininininininininintnininininininmininininininin oooooooooooooooooooooooooo oooooooooooooooooooooooooo fMCMfM(MCMfMCMCMrMCMCMCMrMCMCMfMfMCMCM o^c>o s c>o>o > c^o>o x o^o>c>o A c>o x o , 't>o , '0 4, ooooooooooo cooooooo ooooooooooooooooooo CDCOCDCOCOCOCOCOCOCDCOCOCDaDCDCOCOCOCO X> -f> ^D ■£} «£)■£' vD -^"-' X 1 O ^C -D ■■D ^ £■ -if ■£■ ooooooooooooooooooo ooooooooooooooooooo 45 9. Conclusions Frequently it is desired to analyze com- puter utilization to determine whether and/or to what extent security violation has occurred. This paper has described one approach to pro- vide a flexible means of retrieving pertinent information and producing relevant reports to aid such evaluation. Free form inputs of three types were dis- cussed, to provide subsetting the original data base, sorting and formatting the output reports and specifying the retrieval condi- tions that must be met. These specifications may occur in any order with multiple requests permitted within a given run, except for the sorting type. Only one sort specification is permitted per run. Any combination of nine retrieval para- meters were outlined to use as search condi- tions to obtain the information desired. Once retrieved, the complete job profile is available for reporting according to the sort conditions specified. In addition to the usefulness relevant to security considerations, this facility has been applied profitably to aid in performance and billing problems. 10. Acknowledgments I wish to acknowledge the Computer and Data Security Department for their contribu- tions to the design and utilization feedback of this tracking capability. The efforts of R. I. Pettersen have been especially valuable in this regard providing early foresight for critical design specifications, as well as continuous discussion and recommendations during development and implementation stages. 46 Software Improvements • ..: • I:', :....."':", . ' ■■ •'.,,.,, 47 Measuring Programming Productivity Peter F. Zol 1 Octopus Enterprises Post Office Box 126 Geyserville, CA 95M1 The paper begins by characterizing the current era as one of decreasing hardware costs and increasing software costs. These trends make the improvement of programmer productivity a critical consideration. The paper procedes with a review of the historical methods of measurement that use statistics such as programs, lines of code, limited lines of code, program correctness, data division lines plus verbs, and project control history. The consequences of imposing these various measures are considered from the aspects of generation of quality programs and impact on programmer morale. The second section of the paper advances the position that available methodologies depend on the assumption that programmers should be evaluated on their ability to produce a program, as opposed to a product such as a report or a screen. An operational i st metho- dology suitable for metaprogramming is defined with the key consti- tuents being a data elements dictionary and a set of software routines that generate sections of programs. The paper concludes with some suggested methods for measuring the efficiency of metaprograms ' output as well as measuring the programmer's productivity in the new milieu. Key words: Metaprogramming; operational ism; programmer measurement; programmer productivity. 1 . The State of the Art The introduction of the IBM 4300 ('E') series hardware and its plug-compatible competition has effectively quadrupled the data processing buyer's dollar in terms of equipment. This has been almost entirely offset by the more than doubling of the salaries of programming personnel in the last five years. Currently, in the San Francisco Bay area it is not uncommon to find trainees commanding twenty thousand dollars per year, with contract programmers engaged in nothing more complex than COBOL coding earning, if such is the word, sixty thousand dollars annually. Monitoring facilities to report the increased productivity of the new hardware have grown in both number and sophistication- Alas, the quantitative and objective mea- surement of a programmer's productivity is still deficient. Historically, the first (and still most prevalent) programming measurement methodology employed is a carry- over from techniques used to manage some of the more prosaic occupations. Typically, the manager draws up a list of tasks, estimates the effort involved and assigns personnel. The programmers are evaluated on their ability to complete tasks on time and under budget. (1) While economy and time- liness are critical and praiseworthy goals, their attainment depends heavily on the 49 ability of the manager to estimate accurately while simultaneously assuring a quality product. Failure on the part of the manager often results in programmers working odd and long hours. This was of some short-term benefit to the company, largely because the projects were completed within bounds. The steadyerosion of trained personnel seems to be regarded as harmful in the long run. Worse, information systems implemented under the methodology discussed above can be readily indent ified by some or all of the following features: o Little optimization of system resource consumption o Very conservative technology (to the point of obsolescence) o Numerous specification changes due to inadequate allocation of time or resources for design and analysis o A significant number of trivial programs o A substantial number of highly specialized programs o A high level of post- implementation maintenance o Coding that emphasizes expediency at the cost of quality In response to a groundswell of complaint to the effect that a trivial program ought to be weighted less than a complex one, and that more eloquent (or verbose) coders should not be penalized, counting the lines of code was introduced as a measure of potential program goodness. Lines of code has become a fairly popular basis for productivity measurement . (2) Like many other common mea- sures, such as IQ, the exact meaning of a line of code is not always clear. Because persons who administer and evaluate program- mers tend to be paid more than programmers, the prospect of having a manager count lines of code does not have economic appeal (not to mention the understandable reluctance of the manager to engage in such a mundane activity). Accordingly, a number of attempts have been made to use the computer to count the lines of code. Presumably the irony of using the programmer's tool to measure the programmers had some appeal. Those fami liar pi lers will be qui c compi ler in many ca of Data and Procedu This means that the however commendable The Remarks section Division was also s Those installations a source library ma with modern COBOL com- k to point out that the ses provides the number re Division statements. use of COPY statements, , skewed the results, of the Identification lighted, as were comments. fortunate enough to have nager such as PANVALET or LIBRARIAN could consider using the library software to count the number of statements. The problem with this approach is that a line containing a trivial amount of code (such as 'ELSE') would be equal to a line containing 50 or 60 characters. Two divergent schemes were developed to resolve this problem. The limited lines of code approach discards all lines that do not contain a certain minimum number of char- acters (with fifteen being a common choice). The second approach calculated a ratio between spaces and characters and termed this the program's density. It was asserted that program density correlated well with a programmer's efficiency. A fairly thor- ough search of the contemporary literature has failed to uncover any publications that vindicate this assertion. At any rate, several ingenious (if not wholly ethical) ideas were employed to counter the measure- ment programs. Blank cards, formerly used to offset paragraph names or comments, were changed to contain an asterisk in column 7- More extreme examples of this were the use of the literal SPACE or even asterisks in columns 7 through 71- An acquaintance of mine coded all elements of a structure in the general form of e-name, where name was the lowest level qualifier. Using library features he was able to change all es to a twenty character prefix. To add insult to injury, all the names were full qualified, Naturally, he sported an enviable efficiency rating and is now a manager. The next improvement on the lines of code approach counted both the identification and Environment Division statements in addi- tion to comments, referenced Data Division statements and Procedure Division verbs. The qualification of referenced Data Divi- sion statements was deemed necessary to prevent the inclusion of superfluous COPY statements (or their equivalent). This methodology was found to discourage many aspects of structured programming. Data- base programs were favored, while the results on Report Writer and indexing were mixed. Work on a compensation factor relating bytes of object code produced per verb was abandonned in the face of generally fruitless results. Scarred veterans of the programming wars will be certain to point out that all the counting of lines is no substitute for having the manager or chief programmer inspect the code. Worse yet, the counter is forced into an ongoing guerilla warfare with the countees (who, in their worst moments, have been known to take delight in 50 altering the counting program), Pinal ly, the lines of code counting fails to observe the most fundamental law fo programming -- any program that works ?s better than a program that does not. Lines of code counting does not assure that a program will work, as the (necessary) advent of structured tactics shows; it merely insures that a large number of lines of code will be produced, The most shattering blow of all comes from users, !t seems that they are, as a group, more aware of and concerned with reports, screens and so forth, than they are with how many lines of code were produced, |n short, lines of code counting does not distinguish between a product and a process, (3) 2. Operational ist Metaprogramming A recent publication indicates that any hopes for a high improvement in programmer productivity are likely i 1 1-founded. (4) If the traditional measures and concepts are retained, or even adjusted by error refine- ments (5), this gloomy conclusion is still likely to hold. One may, however, take some solace in recent thoughts by Gerald M, Weinberg (6) to the effect that unusual tasks or work in programming are quite rare, Accordingly, an operational ist approach involving metaprograms may be of some promise. Many metaprograms exist today. The most common example is the commercially available report writers. Here all a user (programmer or not) need do is to define the data elements and relationships, and report? may be gener- ated with relative ease. Operational 1st metaprograms exists today to extend this function to cathode ray tube screens and transaction editting. The user need only declare the class of screen element (.literal, database element, program variable) and define its position on the screen. The attendant MFS macros (in the case of IMS/DC) are then generated automatically. These are accompanied by appropriate Working Storage and Procedure Division statements. In the case of a transaction, the user need only define the input format and the target data- base elements. This process of metaprogram- ming does indeed add yet another language to the Babel (7) existing today. The reduction in monotony and increase in productivity seems to be adequate compensation. The programmer is then measures on his or her ability to design a database system. The production of file definitions and data elements dictionary entries becomes the critical measure. It can, of course, be abetted by the measurement of additions, deletions and changes. These measures are readily derived from the data elements dictionary. The routine audfts now performed by the database administrator to determine aspects of element access and usage can also be used to measure the goodness of the design and the thoroughness of the analysis. The change is perspective from lines of code and other measures which are used to monitor the clerical functions of computer programming to inspection of design in the milieu of operational jst metaprogramming requires some reorientation. It implies a higher usage of the hardware to reduce the non-creative tasks associated with infor*- mat ion processing. This has been a trend throughout the brief history of computing. Programmers will have the option of becoming either metaprogrammers (writing the routines that generate the code) or into designer- analysts. The former will still be aware of and concerned with the reasonable use of hardware, while the latter will gain more ability to a|d the organization in its imposition of computer-based Information processing and structures on the less desciplined areas of the business environ- ment, References (1) Comper, F.A,, Project Management for System Quality and Development Productivity Proceedings of the Application Development Symposium. (2) Evans, B., speech at the 10th Confer T ence on Computer Audit Control and Securi ty (3) Bradshaw, W.R., 'Application Develop- ment Productivity Within IBM Informa- tion Systems' Proceedings of the Application Development Symposium (A) Jones, T,C, 'The Limits of Programming Productivity' Proceedings of the Application Development Symposium (5) Albrecht, A, J., 'Measuring Application Development Productivity' Proceedings of the Application Development Sympos i urn (6) Weinberg, G.M>. 'The Psychology of Change in Development Methodology' Proceedings of the Applications Development Symposium (7) Ehrman, J.R., 'The Babel of Application Development Tools' Proceedings of the Application Development Symposium 51 Also of interest: Boehm, B.W. et. al , 'Quantitative Evalua- tion of Software Quality 1 Proceedings of the Second International Conference on Software Engineering Jones, T.C., 'Measuring Programming Quality and Productivity' IBM Systems Journel , Volume 17, Number 1, 1978 also in Proceedings of GUIDE *f5 Kendall, R.C., 'Management Perspectives on Programs, Programming and Productivity 1 Proceedings of GUIDE k$ 52 Comparative Performance of COBOL vs. PL/I Programs Paul J. Jalics Cleveland State University Abstract The comparative performance of COBOL versus PL/1 programs on the IBM/370 is studied on the basis of a substantive benchmark test which has been implemented in both languages. The benchmark program was written so as to use identical data-types and language facilities where they exist, and close approximations elsewhere. Measurement results from each of eleven atomic tests are then used to gain insight into the relative merits of code generated for each language in that particular area. A number of surprising results were found, which in turn necessitated the running of additional experiments to explain the results. These, then, all contribute to give substantial insights into the relative performance of COBOL versus PL/1 Programs. Introduction The comparative execution time perform- ance of COBOL versus PL/1 programs is exam- ined on the basis of a substantive benchmark test which has been implemented in both languages. The original benchmark was writ- ten in COBOL and is called BNC1. Subsequently, three PL/1 versions were created: BNC6 is the original PL/1 translation with COBOL nu- meric COMP-3 items being interpreted as PIC '9' items, with PERFORM' s being interpreted as PROCEDURE calls; BNC7 which interprets COBOL numeric COMP-3 items as FIXED DECIMAL and enumerates multiple occurrences of sec- tions of code PERFORMed in COBOL rather than use PROCEDURES; and finally BNC8 which is similar to BNC7 but uses PROCEDURE calls to implement PERFORM. BNC6 can be thought of as a conservative COBOL programmer's translation and BNC7 and BNC8 as more favorable transla- tions for PL/1. The benchmark breaks down into 11 sepa- rate tests each of which is timed separately using an assembler subroutine called XTIME which returns the CPU execution time in units of 10 milliseconds (0.01 sec=l time unit). There is a repetition factor for each of the 11 tests which is set so as to give the pro- portion of each category of tests to reflect actual business programming usage (as inter- preted by the benchmark's designer). We shall not be concerned with the weights and proportions given each test here, however, since we want to focus our attention on the relative performance of COBOL vs. PL/1 in each of the 11 test categories separately. Measured Data The following table summarizes the results for each of the benchmarks run on an IBM 370/158 with maximum Compiler optimiza- tion (using VS/COBOL release 2.2 and the PL/1 Optimizing Compiler Version I Release 3.0): 53 Test Number 1 2 3 4 5 6 7 8 9 10 11 12 Test Description Internal Subroutine Call GO TO, GO TO Computed COMP-3 or decimal arith Binary or COMP arithmetic Character MOVE's Compare's Table Indexing Numeric Editing of Decimal Numeric editing of binary INSPECT, char search & repl; Table Search TOTAT BNC1 (COBOL) BNC6(PL/1) BNC7(PL/1) BNC8(PL/1) CPU time CPU time CPU time CPU time 12 143 138 146 3 3 2 3 42 160 107 114 9 17 28 29 120 86 82 80 36 116 141 77 1 2 1 1 17 46 32 35 22 39 44 45 ice 78 14 15 14 47 32 29 33 387 658 619 577 A detailed description of each of the tests and the results obtained for each now follows: Test 1: Internal Subroutine Call This test was originally designed to test the speed of execution of the internal sub- routine facility in COBOL, namely the PERFORM statement. A sequence of nested PERFORM' s is executed; the nesting is three levels deep and the first two levels contain TIMES options for looping. In PL/1 the simplest way to implement this is via PROCEDURE calls without parameters. The looping in the first two levels is done with DO loops. This test shows the most dramatic differ- ences between COBOL and PL/1 with COBOL being 12 times faster than PL/1. The PERFORM in COBOL is implemented very efficiently whereas a PROCEDURE call is by its nature more complex with dynamically allocated local storage, etc. One can gain some insight into the factor 12 difference by looking at the table below which outlines typical overhead of the vari- ous subroutine mechanisms (as they are used in this benchmark) : 54 COBOL PL/1 simple PERFORM 6 instructions to get there 2 instructions to return PERFORM UNTIL (with simple condition) 6 instructions to get there +11 instructions on each looping simple CALL 23 instructions to get there (+possible library call to allocate more memory space) 5 instructions to return DO 1=1 TO N BY 1; 11 instructions to get there 12 instructions on each looping PERFORM TIMES 11 instructions to get there 11 instructions on each looping DO 1=1 TO N BY 1; CALL SUB; END; 34 instructions to setup & get there (+possible library call to allocate more memory space) 17 instructions on each looping Test 2: GO TO, Computed GO TO This test has a large number of GO TO statements and one computed GO TO per execu- tion of the test (test is repeated 35 times). A glance at the code generated and the numer- ic result indicates that both COBOL and PL/1 perform quite well in this area and the code generated is very similar. Test 3: Decimal Arithmetic This test consists of arithmetic (add, subtract, multiply, divide) to data-fields of various sizes with overflow detection and rounding on 27% of operations, overflow detection on 9% of the operations. This test is implemented via COMP-3 packed deci- mal data-fields in BNC1. IN BNC6, it was implemented via PICTURE '9' variables and in BNC7 and BNC8 it was implemented with FIXED DECIMAL. Thus the results of BNC1, BNC7, and BNC8 are easily comparable but BNC6 is not since it uses another data-type. The results of this fairly important test were puzzling and a good number of additional experiments had to be devised to adequately explain the results. Since BNC1, BNC7 and BNC8 all use the same data-types, it was incredible to find that BNC1 was 2.7 times faster than the PL/1 versions. Looking at the code itself did not give great insight into the performance differences: BNC1 was slightly smaller with 119 instructions total vs. 146 for PL/1. The COBOL code looked like quite a different style from PL/1 for identical operations but the types of instructions were similar. In fact, the lengths of most of the temporary data-fields involved in the computations for PL/1 were often smaller than for COBOL. Rounding seemed to be handled more effi- ciently in COBOL (4 instructions vs. 7 in PL/1). The use of REMAINDER was more effi- cient in COBOL whereas it had to be computed with additional arithmetic in PL/1. State- ments relating to ON SIZE, ROUNDING, and REMAINDER were temporarily deleted from the test to see if that would explain the huge performance factor of 2.7. Once this was done, the number instructions in COBOL was reduced to 84 and in PL/1 to 85 instructions. But the CPU time did not change appreciably. Finally, the culprit was found to be the PL/1 error interrupt handler. PL/1 always runs with all hardware overflow interrupts enabled. If a certain abnormal condition occurs, a machine interrupt takes place and the PL/1 error handler is given control. Then if that particular condition is enabled by the PL/1 programmer, the appropriate ON statement (s) are executed. If the condition is not enabled, then the error handler still gets invoked but he quickly decides to ignore the condition. Thus, when the ON SIZE statements were removed as described in the paragraph above, the overflows still occurred and the exces- sive performance factor of 2.7 did not improve. Finally, the statements causing overflows were removed (10% of arithmetic statements) the CPU times for both COBOL and PL/1 settled down to about 41 time units. The main conclusion from the above is that PL/1 error interrupts are quite time consuming with a minimum of about 500 micro- seconds of CPU time per such interrupt. Moreover, it doesn't really matter whether the interrupt is to be ignored by the program or acted upon; the overhead has to be paid regardless. 55 The big question that remains is: Why wasn't this same overhead observed in COBOL. COBOL, in general, tends to be more lax about overflow and other arithmetic disasters: they are ignored in the usual case. Thus COBOL usually runs with the hardware condi- tions disabled. In looking at the code gen- erated for the ON SIZE statements, they were detected by doing the arithmetic with a larger number of digits in temporary storage and then additional instructions were gen- erated to detect non-zero values in the upper digits. This contrasts sharply with PL/1 where one does not see any additional code to detect overflow: it is all done by the hardware (COBOL generates an additional 11 instructions to detect the overflow in a typical case) . An additional experiment was carried out to be able to compare the test 3 results for BNC6 (uses PICTURE '9') to BNC1 results. BNC1 was temporarily changed to use USAGE IS DISPLAY which would then be comparable directly to the PICTURE '9' data types of BCN6. Results of this modified BNC1 showed a CPU time of 39 units, which is slightly less than the 42 gotten originally for BNC1 using COMP-3 data-type. This slight decrease was quite surprising when a substantial increase was expected since DISPLAY data types are first PACKED into COMP-3 (packed decimal) then the arithmetic is done, and finally the results are UNPACKed into DISPLAY format. Thus it is surprising that with all the extra work to be done, the CPU time is smaller than for COMP-3 where the arithmetic can be done directly. Previous measurements have indicated that COMP-3 arithmetic is twice as fast as DISPLAYfl]. Looking at the generated code, it became clear that the extra PACK and UNPACK opera- tions were also used to move the data to the temporary workspace and back (this was done with costly ZAP's in the original BNC1) . The comparison with BNC6 still didn't work out well because of the error interrupt prob- lem noted earlier. In summary, the decimal test gave a num- ber of interesting insights into COBOL vs. PL/1 differences. The code generated by PL/1 appears to be just as good as COBOL in most cases (exception: ROUNDING, REMAINDER) but the style of code generation is quite different (for example: a simple add A=B+C generates the sequence of instructions ZAP, AP, MVO, ZAP in COBOL and MVC, AP, MVO, MVN in PL/1) . The most dramatic discovery is the very high cost of program interrupts in PL/1 regardless of whether the interrupt is to be ignored or acted upon. This, then, is in sharp contrast to COBOL error handling which relies more on additional erated to detect the error using ON SIZE in COBOL can cally whereas in PL/1 it i critical sections of code, tion, then, was that COMP- ables performed equally we is normally expected to be DISPLAY. instructions gen- condition. Thus be done economi- s to be avoided in A final observa- 3 and DISPLAY vari- 11 whereas COMP-3 twice as fast as Test 4: Binary or COMP Arithmetic This test is identical to test 3 which we have just examined except that COMP data- type is used in COBOL and FIXED DECIMAL for BNC6 and FIXED BINARY for BNC7 and BNC8. The CPU times are also not directly compar- able with test 3 since the number of repeti- tions of the test is smaller than for test 3 (650 repetitions for test 3 and only 100 repetitions for test 4). The same curious effect showed up here as was observed for the decimal test (test 3). The PL/1 CPU times are 3.2 times as much as the COBOL (ignoring BNC6 which is not a comparable data-type) . The number of machine instructions generated is very similar for both COBOL and PL/1. Again the culprit is the error interrupt handler which was invoked a number of times (some are caused by the implementation limit of 32 bit in PL/1 vs. 64 bit limit in COBOL where the benchmark originated) . By eliminating the statements causing overflow the CPU times were reduced in COBOL to 6 time units and in PL/1 to 3 time units. So PL/1 is actually significantly faster than COBOL for the binary fields of the mix in this benchmark. The reasons for this include the tendency of COBOL to use more library routines for partial word multi- plications and division; also more of the PL/1 code is done in binary whereas the COBOL code frequently converts operands to packed deci- mal, presumably to get proper decimal trunca- tion (since we specify COBOL COMP fields in Hecamal digit ranges vs. bits in PL/1). We can compare CPU times with test 3 by multiplying the CPU times for test 4 by 6.5. Thus in COBOL the 9 unit result multiplied by 6.5 becomes 58.5 which is about 50% higher than for COMP-3 fields in test 3. This is somewhat surprising since COMP is commonly twice as fast as COMP-3. The explanation for this is that half of the data-fields are over 10 digits where COMP fields are by far the most inefficient of all. Test 5: MOVE Test This test consists of MOVE's of character fields which vary in size from 1 to 1024 56 bytes. The sending and receiving fields always have the same length. This is one of the few tests where PL/1 clearly outshines COBOL, rather surprisingly since COBOL is supposed to be specializing in fixed length character manipulations. One big reason why PL/1 looks better here than it might in real-life situations is that the MOVE's are all of successive fields of one record to successive fields in another record and PL/1 is smart enough to recognize this and instead of generating an MVC for every MOVE, it generates one large MVC for up to 12 individual MOVE's. This test was repeated with PICTURE 'X' variables in PL/1 as well as CHARACTER vari- ables and the performance results are iden- tical. In summary, although PL/1 looks better on this test, the code generated and the performance of MOVE operations on character variables is likely to be identical for both COBOL and PL/1. Test 6: Compare Test This test does a sequence of compares of various data-fields of numeric as well as character type and of various sizes. The results show COBOL to be at least twice as fast as any of the PL/l's. The great vari- ation among the PL/1 versions deals with the specific data-types of the variables which are different for BNC6, BNC7, and BNC8 in a number of cases. COBOL gets some advantages from its additional language facilities like testing NUMERIC and ALPHABETIC which generate Translate and Test machine instructions rather than the much more overhead of testing it manually in PL/1 (involving multiple tests). The code gen- erated for 88 level condition-names is very convenient for the programmer but the code generated does not seem any more efficient than the corresponding manually constructed sequence in COBOL or PL/1. Most of the IF statements generate a similar number of machine instructions in both COBOL and PL/1 and the factor of 2 per- formance appears to be due strictly to the advantages gained in COBOL class tests for NUMERIC and ALPHABETIC. Test 7: Indexing Test This test was intended to measure the speed of execution of USAGE is INDEX data- fields in COBOL. INDEX is a rather unique feature of COBOL in which the COBOL programmer SET's the "subscript" which he will later use to access an element of a table. The actual value of the index is the byte offset which points to the chosen element. As a result each manipulation of the INDEX variable reciuires a multiplication bv the element size. Thus the COBOL code is twice as large as the PL/1 which has no eauivalent language feature but instead does analogues operations on "subscripts" which have a data-type identical to INDEX variables. One would expect the CPU time for COBOL to be twice as much as for PL/1. This is not observed, instead the results are about the same and fluctuate quite a bit for separate runs, probably indicating that the total time is so little in relation to our time unit that we are not getting sufficient accuracy. Test 8: Editing of Decimal Numerica Data This test involves taking the numeric results calculated in test 3 and moving them to edited fields so that they can be printed in a "pretty" format such as might be expected in a business report. Both COBOL and PL/1 accomplish this editing with editing PICTURE'S like PIC Z, $, B etc. Here again, BNC1 can be compared with BNC7 and BNC8 but not with BNC6 which uses PICTURE '9' for the numeric fields rather than COMP-3 (or FIXED DECIMAL for BNC7, BNC8) . The code generated looks about the same for many of the edit moves and in fact COBOL often generates a few more instructions than PL/1 where they are not identical (example: COBOL 9, PL/1 6). In spite of this, the CPU time for COBOL is typically only a half of PL/1 execution time. This can be attributed directly to the fact that PL/1 calls library routines in 3 out of the 17 edit moves where- as COBOL generates in-line code for all of them. When the three statements generating library calls were removed, CPU time went down drastically from 39 units to 9 units which would make the remaining somewhat faster than COBOL. In summary, the COBOL editing of deci- mal numeric data is about the same or slightly worse than corresponding PL/1 edit- ing for the cases where PL/1 generates in- line code. For the remainder, PL/1 performs much worse. Thus PL/1 code generation in this area is less polished. Test 9: Editing of Binary Numeric Data This test does editing identical to test 57 8 above except that the source field data- items are binary results calculated from test 4. Results here are somewhat analogous to test 8 above: COBOL editing is twice as fast as PL/1. The size of the code generated is about the same but both COBOL and PL/1 each call 3 library routines. Two out of the three library calls are for different move edit source statements in COBOL vs. PL/1. It is again clear that the factor of 2 in per- formance for PL/1 comes from the library calls because when they are removed, CPU time drops from 45 to 17 units. Test 10: INSPECT verb; on COBOL indicated that the threshold at which SEARCH ALL becomes more efficient is with tables of at least 50 elements, this should not prejudice the results substan- tially. The results indicate that PL/1 code is about 50% faster than the COBOL code. The PL/1 code consists of 102 instructions total versus 142 for COBOL. In addition, COBOL generates a library call for SEARCH ALL whereas all of the PL/1 code is in-line. Certainly the results would not be so favor- able to PL/1 with a larger table but then one could without much difficulty write search loops that implement a binary search (SEARCH ALL). Character Search and Replacement This test consists of scanning a char- acter field of 40 characters in five differ- ent INSPECT statements some of which scan in testing for certain characters, others count the number of occurrences of certain char- acters, while others do selective replace- ment of characters. Since PL/1 has no facilities like INSPECT, loops are setup to perform identical manipulations on identical size and type data-fields. Somewhat surprisingly, COBOL is 4 to 5 times as slow as the equivalent PL/1 code. Looking at the code will very quickly explain why this is so : COBOL calls a library rou- tine to implement each of the INSPECTS. The problem surely comes from the fact that INSPECT is such a general statement with so many possible combinations of functions that the library routine which implements it must be very generalized and parameterized and therefore probably not the most efficient. The parameter list for this library call is so complicated that it takes more instruc- tions in COBOL to prepare the parameter list (110 instructions total) than to perform all of the actions necessary in PL/1 (90 instruc- tions total). This difference can easily account for the factor of 4 to 5 difference in performance. Test 11: Table Searching This test searches a table with 30 ele- ments in six different ways, three times using a linear SEARCH, the other 3 being a binary SEARCH ALL. Since neither the SEARCH nor the SEARCH ALL is provided in the PL/1 language, manual code was written to perform identical functions. In the PL/1 code each of the 6 table searches is done via linear searching and since earlier measurements [1] Conclusions Certainly one cannot expect to get a complete overview of the execution efficiency of two extensive programming languages like COBOL and PL/1 from any one benchmark. The benchmark used is impressively good, however, and serves to give substantial insights into the performance aspects of COBOL and PL/1 pro- grams on the IBM/370 (excluding input/output statement performance) . The following points summarize the observations: 1. Except for the overhead of the PL/1 PROCEDURE call which is substantial, there appears to be no basic grounds why PL/1 programs have to be any less efficient in execution than corresponding COBOL programs. 2. PL/1 performance depends a great deal on the choices made by the programmer in selecting language features, data-types, etc. Thus considerable care needs to be taken if one is to achieve a performance level as good as COBOL. Without such care, it is very easy to get a PL/1 program which runs two to three times as slow as a simi- lar COBOL program. 3. PL/1 code generation is not as good as COBOL in a few areas like editing of numeric data- items and partial-word binary arithmetic among others. The code size is, in general, com- parable but more use is made of library calls than was observed for corresponding COBOL programs. 58 4. PL/1 error handling has a tremendously higher overhead than COBOL in the cases observed, simply because PL/1 makes more use of hardware interrupts to signal such conditions whereas COBOL generates in-line code to identify such situations. Also, it is dangerous thing to specify that error interrupts are to be ignored since the programmer will never know that such errors occur on a regular basis (they should be elimi- nated instead by additional checking) and serve to degrade performance a great deal. References [1] Jalics, P. J., "Benchmarks for Measuring COBOL Implementation Performance", Data Management, Vol. 18, No. 4, April 1980. [2] Jalics, P. J., "Improving Performance the Easy Way", DATAMATION, April 1977. [3] Jalics, P. J., "Comparative Performance of COBOL Programs on Mini vs. Large Computer Systems", Proceedings of the First Annual Symposium on Small Systems ; New York, N. Y. , August 1978 (ACM/ SIGSMALL) . 59 Data Communications 61 NBS Network Measurement Methodology Applied to Synchronous Communications Marshall D. Abrams Institute for Computer Sciences and Technology National Bureau of Standards Washington, D. C. 20234 Dorothy C. Neiman Commtex, Inc. 2411 Crofton Lane Crofton, Maryland 21114 This paper focuses on the application of the NBS Network Measurement Instrument (NMI) to synchronous data communication. The suitability of the underlying Stimulus - Acknowledgement - Response (SAR) model to support the implementation of this methodology permitting quantitative evaluation of interactive teleprocessing service delivered to the user is described. The logic necessary to interpret SAR components and boundaries depends on character time sequence for asynchronous data communications traffic but entails protocol decomposition and content analysis for character synchronous data traffic. The decomposition and analysis rules necessary to evaluate synchronous communications are discussed and the level of protocol violation detection which results as a byproduct is cited. Extensions to the utility of the Network Measurement Instrument (NMI), deriving from additional workload profiling measures desirable for character synchronous communications, are also presented . Key Words: Data communications; protocol validation; synchronous; teleprocessing service evaluation . 1.0 INTRODUCTION In 1979 th Standards co the incorpor Measurement into a porta Instrument Commtex 's extend ing performance evaluation synchronous protocols. e National Eureau of ntracted with Commtex for ation of the NBS Network System [1] methodology ble Network Measurement (NMI) [3] . One part of implementation entailed the NBS concept of measurement and to deal with character data communication This paper focuses on the comparison of data gathering for synchronous and asynchronous data communication and the suitability of the underlying Stimulus - Acknowledgement - Response model to both methods. 2.0 THE STIMULUS ACKNOWLEDGEMENT RESPONSE MCDEL 63 The NBS approach to the measurement of interactive computing focuses on the qualitative and quantitative aspects of the service provided to the terminal user. The assessment of service delivered by such an interactive teleprocessing system is accomplished through the Stimulus - Acknowledgement - Response (SAR) model [2] which views user-network communications as a sequence of transactions composed of a stimulus, and the consequential acknowledgement and/ or " response from the network/ system. Recognition of SAR components from data traffic transmissions requires specific interpretation logic for each transmission mode and protocol to be analyzed . In order to produce measurement results consistent with subjective human judgment of system performance, it is necessary to differentiate operational feedback (acknowledgement) from information transfer (response) within network output to the user. 2.1 Asynchronous Communications Characteristics In asynchronous communications the acknowledgement is defined as a fixed header, a known string of characters which occurs at the beginning of data transmission from the network, and all non-printing characters which immediately precede or follow the header. In the absence of a fixed header the acknowledgement consists only of the non-printing characters at the beginning of network output. 3.0 CHARACTERISTICS UNIQUE TO SYNCHRONOUS COMMUNICATION Unlike asynchronous communications where the unit of exchange between user and network is a character, synchronous communications protocols assemble multiple characters into transmission blocks which are transmitted at the maximum network transmission rate. Only when acknowledgement- response boundaries are coincident with transmission unit boundaries is the psychological function of acknowledgement fulfilled. For synchronous communications, acknowledgement is therefore defined as all transmission units containing a fixed header which occur at the beginning of data transmission from the network. A synchronous transmission unit can be identified as acknowledgement not only as a function of content but also as an occurrence in time. In IBM 's binary synchronous communications for example, the "limited conversational mode" allows for the substitution of a text message for the link level acknowledgement to user stimulus. This message which often contains format control data is accepted as an acknowledgement . Although the NMI monitors all communications on the circuit under test, the link control sequences which constitute overhead to the user are not defined as SAR components. An argument can be made for the differentiation of information content and link level control sequences within information transmission blocks; however, accurate information identification would require recognition of the various levels of application dependent protocol. Since the NMI generally strives to be application independent, this type of analysis is not a standard measurement feature and must be accomplished uniquely for given applications. 3.1 Measurement of SAR Model Parameters Interactive communications are described by the SAR model in terms of the number of characters, transmission time, and delay time associated with each SAR component. Stimulus delay time, for example, equivalent to user think time, is directly measurable by the NMI in asynchronous communications as illustrated in Figure 1. Response time, defined as the elapsed time from the input of the last character of the stimulus until the receipt of the first character of response, is simply derived from the measured variables as follows: RESPONSE TIME = ACKNOWLEDGEMENT DELAY TIME + ACKNOWLEDGEMENT TRANSMISSION TIME + RESPONSE DELAY TIME 64 riME Message Group 1 I I ts 1 I [ *da | *ta | ■* H-" — •- I ■*■ Mr u tr ' *ds . *ts •• «• *•' i_i # characters # characters § characters t. = stimulus transmission time t . = acknowledgement delay time t.. = acknowledgement transmission time ta t. = response delay time t t = response transmission time tj = stimulus delay time Figure 1. Asynchronous SAR Model Although the SAR model is equally applicable to synchronous communications as to asynchronous communications, measurement of the model parameters is complicated by the local processing and buffering characteristic of synchronous terminals. Because input usually is not buffered at an asynchronous terminal, what is measured as stimulus transmission time corresponds to operator typing time. As illustrated below, several of the important measures must be estimated, rather than measured, when dealing with synchronous communication. Figure 2 illustrates the SAR mapping to synchronous communications. Input is traditionally buffered in synchronous communications, and its presence is not detectable at the circuit interface prior to transmission. As a result the NMI is unable to measure operator typing time and stimulus waiting time. The stimulus only becomes available for transmission when the operator presses the "enter" key; it is not actually transmitted until the terminal is next polled. 65 User Network :Poll: :E0T> \ Last Input Key Communications With Other Devices , ^1 I Communications With *-^*/ Other Devices J actual stimulus waiting time measured stimulus waiting time ws t * ws t ** = estimated stimulus waiting time t ( j a = actual acknowledgement delay time *da* = measure d acknowledgement delay time u da "ts delay time = measured stimulus transmission time Figure 3. Estimated Acknowledgement Delay Time STIMULUS DELAY TIME (estimated) = STIMULUS DELAY TIME (measured) - STIMULUS WAITING TIME (estimated) 3.2 ARQ Effects A major difference between asynchronous and synchronous data transmission is the general presence of an error detection and correction system in the latter class. The most common such system is the ARQ (automatic request for retransmission) mechanism which is invoked subject to the results of a cyclic redundancy check made for each data transmission block. Proper measurement of service delivered to a user terminal requires reconciliation of the resulting retransmission sequences . Since information transfer is a function of successful user-network communications, rather than unsuccessful transmission attempts, only correctly received transmission units contribute to SAR character count variables. Stimulus delay time ends and stimulus transmission time begins with the first transmission attempt of user input. Stimulus transmission time, as measured by the NMI, is an indication of the time required to successfully transmit the entire stimulus and therefore includes the transmission of unsuccessful transmitted units. The beginning of acknowledgement and response, from the user's viewpoint, is equivalent to the first successfully received transmission unit defined as such from the network. As illustrated in Figure 5, unsuccessful transmissions at the 67 User Network :E0T> 7F Last Input Key t ** /Kt W ws w_ _ _i_ _ m tS J/_ _ L da : Polls "0 1 Acknowledgement 1> ~ ~ ~T cACK> I t = response time t . ** = estimated acknowledgement delay time t. = measured acknowledgement transmission time (j r = measured response delay time Figure 4. Response Time beginning of acknowledgement and response contribute to acknowledgement and response delay time; retransmissions of intermediate transmission units contribute to acknowledgement and response transmission times. 3.3 NMI Implementation Characteristics The NMI SAR interpretation logic is implemented as a finite state machine [3]. The state table which defines this machine is a function of the characteristics of the communications protocol to be measured. For synchronous transmission procedures, the inputs to the SAR analysis are data acquired from the circuit under test which have been separated and grouped by user terminal device address. Each link level transmission must be time stamped and defined by type (e.g. f poll, information transmission, etc.). Content analysis is performed and acknowledgement boundaries are designated. Data resulting from this preprocessing is input to the SAR statistical analysis. 3.4 Protocol Violation Detection Synchronous transmis include a substan overhead data beyond content. The NMI identifies synchrono sequences by type is as a finite state protocol decoder observes data link t sion sequ tial amoun the inform logic us transmi also implem machine . dogmati ransmission ences t of ation which ssion ented This cally s in 68 User Network :Poll- ^ u da ta ~L :ACK> :E0T> 5:} Communications With Other Devices cACK> < Acknowledgement 1> \ V Acknowledgement 1> l dr \ r 1 — ~- HI :ACK> "Til Communications With Other Devices :Poll; I Figure 5. SAR Model Definition for Retransmission Sequences terms of a given set of rules governing interaction; it will necessarily reject non-compliant transmission sequences. A level of protocol violation detection is therefore a natural byproduct of the design. 4.0 NM1 APPLICATION EXTENSIONS Syntatic protocol checking is insufficient when applications dependent information is needed. Semantic analysis is also required. It is increasingly common, in the data communications environment, for a terminal to access dynamically a diverse mix of applications via the same network connection. This diversity may result from multiple applications supported by a single host processing facility to which the user is connected, or perhaps because the terminal is connected to a resource sharing network. Regardless of the manner by which multiple services are available to the terminal, the user is often concerned with the discrete performance level of individual applications. Measuring the responsiveness and other characteristics of separate categories of system access involves additional content analysis beyond that precipitated by the need to distinguish SAR components from a specific protocol. To the extent that there is adequate identification data contained within the transmission sequences, this further categorization of measurement results can be achieved . 69 Identification of categories of system utilization was implemented in the* Network Measurement System for asynchronous use of a Univac 1108. Twenty-seven operating system commands were identified by state table syntactic analysis [4], Examples of such expanded workload profiling capabilities include: From the service consumer 's viewpoint *Relative responsiveness of one system or application versus another From the service provider's viewpoint *Relative frequency of usage for various categories of system services *Profiles characterizing the typical use patterns of each system service From either point of view *Terminal device productivity comparisons 5.0 BIBLIOGRAPHY [1] [2] [3] [4] Abrams, M. D. , I. W. Cotton, S. W. Watkins, R. Rosenthal, and D. Rippy, "The NBS Network Measurement System," IEEE Transactions on Communications, October 1977, pp. 1189-1198. Watkins, S. W. and M. D. Abrams, Interpretation of Data in the Network Measurement System, NBS Technical Note 897, Feb. 1976. Abrams, M. D. and D. C. Neiman, "The NBS Network Measurement Instrument," CPEUG Proc. of the 15th Meeting , pp. 201-211, October 1979, NBS Special Publication 500-52. Measurement of Computing Cotton, I. W., Interactive Methodology and Application, NBS Special Publication 500-48, June 1979. Add itional reporting implemented capabilitie Measurement balancing and choo distributed deployed w examples of will benefi prof il ing . data software to s within Instrum among term sing 1 proces ithin the analytica t from add reduction are now provide the Ne ent. Wor inal popula ocations sing func network 1 projects itional wor and being these twork kload tions for tions are which kload 70 Introduction to Data Communications System Performance Parameters Dana S. Grubb Institute for Computer Sciences and Technology National Bureau of Standards Washington, D. C. 20234 This paper is an introduction to a set of user-oriented data com- munication system performance parameters that will permit the user to specify, compare, and measure data communication service. The set of parameters is designed to be universal in application for any digital data communication system regardless of the control protocol or network topology used. This set of parameters is also selected to provide a comprehensive specification of data communication requirements. The parameters are the subject of a proposed ANSI standard. The parameters are based on a similar set of parameters contained in Interim Federal Standard 1033. The primary parameters are specific measures of speed (delay and rate), accuracy, and reliability associated with the three primary functions: access, transfer, and disengagement. Key words: Computer communications; computer networking; data communications; networks; performance requirements; telecommunications. 1. Introduction Data communication users need to be able to specify their data communications requirements in terms that are useful to them. Their requirements are for the transfer of information between users at a specified delay, rate, accuracy, and reli- ability. The user needs to be able to specify the requirements, compare available data communication services, and measure the service delivered. With appropriate user- oriented parameters, the user can avoid costly over-design by properly specifying exactly what is needed; reduce costs by comparing various data communication services; and determine how well the selected data communication service actually satisfies the requirements. This paper is an introduction to a set of user-oriented data communication system performance parameters that will permit the user to specify, compare, and measure data communication service. The set of parameters is intended to be applied universally to any digital data communication system regardless of the control protocol or network topology used . This set of parameters is also selected to provide a comprehensive specification of data communication requirements. Individual users may choose to omit parameters that they believe are not applicable to their needs or they may specify nominal values for those parameters. This paper is a contribution of the National Bureau of Standards and is not subject to copyright. 71 The parameters are the subject of a proposed ANSI (American National Standard Institute) standard. The parameters are re- lated to a similar set of parameters con- tained in Interim Federal Standard 1033 adopted by the General Services Administra- tion (GSA) in May 1979. 2. User /System Interface User requirements apply to data commu- nication service as seen by the end user. There ate normally two end users, each with its own interface to the data communication system. (An example of having more than two end users is the case of a message broadcast to all users on the system.) The end user may be a human operator at a terminal, a device medium (for example, punched cards in a remote job entry terminal), or an application program in a computer. From an end user viewpoint, the data communi- cation service requirement is the transfer of information between end users with specified delay, flow rate, accuracy, and reliability. This interface between end users and the system is termed the user/system interface. Typical interactions at the user/system interface are an operator keystroke on a terminal keyboard; the punching of a paper tape; or calling an operating system by an application program. The user/system interface contrasts with traditional DTE (Data Terminal Equipment) /DCE (Data Circuit-terminating Equipment) inter- face, which is between the terminal or com- puter port and a data modem or other circuit- terminating device. The DTE/DCE interface has been the point of interconnection between computer equipment and equipment provided by data communication service organizations. 3. User Information User information consists of all digi- tal information that is intended to cross both of the user/system interfaces. The user information bits are those bits used for the binary representation of the user information transferred from the source user to the system for ultimate delivery to a destination user. When user information is input as non-binary symbols (for example, operator keyboard entries) the user informa- tion bits are the bits used to initially encode these symbols. User information does not include over- head information. For example, parity bits and start-stop bits are overhead information used by the system due to its internal needs. The distinction is that only information crossing both user/system interfaces is user information. 4. Functions The three primary functions common to any data communication process between two end users are: access, user information transfer, and disengagement. The access function includes those activities that the originating user and the system must accomplish for the system to accept information for transfer to a destination user. Access starts with an "access request" and ends when the first bit or block of source user information has been transferred to the system. The term block denotes contiguous information bits grouped at the source user/system interface for transfer to a destination user as a unit. A block may be a single ASCII character, a card image, a computer word, or the informa- tion field of a frame, depending on the equipment and protocol characteristics associated with the user/system interface. The user information transfer function includes those activities that the system and the user must accomplish to transfer user information from the source user to the destination user. The function begins when the first bit or block of user information is input to the system and ends when the last "disengagement request" is issued. There are normally two disengagement functions, one for each of the two end users. Each function begins with the issuance of a "disengagement request" and ends on the issuance of a subsequent "disengagement con- firmation" or when that user is free to begin a new access. 5. Parameters The parameters are divided into two categories: primary parameters and ancillary parameters. The primary parameters are specific measures of speed (delay or rate) , accuracy, and reliability associated with the three primary functions; access, transfer, and disengagement. The primary parameters are shown on Table 1. There are also ancillary parameters that qualify the primary speed parameters by describing the extent to which the primary speed parameter values are influenced by user delays. The primary parameters are developed in two stens: (1) a set of possible outcomes 72 are defined for each function; and (2) appropriate probability, delay, and rate parameters are defined relative to each possible outcome. The outcomes are the possible end re- sults that may occur on any given attempt to perform. Using the analogy of the famil- iar voice telephone call on the dial tele- phone network, the access function could end in a correct connection, a wrong number, exchange busy, called party busy, or no connection. The bit and block oriented measures of speed are developed using the average of many consecutive trials. A quantity of bits called a Sample is defined by the standard to serve as a minimum number of bit trans- fer outcomes for meaningful measurement of parameter values. 5.1 Delay Parameters There are four delay parameters. Average Access Time and Average Disengage- ment Time are the average elapsed time for the successful outcome of the access and disengagement functions. Average Bit Transfer Time and Average Block Transfer Time are the average value of elapsed time between start of a bit or block transfer attempt and successful transfer of that bit or block to the destination user. 5.2 Rate Parameter User Information Bit Transfer Rate is the total number of user information bits successfully transferred to and accepted by the destination user divided by the time that is required for their transfer and accep- tance. In the case of message oriented systems (e.g., packet networks), this time is observed at both the source user/system and system/destination user interfaces, and whichever time is longer is used in deter- mining the transfer rate. 5.3 Accuracy Parameters Incorrect Access Probability is the ratio of total access attempts that result in incorrect access to total access attempts, exclusive of failures attributable to the user. Bit or Block Error Probability is the ratio of total incorrect bits or blocks to the total successfully delivered bits or blocks. A block is in error if one or more bits in the block are in error; or when some, but not all, of the bits in a block are lost or extra bits. Bit or Block Misdelivery Probability is the ratio of total misdelivered bits or blocks to the total transferred. A misdeliv- ered block is identified by the occurrence of the misdelivery outcome for any bit transfer outcome in a block. (Due to measurement difficulties, many users will choose to omit this parameter.) Extra Bit or Block Probability is the ratio of total extra bits or blocks to total received bits or blocks. An extra block is identified by the occurrence of the extra bit outcome for all bit transfer outcomes in the block. 5.4 Reliability Parameters Bit and Block Loss are considered to be reliability parameters. Bit or Block Loss Probability is the ratio of total lost bits or blocks to the total transmitted bits or blocks. A lost block is identified by the occurrence of lost bit outcomes for all bit transfer outcomes in a block. Another important aspect of reliability is system availability. The user may be denied service due to a system outage (e.g., a computer failure or communication line failure) or due to system blockage from an excess of communications traffic. It is quite helpful to the user to know whether a service denial is due to system outage or blockage. In the case of an outage, the user calls the service personnel. In the case of a blockage due to overload, the user waits and then tries again for service. Access Outage Probability is distinguished from Access Denial Probability by the failure of the system to issue any active interface signals. 6. Proposed Standard The ANSI work is being done by Task Group 5 of the ANSI X3S3 Committee and is contained in the working document X3S35/125. The X3S35/125 document differs significantly from 1033. It is expected that the proposed standard (if adopted by ANSI) will also be adopted as a joint Federal Information Processing Standard and Federal Telecommuni- cations Standard, replacing the current interim (and voluntary) standard. The current schedule calls for the proposed standard to be completed and sent to ANSI X3S3 in December 1980. Assuming this date is met, the time required for re- view, resolution of comments, and approval make early 1982 a reasonable estimate for approval by ANSI X3S3. Some additional 73 time will be required for approval as a joint federal standard. The standard that is currently under development identifies parameters that de- scribe system performance, but it does not define how values will be determined for these parameters. A subsequent ANSI standard is planned for specifying the measurement methodology. Current federal work includes support for the ANSI committee and a joint NBS (National Bureau of Standards) /NTIA (Nation- al Telecommunications and Information Administration) project on measurement methodology for the parameters. 7. Recent History There are currently two ANSI standards on data communication performance: X3.44 - 1974 (soon to be updated) and X3.79 - 1979. These standards provide parameters for specifying the flow rate, delay, and accuracy of transferred user information and availability of the service. These measures are user oriented; they specify measures of performance with regard to user information. However, they are not indepen- dent of system control protocol and are not always measured at the user/system inter- face. 8. Conclusions The parameters discussed offer the data communication user an opportunity to specify requirements in user oriented, system independent terms. The proposed parameters may be used to specify and compare data communication services, and to measure the service received. References (1) Interim Federal Standard 1033 , General Services Administration, May 1, 1979. (4) Determination of Performance of Data Communication Systems that Use Bit- Oriented Communication Control Proce- dures , American National Standard X3.79 - 1980. (5) Criteria for the Performance Evaluation of Data Communication Services for Computer Networks, N BS Technical Note 882, September 1975. (2) Digital Communication Performance Parameters for Proposed Federal Stan- dard 1033 , Volumes 1 and 2, NTIA- Report 78-4, U. S. Department of Commerce, May 1978. (3) Determination of Performance of Data Communication Systems , American National Standard X3.44 - 1974. 74 «/> I* < < ik ik RFORMANCE TIME LLOCATIONS z £ o * Z £ o * o z « o £ Z LU o s p ►* P N- p 1- P it! 2 * at ^ w J* 1 < Uj u O < < SERFR FACC at «- Ik vt * 5 So at £ uj * < V» U. £ LU < *** SJ» * «/» 12 * 3 O 3 O " 3 " 3 Q f= < at < a. >- at < Ul >- S < < z Ul o ovine Aini >- 3 >- 3 O ± >- >- 3 2 ■ ^. S i/t 2 -1 00 ^ < CO 3 < ** < o 2 * < 3 Ul Ul SO ui CD u to 2 at at u o X o t o O o < - u < Ul YPRO B A Bill BABILI DELIVERY OR PROBA CK PROBA IXI o at UJ u. Z Ul Ul Ul U z at 3 u U < >- IVER PROI PRO! in Z < o < o < at O ik < U 3 at *° uj at ^ O O 5 3 2 O £ 2 2" at t— z Ul S 5 O jo u O !£ «* «r £s2 * * < u u S 5 at bbS OOh Ul ss -* -i X a. oo oq uj 00 00 Ul Z ui l/t UJ Ul < ■^ at at Z > < ^ at 25 w ^ > £ £ < f= < £ < ™ 3 oo < O J= h- Z O Z o z Ul Ul u E at o z < £ < 3 ik Ul U u < USER INFORM TRANSF o z Ul wo 5 M 0> *■» 0J E (5 75 Computer Network Performance 77 User-Oriented Carrier Sense Multiple Access Bus Simulator Mar j an Krajewski The MITRE Corporation Bedford, MA 01730 This paper describes a general purpose, user^oriented Carrier Sense Multiple Access (CSMA) bus simulation program written in GPSS-V. Designed to fill a need to quickly and easily predict bus performance, it provides the capability to simulate local area computer communica- tion networks governed by one of the CSMA protocols. Simplicity, flexibility, and run-time economy were the key criteria employed in the formulation. CSMA protocols are a class of stochastic packet switching tech- niques which allow multiple independent network subscribers to share a single communications channel with reasonably high efficiency. All are based upon the original ALOHA protocol. They possess their great- est potential in meeting the communication needs of bursty, terminal- dominated networks operating over a limited geographical area. In general, these networks consist of low rate devices requiring extremely short response times . The simulator was designed to be quickly adaptable to different network characteristics. It is user-oriented in that all simulation parameters are documented and initialized in a single section of the program. Subscriber populations can range in size from few to many thousands of terminals. Outputs from a simulation run consist of the system characteristics input by the user and statistical channel performance information gathered automatically by GPSS. This includes the average throughput, subscriber traffic, and deferred and retrans- mitted packet statistics. Statistics concerning individual packet delays and overall end-to-end response times are formatted and output in the form of delay distribution tables. The program's primary application domain is the prediction and evaluation of those bus performance characteristics relevant to ini- tial local area network design decisions. These include packet size, bus capacity, and the general suitability of CSMA protocols. Key words: Bus Networks; Bus Performance; Computer Programs; Conten- tion; CSMA; GPSS; Local Area Networks; Packet Switching; Simulation. 1. Introduction eration characteristics, response time requirements, and physical separation, nu- The tremendous proliferation of digital merous techniques exist which can provide devices has spurred a great deal of diversi- the required connectivity. Most of these ty in the technology of their interconnec- techniques utilize some form of packet tion. Depending upon the device data gen- switching such that message packets, 79 consisting of data appended to a header, form the basic units of information transfer. For those networks limited in geographical extent and consisting chiefly of numerous, relatively inexpensive devices, the most cost-effective packet switching approaches are those involving a common shared channel. Networks characterized as groups of interconnected devices jointly sharing a common communications channel, or bus, are termed broadcast networks. Devices belonging to such networks interact first by gaining unique access to the bus through some multi- ple access scheme and then by broadcasting packets over it. Individual devices receive all packets but only accept those containing the proper addresses in the header. Due to the broadcast nature of the bus technique, time is the resource which must be shared. The potential for efficient time sharing of a communications bus is a major purported advantage of numerous multiple access tech- niques which have been developed in recent years . Due to mismatches between subscriber traffic characteristics and idiosyncrasies of the individual schemes, however, the overall system performance can be substan- tially less than efficient. It is impera- tive, therefore, that the prediction and evaluation of bus performance be a necessary step in the design of all such systems. Analytical methods are extremely useful in this regard where such models already exist and where a general estimate of bus performance is sufficient. For detailed estimates of specific network implementa- tions, simulation techniques provide sig- nificantly greater flexibility and thus, more accuracy. The simulation program described in this paper was designed to fill a need to quickly and easily predict characteristics of bus performance relevant to initial broadcast network design decisions. Sim- plicity, flexibility, and run-time economy were the key criteria employed in its formu- lation. Its range encompasses a class of stochastic channel access protocols known as Carrier Sense Multiple Access (CSMA) . 2. Bus Access Protocols A basic broadcast network architecture is shown in Figure 1. Individual subscrib- ers are connected to a Bus Interface Unit Figures in brackets indicate the literature references at the end of this paper. (BIU) through some device-dependent parallel or serial interface. The BIU buffers the data, packetizes it, and then transmits it onto an inbound data stream via a multiple access protocol. Every BIU listens to the outbound data stream continuously in order to detect packets intended for them. SYSTEM RATE Figure 1. Broadcast Networks Of the many multiple access schemes which now exist, [1,2]' each has advantages and disadvantages with different subscriber populations. Some techniques excel with steady, high speed users while others per- form best when servicing bursty, low speed users. In general, broadcast networks supporting steady users perform best when channel time is assigned to individual users. Bursty users, on the other hand, receive better service when channel time is unassigned and individual users "contend" for it. The basic advantage of contention- based access protocols is the ability to provide simple, reliable and flexible ser- vice to a predominately bursty subscriber population. Such networks are often dominated by low data rate terminal devices requiring extremely short response times at irregular intervals . The primary disadvan- tage of these protocols is their potential for unstable operation at certain through- put levels . The first application of contention protocols to computer communication networks was the ALOHA system at the University of Hawaii [3] . With ALOHA (Figure 2) , sub- scribers simply transmit whenever a message packet is ready. Overlapping transmissions from two or more subscribers always result in a collision and mutual destruction of the colliding packets. These are then retransmitted after random delays to pre- vent the same collision from occurring forever. Requiring all transmissions to conform to imaginary time boundaries (slots) improves upon this by forcing all colli- sions to overlap completely rather than partially. CSMA protocols [4] improve 80 performance still further by requiring all subscribers to sense the channel prior to transmitting and thereby avoid obvious collisions. Detection of a busy channel results in the immediate deferral of the transmission attempt. Due to the physical separation of individual subscribers, how- ever, a packet may be enroute without other subscribers realizing it. Another subscrib- er may sense the channel and, finding it empty, transmit its packet into a collision with the enroute packet. The "vulnerable period" when two subscribers may collide is equal to the propagation delay between their respective transmit and receive ports. As the delay increases, the performance of CSMA worsens and approaches that obtainable with the ALOHA protocol. As with ALOHA, sub- scribers involved in a collision detect that fact and then reschedule their packets with a random delay. The random retransmission delay is the fundamental controllable parameter in the design and operation of all ALOHA-based access protocols. Increasing the delay results in higher obtainable throughput at the penalty of increased sys- tem response time. WASTED TIME COLLISION MAXIMUM EFFICIENCY • 18% duration of the colliding packets plus prop- agation delays. Subscribers operating in an LWT mode do listen 'to their own transmis- sions and cease transmitting when a collision is detected. Collisions in this case tie up the channel only for the duration of the overlapping propagation delays. LWT opera- tion, therefore, is inherently more effi- cient than LBT. The CSMA protocols can be further dif- ferentiated into three sub-categories depending upon the specific deferral algo- rithm used. Subscribers sensing a busy channel may reschedule their transmission attempt a random time in the future regard- less of when the channel becomes idle. Otherwise, they may persist in sensing the channel and transmit with probability 1.0 when the channel becomes idle. Yet another option is to persist in sensing the channel until it is idle and then transmit with probability p by introducing a random delay. These algorithms are referred to as non- persistent CSMA, 1-persistent CSMA, and p-persistent CSMA, respectively. All six combinations of operating mode and deferral algorithm can be modeled with the simulation program. Acknowledgement transmissions over the common channel can also be simulated. WASTED TIME SLOTTED ALOHA LISTEN BEFORE TALK CSMA LISTEN WHILE TALK CSMA COLLISION MAXIMUM EFFICIENCY • 37% WASTED TIME COLLISION ^_j 1 >m/" MAXIMUM EFFICIENCY - SO 90% PROPAGATION DELAY (il WASTED TIME MAXIMUM EFFICIENCY >90- 99% "trd — P^= l"" *1 A B Figure 2. Contention Protocols CSMA protocols are divided into two general categories: Listen-Bef ore-Talk (LBT) and Listen-While-Talk (LWT). Subscribers operating in an LBT mode do not listen to their own transmissions. Collisions are detected only after the total packet has been transmitted and a positive acknowledge- ment has not been received within a pre- determined interval of time. This acknowl- edgement may be transmitted on either the common data channel or on a separate one. Collisions tie up the channel for the 3. Subscriber Populations The types of subscriber populations encountered in practice often can be broken down into two generic classes. One class consists solely of subscribers with similar data transmission characteristics. Examples of this might be a collection of distributed processors communicating among themselves or a large collection of terminals transmitting data to receive only devices for storage or display. A second class consists of numerous interactive subscriber pairs in which each complete message is followed by a reply. This situation can occur in a distributed time-sharing environment where users con- struct messages at terminals, send them to individual host computers, and await a response . It is also possible for groups of sub- scribers sharing the same channel to have dissimilar transmission characteristics. Prime examples are independent, interactive or non-interactive populations or an inter- active population where one or a very few host computers service a large number of terminals. In these cases, the populations must be modeled separately. 81 In all of the above cases, the traffic generated by individual subscribers is usu- ally bursty and therefore possesses non- deterministic properties. Similarly, the random aspects of the CSMA protocols also result in the need for a stochastic system model. Combining dissimilar populations in an analytical model is difficult, at best, whereby in a simulation model it is relative- ly straightforward. Such situations demon- strate the intrinsic value of a simulation capability. Two dissimilar populations may be modeled with this simulator. Their subscrib- ers may range into the thousands. Complete messages may consist of several packets with the last packet being shorter than the others if necessary. For convenience, the fixed time delays not associated with transfers over the bus can also be included in the simulation run. In addition, the capability exists to simulate a massive collision in which every subscriber attempts to transmit at time zero. This feature is extremely useful in measuring a system's response to its maximum traffic load and possible channel instabilities. 4. Simulation Model GPSS was selected as the programming language for the simulator primarily because of its power in defining systems governed by queues [5] . Developed at IBM in the 1960's, its underlying rationale was to provide a general structure for easily and rapidly converting discrete system models into com- puter programs. Since CSMA bus systems are, by their very nature, discrete queued sys- tems, GPSS offered many advantages. In essence, it allowed more productive time to be spent on model formulation and less on bookkeeping than was possible with non- simulation specific programming languages. The simulation program is composed of four main sections. An input section allows the user to initialize the various system parameters needed to perform the simulation. The system model section provides the GPSS entity description of the general model implementation. A control section controls the length of the simulation run and initi- ates the collection of those statistics which require a transient period prior to steady state operation. Finally, a report section formats the simulation output. The system model section is the most complex of the four. Referring to Figure 3, messages are created as transactions for each population of subscribers according to a joint exponential interarrival time distri- bution. The mean of this distribution is based upon the aggregate average interarrival time of messages to the system, considering all subscribers in that population. Once created, transactions are assigned appro- priate parameter values and sent on to a block which gives them a probability of entrance into the system. This probability is proportional to the fraction of subscrib- ers not in the process of sending a message or waiting for a reply. Those transactions which do not enter the system are destroyed. GENERATE MESSAGES CONTROL INFORMATION GATE MESSAGES TRANSFER TO Bill HI CSMA PROTOCOL RECORD PACKET STATISTICS FORMULATE REPLY TRANSFER TO FINAL DEVICE RECORD COMMUNICATION STATISTICS DESTROY TRANSACTIONS Figure 3. Simulation Model Once in the system, transactions, representing messages, are first transferred into the BIU in the specified time and then attempt the selected CSMA protocol. A successful transmission is recorded by com- puting the appropriate delay and throughput statistics. If the transmission represents the last packet in a message, a reply is generated if so desired. If it is not the last packet, the transaction again attempts the CSMA protocol. Replies are delayed for an appropriate formulation time before the CSMA protocol is attempted. Following reception of a complete message and its reply (if any), the appropriate BlU-device 82 message transfer time is added and the total communication statistics are recorded. The transaction representing that particular communication event is then destroyed and its memory space made available for new transactions. A complete communication event is termed a call and represents a basic unit of infor- mation interchange between network subscrib- ers. An example of this is an interactive terminal to computer communication sequence. In this case, a call is defined as those events occurring between the initiation of the transmission of the first character of the query and the reception of the final character of the response, by the terminal device. If a reply was not generated, the one-way message defines the call. If an acknowledgement is specified, the successful reception of the acknowledgement packet is required before the next data packet in a message or reply can be sent. Under heavy bus traffic loads, the loss of acknowledgement packets through collisions can, therefore, adversely affect the overall delays. Certain assumptions were made about the characteristics of the systems to be simu- lated. These are as follows: Al. There are no noise or propagation errors occurring on the channel. Retransmissions only result from collisions. A2. All packets involved in collisions must be retransmitted. A3. All transmissions are heard by every subscriber on the channel. Validation of the simulation model involved verifying its accuracy in repro- ducing predicted system behavior. In Figure 4, theoretical derivations of system performance [6] are compared with simulation runs of identical systems. For this partic- ular comparison there are 50 subscribers, the packet length is equal to 100 times the propagation delay, and the average retrans- mission delay (R) is equal to 20 times the propagation delay in one case and 100 times in the other. As is evident, both are in excellent agreement. SIMULATION RESULTS CHANNEL THROUGHPUT A4. The time to sense the presence of other transmissions on the channel and act on this information is negligible. Figure 4. Predicted Delay Comparisons Simulation Initialization A5. The new channel traffic generated by each population of subscribers is Poisson distributed. A6. New messages are not generated until previous ones are completed. A7. Acknowledgement packets are not themselves acknowledged upon reception. All are reasonable assumptions for modeling coaxial cable and fiber optic propagation media and terminal-dominated subscriber populations. The simulation program was designed to be readily adaptable to different bus sys- tems . It is user-oriented in that all simulation parameters are initialized in a single section of the program. The parameters are divided into five functional groups . Common system parameters define the general characteristics of the simulation operating environment and BIU operation. They include the CSMA operating mode and deferral algorithm desired, whether an initial massive collision is desired, the 83 channel data rate, the number of bits per packet, the mean propagation delay, the mean retransmission delay following a collision, the transient time, and the nominal steady- state running time of the simulation. Acknowledgement parameters define the characteristics of packets transmitted by a receiver back toward the sender to acknowl- edge a successful data packet reception. They include the number of bits per packet, the packet formulation time, and the waiting time which must pass before a transmitter assumes it must retransmit its original data packet. Reply parameters define the character- istics of reply messages generated by interactive populations. They consist of the reply formulation time, the number of individual packets making up the reply, and the number of bits in the last packet of the reply. Terminal parameters define the charac- teristics of the two individual subscriber populations. For each group, the parameters include the number of subscribers, the mean interarrival time between messages, the num- ber of packets making up each message, and the number of bits in the last packet of the message. 6. Simulation Outputs Outputs from the simulation program utilize information automatically collected for the particular GPSS entities used in the model. Statistics for delays and throughput are recorded only for the steady-state time period indicated during initialization. The average system throughput is a run- ning average of the overall channel through- put, including all successfully received data and acknowledgement packets. It is computed as the ratio of time spent in successful transmission to total simulated time, expressed as a percentage. Individual terminal population call traffic statistics are also collected. The statistics printed out include the total number of calls generated during the run, the maximum number of calls in progress at any one time, the final number of calls in progress at the end of the run, and the average number of calls in progress at any one time. Deferred or retransmitted packet sta- tistics for the simulation run include the maximum number of transactions in the appropriate queue, the number of transactions in the queue at the end of the run, and the average number of transactions in the queue at any one time. Delay information is output in the form of frequency tables. Packet delay and individual terminal population call response times are handled separately. Call response time is defined as the end-to-end response time encountered during a call. The pro- gram outputs the total number of events recorded, mean delays, standard deviation, upper limit of frequency classes, observed frequency, percentage of total entries in a particular frequency class, and cumulative percentage of total entries in a particular frequency class . 7. Applications The simulation program's primary appli- cation domain is the initial determination of packet size, bus capacity and suitability of CSMA protocols. Both analysis and simu- lation results indicate that CSMA techniques can become unstable if the bus traffic approaches some maximum value. When this point is reached, the channel becomes satu- rated with collisions, throughput rapidly decreases to near zero, and delays grow intolerably. It is imperative, therefore, that such a situation be avoided in all actual implementations . The user-oriented nature of the simu- lation program has been found to be extremely useful in determining if a proposed system design is operating near the saturation point. If such is the case, design changes could be made and evaluated quickly and inexpensively. Optimization of system response time is also facilitated by this approach. 8. Example An example demonstrating the simulator's utility is its application to the performance analysis of a local area network soon to be installed within the MITRE/Bedford complex. Termed MITRENET, the system is envisioned as a non-persistent LBT bus operating at a 1.152 Mbps rate over a coaxial cable plant. Multifunction terminals will use the net- work to access host computers providing word processing, scientific computation, and other services. Local printers will also be interfaced to these hosts via the network. The primary contribution to bus traffic loading is expected to result from the word processing function. Two potential terminal- host protocols were under consideration and needed to be evaluated. 84 The echo-plex protocol requires every terminal-initiated character to be trans- mitted as a small, separately addressed packet which is echoed back by the host as another separate packet. The block-transfer protocol requires characters in the terminal- initiated sequence to be echoed and buffered locally, and then transmitted together as fewer, larger packets. The former carried the promise of straightforward implementation while the latter promised significantly greater efficiency. An important question concerned the maximum population the bus could support under each. The projected network was modeled as in Figure 5. All word processing terminal-host communications were simulated by one inter- active (with replies) terminal population, and all host-printer communications by a second, non- interactive (without replies) terminal population. There was one printer for every four word processing terminals. Simulation results indicated that approxi- mately 280 word processing terminals and 70 printers was the maximum supportable popula- tion under the echo-plex protocol. Under the block-transfer protocol, over 400 word processing terminals and 100 printers could be supported before bus instability became a problem. recommended if the projected number of simul- taneously active word processing terminals exceeded 280. As stated earlier, the evalu- ation of any design changes would be a straightforward and relatively inexpensive task, requiring only that the appropriate input parameters be altered and the simula- tion rerun. 9. Conclusions The capability to evaluate bus perfor- mance via a flexible, user-oriented simula- tor is valuable insurance against potentially costly mistakes. By orienting program usage toward users not already familiar with the programming language, minimizing detail in the model's input requirements, and concen- trating on reducing run costs, simulators such as the one described here form powerful tools for broadcast network design. 10, Acknowledgements The author wishes to gratefully acknowl- edge the efforts of Mary A. Flood for her improvements to the simulator's output format, Norman B. Meisner for his consider- ably helpful reviews and comments, and Karen M. Fiorello for her patience and expertise in preparing this paper. NON-INTERACTIVE Figure 5. MITRENET Example The design choices presented by this analysis were twofold. Selection of the echo-plex protocol was recommended if the projected number of simultaneously active word processing terminals would not exceed 280. Selection of the block-transfer proto- col, an increase in bus capacity, or a switch to the LWT mode of operation was 11. References R. M. Metcalfe and D. R. Boggs , "Ethernet: Distributed Packet Switch- ing for Local Computer Networks," CACM, Vol. 19, No. 7, July 1976. Meisner, Norman B., et. al, "Time Division Digital Bus Techniques Imple- mented on Coaxial Cable," Computer Networking Symposium, December 1977. N. Abramson, "The Theory of Packet Broadcasting," TRB76-1, The ALOHA System, University of Hawaii, January 1976. L. Kleinrock and F. Tobagi, "Packet Switching in Radio Channels: Part I - Carrier Sense Multiple Access Modes and Their Throughput Delay Characteristics," IEEE Trans. Commun. , Vol. C0M-23, No. 12, December 1975. IBM, "General Purpose Simulation System V Users Manual," SH20-0851-1, Second Edition, August 1971, Rev. 1977. F. Tobagi and L. Kleinrock, "Packet Switching in Radio Channels: Part IV - Stability Considerations and Dynamic Control in Carrier Sense Multiple Access," IEEE Trans. Commun., Vol. C0M-25, No. 10, October 1977. 85 A Comparative Evaluation of Local Area Communication Technology R.L. Larsen, J.R. Agre, A.K. Agrawala Department of Computer Science University of Maryland College Park, MD 20742 The builder of a local area network is immediately confronted with the selection of a communications architecture to interconnect the ele- ments (hosts and terminals) of the network. This choice must often be made in the presence of great uncertainty regarding the available alter- natives and their capabilities, and a dearth of comparative information. This was the situation confronting NASA upon seriously considering local area networks as an architecture for mission support operations. As a result, a comparative study was performed in which alternative communi- cation architectures were evaluated under similar operating conditions and system configurations. Considered were: (1) the ring, (2) the cable-bus, (3) a circuit-switching system, and (4) a shared memory sys- tem. The principle performance criterion used was the mean time re- quired to move a message from one host processor to another host processor. Local operations within each host, such as interrupt service time, were considered to be part of this overall time. The performance of each alternative was evaluated through simulation models and is sum- marized in this paper. 1. Introduction As local computer networks become a viable alternative in operational computing environments, the selection of an appropriate intercomputer communications system design becomes one of the most basic design deci- sions made. The performance of the overall system can depend on making a proper choice, yet there exists relatively little informa- tion of a comparative nature by which to make such a design choice. This situation con- fronted a NASA study group at Goddard Space Flight Center, and led to a study performed by the University of Maryland Computer Sci- ence Department and summarized in this paper. The presentation is basically structured in four parts. Section 2 provides background information establishing the problem context. In section 3 the approach to the analysis is described, followed in section 4 by a brief description of each of the four alternative systems evaluated. The major performance results are summarized in section 5, and section 6 closes with a summary and some general conclusions. 2. The Local Network Environment The communications system of a computer network can be viewed as a subnetwork of the computing network. Its major responsibility is to enable the transfer of data messages (files, programs, status and control infor- mation) between computers on the network. To the communications subnet, the nodes of the network are viewed as functional "black boxes", which remove messages from the sub- network, and which provide messages to the subnetwork to be transmitted to other nodes. In a local network, the distance between the components of the system is assumed to be less than a few kilometers. Because of the small distance involved, high transmission rates with low bit error rates are achievable, 87 The subnet must supply an entry point interface to each of the network host nodes (computers). In reality this interface must handle address translation, transmission protocols, line speed conversion, error con- trol, routing decisions, flow control, and the buffering of messages. Address transla- tion is required to determine where a message received from the host node is to be sent or to determine if a message received from the communications subnet is destined for its attached host node. Transmission protocols determine how a message can be sent over a particular line, which line the message is to be sent on, and the message format. Line speed conversions are necessary when the in- coming and outgoing data rates are different as in the case of peripheral devices or host nodes. Error control includes methods that insure the message will be received correct- ly. Routing decisions determine which path a message should follow. Since buffering techniques are used in all of the technolo- gies studied, buffer management is an impor- tant function of the interface. In the evaluation reported here, address transla- tion, transmission protocols, and the buffering of messages were considered. The interactions between the interface (communi- cations processor) and the host processor were also modeled. It is felt that these interactions are likely to have a major im- pact on network performance. The models attempt to mimic the host to interface interactions as accurately as possible. Communication in the models is envi- sioned as being packet based. Data messages are created at the host computer. When a data message has been created at a source host, the source host divides the message into smaller messages of a pre-determined maximum size, called a packet. Using this method, each packet is independently routed through the communications network. Although packet switching involves increased cost (additional control information is required in each packet) and complexity (packets must be reformed into messages at the destination node) this method does provide rapid re- sponse time for short messages in the net- work as well as high throughput for longer messages[5] . 1 Packet switching also provides enhanced error control since each packet broadcast is usually shorter than an entire message would be. Another benefit of packet switching is that no host can monopolize the communications network for long periods of time. In this study, the communications medium was allocated on a packet basis. Figures in brackets indicate the li- terature references at the end of this paper. 3. Approach The major objective was to derive comparative performance measures by which a system designer could intelligently select an appropriate communications technology for a local computer network. To achieve this objective, the overall modeling framework had to be fixed, and a common set of perfor- mance measures determined. The performance measures which will be reported in the pa- per for each of the technologies include: mean message time in the system, communica- tions system throughput, and bandwidth utilization; as a function of message arrival rate and system configuration. An overall system modeling framework was defined which included configurations of 8, 16, and 32 host processors. A communi- cations traffic model was then developed to simulate the generation of messages by host processors for communication to other host processors. Figure 1 illustrates the model- ing framework devised for eight hosts. (^Host Figure 1. Modeling Framework From the perspective of each host processor, performance statistics were kept measuring the perceived performance of the communications subnetwork as a function of the arrival rate of messages to be trans- mitted. Given the structure of Figure 1, alter- native models of communications subnetworks could then be developed and evaluated within a uniform framework and comparative perfor- mance measures derived. The communications traffic model developed includes a Poisson message source within each host processor which generates messages with length uniformly distributed 88 between 300* and 5000 16-bit words. These messages are then broken up into fixed-length packets of 256 16-bit words (4096 bits). Message destinations are selected randomly and uniformly among the other host proces- sors. The rate of the Poisson sources is increased from low loading levels to communi- cations subnetwork saturation. Within this modeling framework, candi- date communication subnetwork designs were selected and modeled. Due to the complexity of the systems studied, discrete event simu- lation was determined to be the most ap- propriate modeling technique, and SIMSCRIPT II. 5 2 was selected as the simulation lan- guage. 4. Candidate Communication Systems Four communication alternatives were selected for detailed modeling: (l) the ring, (2) the cable-bus, (3) circuit switching, and (4) shared memory. For each alternative, if an operational version was known to exist, that design was typically chosen and, in some cases, modified slightly to take advan- tage of known design improvements. In this section a brief description of each alterna- tive is presented, including an overall high level description, a description of the hard- ware organization, and a discussion of the message transmission protocol employed. 4.1 Token-Controlled Ring 4.1.1 High Level Description A ring network is a communication sub- net in which host computers are intercon- nected into a single closed loop (ring) topology by means of a unidirectional trans- mission medium such as a twisted wire pair and appropriate interfaces to it [3]. A message, in our case a packet, circulates around the ring from source host to destina- tion host, with each ring interface forward- ing the packet to the next ring interface and then back to the source host. The scheme is displayed in Figure 2. The token-controlled ring has a unique 10-bit pattern, called a token, that con- tinuously circulates the ring. A Ring Inter- face desiring to transmit a packet must wait for the token to reach it. It then changes the token to what is called a connector (essentially a different 10-bit pattern), transmits its packet, and on receipt of the ACK (when the packet is returned to the "SIMSCRIPT 1 1. 5 is a product of CACI. transmitting host) restores the token pat- tern. A Ring Interface scanning the infor- mation flowing through it, will see a series of connector-packet pairs followed by a token that signals available space for transmission. 4.1.2 Ring Interface Description All computers are connected to the ring by devices called a Ring Interface. This interface is responsible for control and actual communication between host computers over the ring. This interface is composed of a transmitter, a receiver, a buffer for incoming packets, a buffer for outgoing packets, switches to control the routing of information through the interface and con- trol logic. The control logic is respon- sible for such functions as address recog- nition, error detection, and setting ACK flags. It also handles the DMA transfer of packets between the ring interface and the host computer. 4.1.3 Message Transmission Protocol Event sequence for the transmission of a message from CPU A to CPU B: (1) CPU A divides the message into fixed size packets and transmits them one at a time via DMA to the outgoing buffer of Ring Interface A. (2) Ring Interface A waits to receive the token and then changes the token to a connector, transmits the packet and follows the packet by a new token. (3) The packet moves around the ring to Ring Interface B. (4) If Ring Interface B's incoming buffer is empty, Ring Interface B copies the packet into its buffer and computes a checksum. If no errors are detected, the ACK flag is set as the packet is forwarded to the next Ring Interface. CPU B is then interrupted to inform it of a received packet, and a DMA trans- fer is initiated. (5) The packet continues around the ring until it reaches Ring Interface A, which removes it (and the leading con- nector) from the ring. If the ACK flag is set, Ring Interface A assumes suc- cessful transmission and informs CPU A. Otherwise go to step 2 to perform a retransmission of the packet. (6) If there are no more packets in the 89 message, CPU A copies the next packet into the outgoing buffer of its ring interface and goes to step 2. 4.2 Cable-Bus 4.2.1 High Level Description 'The cable-bus system modeled is a variant of the Ethernet random access scheme which was developed by Metcalfe and Boggs[7j . This technology makes use of a coaxial cable (Ether) as the communications medium along with a communication protocol that was spe- cially developed to work for cable based schemes. Unlike the Aloha random access scheme, Ethernet is designed to carry two- way traffic between several host computers. This modification enables the transmission of messages for a computer- to- computer, as well as a terminal -to-computer, computer-to- terminal organization. 4.2.2 Hardware Organization The cable-bus is designed to accomodate up to 256 hosts. Each host is interfaced to the cable by means of a communications processor (CP), Figure 3. The communications processor consists of a line tap, a transceiver, two buffers: one for incoming data, one for outgoing data, and control hardware capable of interrupting its host. The tap enables physical connec- tion to the Ether. The transceiver is re- sponsible for receiving and transmitting bits over the cable. The control hardware supervises the transfer of packets between the host and the buffers via DMA, carrier detection, truncated packet filtering, and collision consensus enforcement. Carrier detection refers to detecting if the cable is in use. Communications processors will never transmit when the ca- ble is sensed busy. During transmission of a packet, the communications processor moni- tors the cable, this is termed interference detection. Interference (a collision) occurs when the signal the communications processor is transmitting and the signal it receives are not the same. In this case the communications processor stops transmitting. Collision consensus enforcement insures that proper action is taken once a collision has occured. If a collision occurs, the commun- ications processor detecting this condition places a noise pulse on the line to make sure all other communications processors are aware of the collision. Next, each communi- cations processor which was involved in the collision begins a collision avoidance pro- cedure. This procedure is also known as "blocked mode". First, a pseudo random num- ber, uniformly distributed, is calculated in hardware. This number is then multiplied by the number of collisions which have oc- cured in trying to send the current packet. The communications processor then waits the period of time indicated by the resulting number before attempting to transmit again. The primary difference between the modeled cable-bus and Ethernet is that the modeled design incorporates what is termed an "in-buffer collision" scheme [2]. This refers to the case when a communications processor begins to receive a packet while its incoming buffer is full. The communica- tions processor causes an artificial col- lision which terminates the transmission. 4.2.3 Message Transmission Protocol Event sequence for transmission of a message from CPU A to CPU B: (1) CPU A divides the message into a se- quence of transactions called packets. (2) CPU A transmits the next packet of the sequence via DMA to the outgoing buf- fer of its communications processor (CP A). (3) CP A waits for the cable to be inactive and then transmits its packet, which results in one of the following: a. No collisions detected, packet suc- cessfully transmitted. b. A collision is detected (go to step 4). (4) Each CP involved in the collision goes into "blocked mode", calculates the wait period, waits that period, and then go to step 3. (5) All CPs begin wait period for the cable to once again be inactive. (6) CP B copies the packet for the cable into its incoming buffer, and performs a checksum on the packet. If the checksum is valid, go to step 7. Otherwise, do not send an ACK and wait for the packet to be retransmitted. (7) CP B transmits an ACK to CP A. CP B interrupts CPU B and a DMA transfer of the packet to the host is initiated. (8) CP A receives the ACK. CP A interrupts 90 CPU A to inform it that the packet was suc- cessfully transmitted (go to step 2). 4.3 Circuit Switching 4.3.1 High Level Description The Data Distributed Network (DDN) as developed by Data General Corporation (DGC) was designed to take advantage of circuit switching techniques to provide communica- tions links that operate at a rate of a few megabits per second [43. 4.3.2 Hardware Organization As developed by DGC, DDN is designed to accomodate up to 255 host computers. Each host is interfaced to the central switching facility of the DDN by means of an Adapter. DDN thus consists of a central switching facility called the Switch Matrix and as many Adaptors as there are hosts. The com- ponents of a DDN are shown in Figure 4. Each adaptor is controlled by a micro- processor. Under host computer direction, the Adaptor sends data to and receives data from the Adaptors connected to other compu- ters in the network. Data transfer between an Adaptor and its host computer takes place via DMA. Each Adaptor has two 256-word buf- fers, one for outgoing data and one for incoming data. The Switch Matrix consists of three units: (1) Diagnostic Unit - this unit is used to determine faults on the network and is capable of taking corrective actions to keep the network in operation. It is capable of communicating with any of the Adaptors in the network. (2) Data Routing Matrix - this is the actual circuitry which establishes the physical connection between any two Adaptors, or between the Diagnostic Unit and any Adaptor. This unit is de- signed to contain up to 36 full duplex links to permit up to 72 computers to communicate simultaneously. (3) Link Controller - monitors and controls the Data Routing Matrix. It cyclically polls all Adaptors, searching for one which desires to establish communica- tion. When the Link Controller finds such an Adaptor, it cyclically scans the communication links until it finds a free link. Next, the Link Controller determines whether or not the specified destination is currently "busy". If the destination is busy, the Link Control- ler can not process the request, the current Adaptor is skipped, and the Link Controller continues to search for another Adaptor that desires to esta- blish a communication. Otherwise, the communications link is established and the source begins transmitting its mes- sage on a packet basis. When trans- mission of the entire message is com- pleted the link is dropped. 4.3.3 Message Transmission Protocol Event sequence for transmission of a message from CPU A to CPU B: (1) CPU A divides the message into packets and places the packets at the end of its packet queue. When the first packet of the message reaches the head of the queue, it is transmitted via DMA to the outgoing buffer of Adaptor A. Adaptor A now begins to request com- munications capabilities. (2) The Link Controller reaches Adaptor A. If a link is available, the Link Con- troller then determines if Adaptor B is free. If Adaptor B is free the con- nection is made, otherwise the Link Controller resumes polling (Adaptor A is in a "suspended" state). (3) Adaptor A prepares for transmission. (4) Adaptor A transmits its packet. (5) If the incoming buffer of Adaptor B is empty, the packet is copied into the buffer, else it is lost. The micro- processor of Adaptor B is interrupted upon completion of the transmission. If the packet was successfully receiv- ed, the interrupt-handling routine of the microprocessor begins transmission of a 3-word packet ACK to Adaptor A, CPU B is also interrupted to inform it of a received packet. If the packet was lost, the microprocessor sends a NACK to Adaptor A. (6) On receipt of the packet ACK: a. The microprocessor of Adaptor A is interrupted. The microprocessor then interrupts CPU A to inform it of an ACK. If this was the last packet of the message, Adaptor A informs the Switch Matrix to drop the link. b. When CPU A is interrupted and pack- ets remain to transmit, a DMA 91 transfer of the next packet of the message to Adaptor A's outgoing buffer is initiated. c. When the entire packet is received by Adaptor A, this packet can now be transmitted to CPU B (go to step 4). (7) When CPU B is interrupted: a. CPU B initiates a DMA transfer of the packet from Adaptor B's in- coming buffer into its own memory. b. Once the entire packet is copied by CPU B, Adaptor B's incoming buffer is reset so that it can now receive the next packet. (8) If a NACK is received by Adaptor A, Adaptor A's microprocessor is inter- rupted upon receipt of the NACK which causes a retransmission of the lost packet (go to step 4). 4.4 Shared Memory 4.4.1 High Level Description It has been found [l] that the use of buffer memories to set up communication links between two computers is appropriate when the distances to communicate are not very large. This method is used by the Launch Processing System at the Kennedy Space Center of NASA. The Common Data Buffer (CDBFR) is a specially designed piece of hardware design- ed for intercomputer communication. This hardware permits a shared memory medium for transfer of messages between several CPUs. 4.4.2 Hardware Orgainzation CDBFR consists of a memory module, memory controller, CPU interfaces called Buffer Access Cards (BAC), and interrupt control logic. Orginally the design per- mitted up to 32 CPUs to communicate via a single CDBFR. Subsequently the design changed to permit up to 64 CPUs to communi- cate. Figure 5 illustrates the CDBFR configuration. CDBFR' s memory module consists of from 32K to 64K 16 -bit words of storage with a cycle time of about 250ns. The first IK of memory (page 0) is divided into 32 stacks each of 32 words, with one stack being as- sociated with each CPU attached to the CDBFR. The next IK of memory (page 1) is reserved for "block data transfer". When a CPU obtains access of Page 1, in block data mode, no other CPU can access Page 1. The remainder of the CDBFR memory is allocated into private write areas for each CPU. While a CPU can read any word of memory, it can only write into its onw private write area, and Pages 0,1. Each CPU has the re- sponsibility for allocating and managing its own private write area. Each CPU communicates to the CDBFR through a dedicated interface called a Buf- fer Access Card (BAC). The CPU sends re- quests for service, address, data and mode information to the BAC, which then provides this information to the CDBFR at the appro- priate time. From the CDBFR, the BAC receives data and error condition informa- tion resulting from read and write opera- tions into the CDBFR. The BAC then provides this information to the CPU. The BAC is also capable of transmitting interrupts to and from the CPU. The Memory Controller is responsible for granting access to the memory. Only one CPU (through its BAC), is permitted access to the memory module at a time. The controller must scan each BAC, starting with the lowest addressed BAC to find the next BAC desiring service. Once a complete scan of the BACs is made the controller starts again with the BAC with the lowest address. The CDBFR is capable of interrupting the CPUs for message synchronization. This interrupt stack control logic is activated when a CPU writes a word into the stacks of Page 0. When a word is written onto a stack the CPU which owns the stack is interrupted. Interrupts are postponed whenever the host computer's BAC is active with a data trans- mission. These interrupts are stacked in the CDBFR until the message transmission is complete. 4.4.3 Message Transmission Protocol Each CPU is permitted only one outstand- ing message to any other destination at a time. Messages are divided into packets of a fixed size. The private write area is divided into a fixed number of blocks, each the maximum packet size. Event sequence for transmission of a message from CPU A to CPU B: (1) CPU A divides the message into a se- quence of transactions called packets. (2) CPU A allocates a block of its private write area to the next packet of the sequence. CPU A then transfers the packet into this block. 92 (3) CPU A then places a word containing the address of the packet (in the CDBFR memory) on the interrupt stack of CPU B. (4) Next CPU B is interrupted by the CDBFR. CPU B then reads the packet address word off its stack. (5) CPU B copies this packet out of the CDBFR into its own storage area. (6) CPU B then places a word containing the original address of the packet onto the interrupt stack of CPU A. This serves as the ACK of the packet by CPU B. (7) The CDBFR interrupts CPU A and CPU A reads the word off its stack. CPU A than verifies that the address is the original address of the packet and so assumes that the packet was success- fully transmitted. (8) CPU A then releases the block contain- ing the packet and the packet transfer is complete. (9) CPU A then returns to step 2 to process the next packet of the message sequence. 5. Performance Comparison The major performance measures to be compared among the alternative systems are: a. mean message time in the system b. throughput capability c. bandwidth utilization efficiency To fully understand such a comparison however, one must consider the major differ- ences in design variables among the systems and, while being unable to factor their effects out, at least appreciate their influ- ence. Table 1 summarizes the two most criti- cal design variables for each technology: the transmission rate and the number of physical paths. Two key observations can be made. The transmission rate for the Common Data Buffer is an order of magnitude greater than the others, and DDN utilized multiple physical paths. Since DDN is designed to provide communication among many (up to 255) computers using up to 24 distinct data paths, the number of physical paths available were scaled down for modeling purposes to be con- sistent with the fewer (8, 16, and 32) num- ber of host processors, with the expectation that this provides one with more realistic insight into the operational performance of DDN. Furthermore, the original design for DDN utilized single-buffering in the adap- tors. This design was subsequently modified to include double-buffering. Results for each of these cases are reported subsequent- ly and demonstrate striking improvements for the double-buffered case. Figures 6, 7, and 8, and Tables 2, 3, and 4 illustrate the primary performance results. Table 2 is a display of message time in a system measured in seconds, versus the intensity of arriving messages, measured in megabits per second. To be noted first is the observation that the ring and the cable-bus have virtually indistinguishable performance characteristics, which are also rather insensitive to the number of host processors attached. What could have been expected to be 6 separate plots on Figure 6, then, become essentially only one. A slight difference is discernible in Figure 8, but its influence is barely observable in most performance measures. Six curves are shown for DDN. The first set (solid lines) display message time in a system for the single-buffered case, and the second set (dashed lines) display results for the double-buffered case. A significant performance improvement is ap- parent in the double-buffered case. It is apparent from these figures that double- buffering more than doubles the capacity of the DDN system. For each of the DDN con- figurations modeled, a processor to link ratio of 4:1 was maintained. The overall bandwidth available, therefore, increased by a factor of two going from eight to six- teen processors, and by another factor of two going to thirty- two processors. This study indicated that this bandwidth is allocated equally efficiently regardless of the number of links. Some saturation effects might be expected as the number of links grows beyond eight, but this was not explored. Unlike DDN, which can be configured with a variable number of relatively low speed links, CDBFR provides one very fast shared path. When configured with only a few host processors, then, the hosts are unable to provide enough traffic to keep the CDBFR busy. This is shown in Figure 4, where the efficiency (the percent of band- width provided which can actually be used) is shown to increase with the number of host processors. The data on which Figures 6, 7, and 8 are based are provided in Tables 2,3, and 4 respectively. Data was taken for light, medium, and heavy loading. The notation 93 "oo" in the tables denotes a saturation con- dition where an infinite queue of messages to be transmitted was assumed to exist at each host processor. 6. Summary and Conclusions ■ Four intercomputer communications systems have been described and their per- formance compared. While these systems utilize different architectures and operate at different rates and capacities, compar- able performance measures were stated and utilized to compare the systems. Existing technology provides bandwidth capabilities in the range of one to fifty megabits per second. Proven architectures allocate this bandwidth in different ways to host computers, with varying efficiency. The principal lesson to be learned from this study is to know the communication require- ments. Relatively modest ring or cable-bus systems can very efficiently provide rela- tively modest capabilities. If growth flexibility is a strong requirement, then, perhaps, a DDN-style system is the preferred approach, although this flexibility comes with a substantial price tag. For a rela- tively large number of hosts with a heavy communications load, short response time requirements, and a potential need for immediate access by each host to a set of dynamic system-wide parameters, a CDBFR-type system may be an attractive alternative. At the time this study was performed, a brief cost survey of the builders of these systems failed to reveal significant econo- mies of scale in going to larger, faster communications systems for local networks. It seemed to be the case that one could ex- pect to pay for ewery bit of the capability required. It is our hope that this performance analysis can be used as a basis for compari- son of most current and future communi- cation technologies appropriate for local computer networks. The communications traffic considered in the analysis of these systems was of the most general form. The designer of a local computer network must understand the specific communications traffic expected in a system being designed, however, and understand the peculiarities of it. It was our experience in doing this for a NASA case study that some traffic- dependent performance deterioration can be encountered. This case study and its im- plications are the subject of a companion paper to this one [63. Many people made significant contributions to this study.. In particular we would like to gratefully acknowledge the contributions of Karen Gordon, Rose McGinnis, Ray Bryant, and Diane Acaron. The computing support for most of this work was provided by the Com- puter Science Center of the University of Maryland. The research was supported in part by NASA Goddard Space Flight Center under grant No. NAS5-22878 and by the Air Force Office of Scientific Research under grant No. AF0SR78-3654A. References [1] Agrawala, A.K., Bryant, R.M., and Agre, J.R., "A Study of Shared Memory as a Communication Medium", Proc. Sixteenth Annual Technical Symposium: Systems and Software , National Bureau of Standards, Gaithersburg, Maryland, June 2, 1977, pp. 51-60. [2] Agrawala, A.K., Bryant, R.M., and Agre, J.R., "Analysis of an Ethernet-Like Protocol " , Proc. of Computer Network- ing Symposium , National Bureau of Stan- dards, Gaithersburg, Maryland, December 15, 1977, pp. 104-111. [3] Agrawala, A.K., Agre, J.R., and Gordon, K.D., "The Slotted Ring vs. the Token- Controlled Ring: A Comparative Evalu- ation", IEEE COMPSAC 78 , Chicago, Illinois, November, 1978, pp. 674-679. [4] Agrawala, A.K., Gordon, K.D., and Agre, J.R., "Performance Analysis of a Circuit- Switched Sommuni cation Scheme for Local Computer Networks", Techni- cal Report TR-645, Department of Computer Science, University of Mary- land, April , 1978. [5] Agrawala, A.K., and Larsen, R.L., "Distributed Computing - A Review", Technical Report TR-484, Department of Computer Science, University of Maryland, September, 1976. [6] Agre, J.R., and McGinnis, R. , "Com- parative Performance Evaluation of Several Local Distributed Computer Networks: A Case Study", Technical Report, Department of Computer Science, University of Maryland, November, 1979. [71 Metcalf, R.M. , and Boggs, D.R., "Ethernet: Distributed Packet Switch- ing for Local Computer Networks", Com- munications of the ACM , Vol 19, No. 7, July, 1976. 94 Table 1. Transmission Rate and No. of Paths Technology Transmission Rate on Each Path (Megabits/sec) Number of Physical Paths Ring 4 1 Cable-Bus 4 1 Common Data Buffer (CDBFR) 32 1 Circuit Switching (DDN) 3 2,4,8 | Table 2. Mean Message Time in System (in ms) .Number of Nodes ! System Arrival Rate 8 16 32 [(in Megabit/sec) L- .?.. 24. f 1.2 - 26. 26. Cable L__ ..1,8 35. - Bus ! 2.4 59. 57. 57. | T.T 104. - 3.0 - 58. 54. Ring .6 25. 24. 23. 30. 27. 26. 1.2 I 2 ' 4 f 87. 59. 49. 4.5 37. - 6.0 - 35. 36. Shared 9.0 r 75. Memory 12.0 12.6 - 49. 41. 362. - "16.8 - 83. 48. .6 17. - Circuit 1.2 - 17. Switched 1.8 ; 20. - (Double Buffered) 2.4 - - 17. 3Y6 j 37r 20: 7.2 - 31. 20. "14.4 - - 29. Circuit .6 44. - 1.2 61. 43. Switched 1.8 140. - ^Single 2.4 - 57. 46. Buffered) 3.6 - 107. 4.8 62. a. II ■ i i in m in ii " - 77-2- - 108. Table 3. Throughput (in 10 16-bit words) Numbe r of Nodes System Arrival Rate 8 16 32 (in Megabit/sec) .9 6.3 - 1.2 - 8.2 8.2 Cable 1.8 12.3 - Bus 2.4 16.4 16.4 16.4 2.7 18.5 18.5 18.5 i 3.0 - 20.3 20.3 1 _, oo 20.3' 20.3 20.0 ! .6 4.1 4.1 4.1 Ring 1.2 8.2 8.2 8.2 2.4 16.4 16.4 16.4 oo 18.6 20.4 21.3 4.5 36. - 6.0 - 42. 42. , Shared 9.0 62. - Memory 12.0 - 84. 84. 12.6 82. - 16.8 - 116. 116. CO 94. 160. 192. .6 4.1 - Circuit 1.2 - 8.2 - Switched 1.8 12.3 - (Double 2.4 - 16.4 Buffered) 3.6 24.6 24.6 - 7.2 - 47.3 49.2 14.4 - 98.4 CO 32.4 64.8 129.6 .6 4.1 _ Ci rcui t 1.2 8.2 8.2 - Switched 1.8 12.3 - (Single 2.4 _ 16.4 16.4 Buffered) 3.6 - 24.6 - 4.8 - 33.8 7.2 - 49.1 CO 13.9 28.1 54.6 95 Table 4. % Throughput Number of Nodes Cable Bus Ring Shared Memory Circuit Switched (Double Buffered) System Arrival Rate (in Megabit/sec) .9 1.2 1.8 2.4 2.7 3.0 CO .6 1.2 2.4 oo *TT" 6.0 9.0 12.0 12.6 16.8 Oo 1.2 1.8 2.4 3.6 7.2 14.4 oo ! Circuit Switched! (Single Buffered) .6 1.2 1.8 2.4 3.6 4.8 7.2 oo 16 32 25. 33. 33. 49. - 66. 66. 74. 74. 66. 74. 81. 01. 81. 01. 80. 16. 16. 16. 33. 33. 33. 66. 66. 66. 74. 82. 85. i 16. 31. 41. 47_. 11. 33. 21. 42. 58. 80. 11. 21. 42. 58. 97. 65. 33. - 66. 86. 87. 11. 33. 66. 86. 11. - 22. 11. 33. - - 22. - 33. 37. 38. 11. 22. 33. 36. Figure 3. Cable-Bus IILH ©— 0- IOOO FEET © 0— OAT* HOUTWO MATRIX 1 LINK CONTROLLER ~ 1 DIAGNOSTIC UNIT SWITCH MATRIX Figure 4. Circuit Switching C=> MEMORY MODULE 64 K WORDS 16 BITS/WORD 250 na.c Figure 5. Shared Memory Figure 2. Ring 96 MESSAGE TIME M SVS1EM „, ( SECOMH ) oz Figure 6. Comparative Performance 0OMJ2 IOOUI E SUFFERED) DOM 16 tUOUOt £ BUFFERED) OHM -12 ISIHSLE BUFFERED! -DON-* (DOUBLE BUFFERED) "ODN-M ISMKLC BUFFEREOI -RBW 5 CABLEBUS IX* SISWOLE BUFFERED) ASQHEBAIC COMUUMCATWN5 TRAFFIC ARRNAL RATE BKBABIIS FEB SECOND) Figure 7. Comparative Thoroughput umlZATON OF AVAL ABLE CGMMUMCATUMS BOX BANDWIDTH DON (DOUBLE BUFFEREOI ■ DON (SINGLE BUFFERED) NUMBER OF HOST COMPUTERS Figure 8. Comparative Efficiency 97 Performance Prediction Techniques - 1 j . :j5sss::R55;5 , ■■:•:■> ™: /.:;-: : : ■■*;: .,•;•■:■■:. 99 Some Properties off a Simple Deterministic Queuing Model Rollins Turner Digital Equipment Corporation Systems Performance Analysis Group Maynard, Massachusetts Abstract A simple queueing network with fixed service times is defined. This network might be an appropriate model for two asynchronous devices sharing the use of a single resource, where all service times are fixed. An algorithm permitting efficient computation of system behavior is described. Certain general properties of the system are determined. Key words: Fixed service time; queueing network. 1 . Introduction There are many components in computer systems that perform some action in a fixed amount of time. Examples include memory subsystems, synchronous busses, and communications links. When such components are included in a queueing network model of a higher level system they are most accurately represented as servers with fixed service times. Unfortunately, most of the results of queueing theory do not apply to fixed service times. Useful results can often be obtained by substituting a probability distribution for the fixed time but important characteristics will sometimes be lost. Operational analysis can be used in some cases , but again some important questions cannot be answered [1,2]. In this paper we examine one particularly simple queueing network with fixed time servers, and develop from first principles some of its properties. We mention in passing that work reported here was originally motivated by the need to explain some counterintuitive results from measurements on actual computer system components. The model developed in this paper does explain those results, which will be discussed later in the paper. The model is of some interest as a mathematical object in its own right. It has a number of rather surprising properties, including discontinuities in its response time functions. This model can provide some interesting tests for conjectures about queueing networks with no restrictions on service times, both in stochastic analysis and operational analysis. It should be considered as a potential source of counterexamples when a general theorem is under study. The remainder of this paper is divided into three sections. Section 2 defines the model and lists some of the questions to which it can be applied. Section 3 discusses the behavior of the system and how it depends on the service times. An algorithm is given by which one can efficiently compute the time from the completion of a wait until the next wait. Using this algorithm, one can compute essentially all quantities of interest about any specific system. In Section 4 we look at certain general properties of this model, treating the service times of the servers as variables. A graphical representation of systems in a two dimensional parameter 101 space provides some useful insights into the systems' behavior. 2. Definition of the Model The model that we shall examine is a simple closed queueing network with fixed service times. The network consists of a single central server, designated Server C, and two peripheral servers, called Server A and Server B. There are two customers in the system. Customer A visits Server A, then proceeds to Server C, where he may have to wait for service. After receiving service from Server C, Customer A returns to Server A. Customer B follows a similar pattern between Server B and Server C. Queueing can occur only at Server C, and the wait time can never exceed one service time for Server C. A diagram of the system is shown below in Figure 1 . Server A 111 Server B y+ Server C Figure 1 . Network Simple Deterministic Queueing This system might serve as a model for two 10 devices sharing a bus, or perhaps for a processor and a single 10 device sharing access to a memory. Because the service times are fixed, most of the results of queueing theory do not apply. We also note that the "routing homogeneity" requirement of operational analysis is not satisfied. The system is characterized by three parameters, the service times of the three servers. These times are called a, b, and c. The state of the system is specified by where each customer is and how long he has been there. Given the state of a system, its future behavior is completely determined. The time of all future events and the state of the system at any time in the future can be computed by means of a step by step calculation of system events. Given that the system is deterministic, one might ask whether there is anything interesting to be learned from a mathematical analysis. The answer is yes. This simple system turns out to have some surprisingly complicated properties, and there are a number of questions, possibly of practical importance, whose answers are not immediately obvious. Some of the questions discussed in this paper are the following: 1. Given a starting state, which customer will first have to wait and when? 2. What will be the service rate for each customer and utilization of each server? 3. How does a change in the service rate of one of the peripheral servers affect the overall system behavior? 3. Behavior of a Specified System We know that the future behavior of any given system can be determined by direct computation. Furthermore, after an initial transient, the system state is always periodic. To show this, we need to define two particularly useful system states. Let State A be the state that occurs when Customer A has just begun service at Server A and Customer B has just begun service at Server C. Similarly let State B be the state that occurs when Customer B has just begun service at Server B and Customer A has just begun service at Server C. One of these two states will inevitably follow whenever any queueing occurs. Suppose then that some queueing occurs. Since one of the two possible states follows each wait, a state is sure to be repeated after no more than two waits. Since the future behavior is determined given the state at any instant, the system behavior is periodic. If there is no queueing, the two peripheral servers must be synchronized, so that Customer A completes m cycles in exactly the same amount of time in which Customer B completes n. Hence after m 102 cycles by Customer A, and n by Customer B, the original system state will be repeated. Since the system behavior is periodic, we only need to compute the times of successive system events until we see an earlier state repeated. We know then that the sequence of events between the two occurrences of that state will be repeated indefinitely. However, it is possible to have an arbitrarily long sequence of system events between repetitions of a system state. It is desirable to have a computationally efficient algorithm for determining when a state will be repeated. Algorithm CYCLES, shown below, fills this requirement. The following paragraphs provide some background and motivation for the development of this algorithm. They do not, however, give a proof of its correctness . In the following discussion "a" and "b" are assumed to be greater than "c". (If either "a" or "b" is smaller than "c", the system is relatively straightforward.) Suppose we begin observing a system in State A. Let us define an "A-Cycle" as the sequence of events: 1. Customer A begins service at Server A. 2. Customer A completes service at Server A and goes to C. 3. Customer A completes service at Server C and returns to A. Similarly, let us define a "B-Cycle" as the sequence: 1. Customer B begins service at Server C. 2. Customer B completes service at Server C and goes to B. 3. Customer B completes service at B and returns to C. (Note that each type of cycle is defined as if we begin observing the system in State A.) If a B-Cycle ends within c time units prior to the end of an A-Cycle, Customer B will find Server C busy and will have to wait. Upon completion of service by Server C, the system enters State A. If a B-Cycle ends at a time greater than c and less than 2c prior to the end of an A-Cycle, Customer B will find Server C idle and begin service immediately. Upon completion of service by Server A, Customer A will find Server C still busy with Customer B and will have to wait. Upon completion of service for Customer B the system enters State B. Thus after beginning in State A, the queueing will first occur when a B-Cycle ends prior to the end of an A-Cycle by an amount less than 2c. To determine the first occurrence of either State A or State B following State A, we must find the smallest values of positive integers m and n that satisfy the equation: = m(a+c) - n(b+c) .£ 2c If m 1 and n 1 are the solutions we define the variable d as: d = m' (a+c) - n' (b+c) We see that d can be computed as REM(m*(a+c), (b+c)) where REM is the remainder of the first argument divided by the second. If d is exactly the system is conflict free, with State A returning to State A after m 1 A-Cycles. If d is in the range between and c, then State A returns to State A after m' A-Cycles, with Customer B waiting for d time units. If d is in the range from c to 2c, then State A goes to State B after m' services by Server A, with Customer A waiting for (2c-d) time units. Customer A makes m*-1 visits to Server C during this interval. By reversing the roles of a and b we can determine which state will follow State B and when, in the same way we have just done for State A. Algorithm CYCLES can be used to compute the value m' for any given values of a, b, and c. The inputs to CYCLES should be (a+c)/c and (b+c)/c. CYCLES returns the value of m' . From m' we can compute n' and d, from the above equations. Algorithm CYCLES returns a value that is the number of times Customer A will visit Server A prior to one or the other customer's having to wait, given that the system started in State A. Hence it gives the time from State A until the next occurrence of either State 103 A or State B, whichever comes first. To determine what happens following State B we can use the same algorithm reversing the two arguments. (The entire system is symmetrical in terms of A and B.) It can be seen that Algorithm CYCLES will always terminate with no more than L0G2( (b+c)/c) recursions. On each successive call, the second argument will be no more than half the value of the preceeding call. When the second argument is less than two, the algorithm terminates. Algorithm CYCLES (A,B) Let A1 = REM(A,B) If A1 i. 2 then Return 1 Else if A1 i. B/2 then CYCLES (A1 - REM(B,A1),A1) « B A1 Return Else Return CYCLES (REM(B,B-A1),B-A1) « B - 2 B-A1 4.0 General Properties Having seen how to determine the future behavior of any sepcific system, we turn our attention to general properties of all such systems. One question of interest is "For what values of a, b, and c is the system conflict free?" The answer is provided by the following theorems. Theorem 1 . For any real c if a = (2n-1)c b = (2m-1)c where m and n are positive integers then the system is conflict free. Proof: Suppose the system starts in State A. Then until the first conflict, Customer A's arrivals for service by Server C come at multiples of (a+c) minus c. Customer B's arrivals come at multiples of (b+c). Since both a and b are multiples of c, requests for Server C come only at multiples of c. Thus it is impossible for the first wait to occur due to a customer's arriving at Server C during service for the other customer. The only possibility is that both customers arrive at Server C simultaneously. This will occur if and only if there are positive integers i and j such that i(2nc) - c = j(2mc) or 2ni - 1 2mj But this has no solution, because the left side is odd and the right side is even for any choice of i and j. Hence no conflict can occur following State A. A similar argument shows that no conflict can occur following State B. For initial conditions other than State A or State B, a wait may occur. However, following a wait the system will have to enter either State A or State B. Hence after the initial transient no conflict can occur. [Q.E.D.] Theorem 2. For any real c if aO = (2n»-1)c bO = (2m'-1)c and if m* and n' are relatively prime, then they are the smallest positive integers, m and n, for which m(a0+c) = n(b0+c) Proof: Suppose there are smaller values m" and n" for which m"(a0+c) = n"(b0+c) then m"(2n') = n"(2m') But this is impossible since m" and n" were assumed to be smaller than m* and n' , and m' and n' were assume to be relatively prime. [Q.E.D.] 104 For a system described by the parameters in Theorem 2, the overall cycle consists of m' A-Cycles in parallel with n' B-Cycles. During this overall system cycle there will be no conflicts, and State A will lead to another State A after m'(aO+c) time units. However there will be one "near miss" , as shown by the next theorem. Theorem 3. Let aO = (2n'-1)c bO = (2m'-1)c with m' and n' relatively prime. Then at some point during the overall system cycle starting with State A, Customer A will arrive for service at Server C just as Customer B is completing service. Thus State A leads to State B at this point, and State B will lead to State A. Proof: Customer A's i'th request for service at Server C will come at time i(2nc)-c. Customer B's j'th request will come at time j(2mc). Customer A will arrive at Server C just as Customer B is leaving if and only if there are positive integers i and j such that the following equation is satisfied: i(2nc) - c = j(2mc) + c i(2nc) - j(2mc) = 2c in - jm 1 Since n and m are relatively prime, 1 is their greatest common divisor. We know from Euclid's algorithm [3] that there are positive integers i and j that satisfy the equation. Thus for the system described, State A leads to State B after i services by Server A. Reversing the roles of A and B we can compute the number of services by Server B after which State B leads to State A. During the entire system cycle there are no conflicts. [Q.E.D.] 4.1 Graphical Representation In characterizing systems in terms of the parameters a, b, and c, it is useful to have a graphical representation of the systems. In the following discussions we let our time unit be c. With this convention, we can represent any system by two parameters, a and b. Thus we can represent any system as a point on an a-b plane. What we have seen so far is that there is an array of conflict free points in the a-b plane. These are the points whose coordinates are positive odd integers. If the odd integers are relatively prime, the point has the additional property of directly specifying the number of A-Cycles and B-Cycles in the overall system cycle. Next we shall see that each of these special points is the beginning of a line, all points of which represent conflict free systems. Furthermore all systems represented by points on this line have the same pattern in their overall system cycles. In the following theorem we see how to determine the line segment of conflict free points given a minimal conflict free point (aO,bO). Theorem 4. Let aO and bO be given by aO = (2n-1)c bO = (2m-1)c where n and m are positive integers, and are relatively prime. Then the system represented by a1 = aO + & a b1 = bO + Ab with &a Ab is conflict free if and only if £>a and Ab are greater than zero. Proof: Starting from State A, until a conflict occurs, Customer A's i'th request for service at Server C comes at i(aO+&a+c)-c. Customer B's j'th request comes at j(bO+Ab+c). A conflict will occur if and only if there are positive integers i and j such that -c < [i(aO+£>a+c)-c] - [j(bO+Ab+c)] < c -c < [i(2nc+n£b/m)-c] - [j(2mc+Ab)] < c < (in-jm)(2c+Ab/m) < 2c 105 < in - jm < 1 + £b/(2mc) If &b > 0, this is impossible because the term on the right is a fraction between and 1 while the term in the middle is an integer. Thus the system represented by (a1,b1) is conflict free. If, on the other hand, -2mc 0) 120 r^\ H \J f= c 01 o 4-1 •H CO ■u >> nj en M 3 P. OC 3-H '4-t 4J 01 g CO o IS t->x: u {H 4J H fl) H Xi 4J oo <-a 1 co Xl -H o *o •n 01 e co 0) CO 4J Q) CO rH >% 01 CO t-l 1 *""* 1 cn i ^ 1 N— ' CO •H •o 01 XI CO O <4 i-> 0) £2 1 iH CO tH •o 01 CO t0 01 r-t 01 u •-* I M 01 jo o> o co in I „ M e co 0) t-l 4J >rj « g 1 CN v! 01 0) co H C H 0) 1 *"** 1 rH M 0) 01 M-l CO O 1 M •o to C -H 0) *o i clock iXt time 60 e •H P. P. CO 3 CO •a c 0) - advance to n< event l 4*" \ o 1 ^ 1 ^"* 3 PM U XI 0) o oi •n co 0) 1 ** i-H *\ S3 cu CJ 0) CO CO 0) iH 01 u \ ) S CO <4H 1 01 60 CO a. •o c a> - o 1 H 1 N^ > § « AS ° 3 H C H 01 1 y "" V 1 S ~"' X> O i->H CO X! > O -H 4J U CO M X> CO >—• 4J i-i a CO CO »»H 1 4-1 01 l-i 60 CO CO ■u a. CO 1 - r~\ v_y CO n 4J CO CO 4-1 cu C 9 ai cr > 0) w n !>. X» Xt o i-> rH 01 co TJ 5s co 1 01 •• Xi S~\ 4-1 i-H cm «tH ^^ 01 O •• 6 4-> u •H U 0) 4-1 CO ■H 1 60 x: rH M a a B C -H 5 4J cu •H o. o C 4J X ex rH CU CO 4-1 CO pC4 > >» 3 01 1 t>i 0) O 4J CO a •H at :=> CO CO c 6 s u g Pu, I-H 01 •H cu 3 O Q > 4-1 4J CO CO CO 3 4J 4J 4-1 01 01 >N • >4-4 CO CO CO rH O CO CM 0) CU at 00 a 3 3 X* CO W XI O* o* cr o Xi m 4J 01 01 at T-) o CO S3 M u rl •t-l 4-1 o c ^ u M o o 4-1 &H H cO 4-1 C>4 4J •^ s~\ /-N CO H CO o CM en m XI c t-H rH rH >-• — f x-^ • • • • CO M 0k A • • /^N •-N 4J /— \ •*"N ^-s y-*s 00 o\ c CM CO Irt oo rH t-A at > w -5.2 s oi co 4J XI CO o CO 121 A record node has the following format: (pointer) predecessor link user- job number successor link where (pointer) is the record number For example, (201) Dummy queue-head 1 f (202) 15 QH 201 — • Dummy queue-head 2 ill 201 f (209) 1 Ilii 9 V7/S\ Dummy queue-head 9 (15) 202 3 202 To put the above queues in the form of an array, we have: Record nodes dummy queue-heads RECORD NO. PRE 15 JOB NO. sue 201 1 2 1 2 201 . 202 3 202 2 //// 1 15 //// 15 209 ///// 209 FIGURE 3. Doubly-linked queue structure in the model. 122 4(a) Generation of a random variate x from a discrete C.D.F.. fr is a random number uniformly generated between and l.J x x' FIGURE 4(b) Example of the error of approximation by the discrete C.D.F.. (x' is the estimation of x.) *-t 123 Job Arrival ^ CM- queue CPU •■DISK "Job Completion FIGURE 5 A simple 1-CPU, 1-disk uniprogramming system. Job Arrival CPU DISK disK^ rotational seek delay ' r Job Completion FIGURE 6 A simple system with disk-seek and latency time. 124 1.5 - 1.0 - 0.5 - A : System overhead CPU time in 100 seconds, (total time = 1,000 seconds) "*"• Swapping rate due to time-slicing, in 100 times per second. 0.0 0.01 0.02 0.05 0.08 0.1 0.3 0.6 1.0 FIGURE 7( a ) The system overheads for various quantum sizes. Quantum Size (sec) 1.5 - 1.0 _ 0.5 _ 0.0 *■-- ,-./ °" Turnaround time in 10 seconds. *: CPU-queue length in number of jobs. 0.01 0.02 0.05 0.08 0.1 0.3 0.6 1.0 Quantum -*Size (sec) FIGURE 7(b) Turnaround and CPU-queue length for various quantum sizes, 125 * O* 'furnaround time in 10 seconds. x: Response time in seconds. ^MPL 13 5 7 9 11 13 FIGURE 8(a) Turnaround and response time under different Multi-Programming Levels . 1.0- 0.5- ^ ; Throughput- in 1,000 jobs per hour. ■* ' CM-queue length in 100 jobs. 11 13 MPL FIGURE 8(b) Throughput and CM-queue length under different Multi- Programming Levels. 126 i.o- 0.5" I: CM utilization 2 : CPU utilization 3; Channel utilization - - - _» 3 MPL 13 5 7 9 11 13 FIGURE 8(c) System utilizations under different Multi-Programming Levels, 28.0- 20.0 10.0' O : Paging rate in pages per second. A: Channel overhead in 10 seconds. CPU overhead in 10 seconds. 0.0 ■ — ' ■ ' ' 1 " *- MPL 13 5 7 9 11 13 FIGURE 8(d) System overheads under different Multi-Programming Levels. 127 Utilization 0.5- 0.4. 0.3 0.2- 0.1« 0; CPD utilization 4 : CM utilization x ; Channel utilization No. of °'°it IB* 20" 25 30 35 40" 50 60 FIGURE 9(a) System utilizations vs. number of terminals Termin als 5.0- 4.0" 3.0 2.0 1 1.0- 0.0 o: Response time in seconds. *-No. of interactions completed, in thousands. ^ Swapping rate, in jobs per seconds. 4.:CI4-queue length in no. of jobs. No. of Terminals 10 15 20 25 30 35 40 50 60 FIGURE 9(b) System overheads and response time vs. number of terminals. 128 Optimal Selection of CPU Speed, Device Capacities, and Allocation ot Files with Variable Record Size Kishor S. Trivedi and Robert A. Wagner Department of Computer Science Duke University Durham, N.C. 27706 This paper extends a previous model for computer system configuration planning developed by the authors. The problem is to optimally select CPU speed, device capacities, and file assignments so as to maximize system throughput subject to a fixed cost constraint. In our earlier paper we assumed that the record sizes for all files are equal, the block sizes of all devices are equal ana these two in turn are equal. In this paper we extend our earlier results to allow each file a distinct record size and each device a distinct block size. Key words: Capacity planning; configuration planning; file assignment problem; optimization; performance evaluation; performance- oriented design; queuing networks. 1. Introduction This paper is an extension of a model for computer system configuration planning developed in [1]. We are interested in optimally selecting the CPU speed, the capacities of the secondary storage devices, and the allocation of a * This work was supported in part under the National Science Foundation grant number MCS 78-22327 and the National Library of Medicine Program Project grant number LM-03373. given set of files across the secondary storage devices. The objective of optimization is to maximize system throughput subject to budgetary limitations. In [1] , it was shown that any relative maximum of this complex nonlinear programming problem is also its global maximum. A technique for significantly reducing the computational complexity of the optimization problem was developed. Finally, an interesting subproblem, known as the file assignment problem, was also discussed . 129 discussed . Two simplifying assumptions were made in our earlier model. The first assumption was that all secondary storage devices use the same fixed block size. In practice, block size is likely to vary with the type of device. The second assumption was that a logical file I/O request always generates a physical I/O request. This assumption may nothold in practice. Suppose that the logical record size of a file is smaller- than the physical block size of a device so that several records are packed into a single physical block. Then a logical I/O request will give rise to a physical I/O request only if the block in which this record resides is not already resident in main memory. Section 2 develops an extension of our earlier model assuming that files are accessed in a sequential fashion. Each device may have a different clock size and each file may have a different record size. Any relative maximum of the modified optimization problem is also its global maximum. This optimization model has one variable per file-device pair (giving the amount of file i to be loaded onto device j). The resulting problem has too many decision variables to be solved quickly. However, we have developed a simple file loading rule which reduces the number of decision variables from one per file-device pair to one per device (specifying the size of that device). The file loading rule is then used to ietermine exactly what parts of each file are to be used to fill each device. This loading rule is oeterministic - no elaborate computation is needed to use it. We have shown in [1] that this file loading rule is always optimal. This rule is presented in more detail in Section 2. Section 3 presents an example of use of our model. We a several parameter capacitie conf igura assignmen system i queuing n type [2]. re interested in de hardware conf s (CPU speed and s) as well as tion parameters t) . The throughp s modeled using etwork of the centr This is in con termining iguration device software (file ut of the a closed al server trast to several efforts ignoring queuing delays [3,4]. Other authors have used open queuing network models which are more suitable for computer-communication network design [2,5,6,7]. For hardware configuration planning, several authors have used decision models of closed queuing networks [8,9,10,11,12,13]. Similarly, several models of file assignment are available [14,15,16]. Apart from our earlier paper [1], the work closest to the present paper is that by Arora and Gallo [17]. Our model may be thought of as a refinement of their model that does not require the approximate decomposition into two submodels that they proposed. In fact, we derive an exact decomposition of the design problem. 2. The Basic Model Inputs to our model consist of workload parameters and device cost/performance characteristics. The workload is specified by the resource requirements of a typical program. Let W be the total number of instructions required, on the average, to complete the execution of a program. There are f files numbered 1, 2, ..., f. Associated with file i is its logical record size r. (in bytes), its total size S. (in bytes) , and an activity n. where n. denotes the total number of logical I/O requests directed to file i, on the average, during the execution of a typical job. There are m secondary storage devices and the cost of device j (1 < j <_ m) is assumed to be proportional to its capacity K. (in bytes), that is, COSTDEVICE- = C- K . . The total cost of the I/O devices is given by m F(K) = 2 C K.. The block size of j = l 3 3 device j is d . bytes. The CPU cost F. is modeled by a power function of its speed b n (in MIPS), so that 130 W = c o b o • We assume that for each j (1 * d. t tor eacn j h i (1 < i < f), < j < m) and each i is exactly divisible by r^ and that files are accessed sequentially. Since d. / r. logical records of file i are stored (blocked) on a single physical block of device j, the probability that a request for a logical record of file i on device j generates an I/O request to device j Let X. . denote is r i / dj ( < 1) the number of bytes of file i assigned to device j. Thus a fraction X^^ . / S i of all logical requests to file i will be directed to the portion of file i residing on device j. Only r^ / d- fraction of these requests generate a physical I/O request to device j. Therefore, the total number of physical I/O requests, W . , directed to device j is given by f X Ai r ± W j = * n i S^cT J 1=1 l j Let t- denote the average service time (to fetch one block of size d.) from device j. Then y- = W- t- is the total time demanded by a typical program on device j. Let t Q be the average time to execute an instruction so that t n = 1 / b n « Then typical program. The average service times for the I/O devices are assumed to be fixed parameters in this model while the CPU instruction delay t n is a control variable. In order to make the queuing model computationally tractable, we assume that if the scheduling discipline of a device is FCFS (First Come First Served), then the service time distribution is exponential. Any dif ferentiable service time distribution is permitted provided the scheduling discipline is either PS (Processor Sharing) or LCFS-PR (Last Come First Served Preemptive Resume) [18]. The number of jobs circulating in the central server model will be denoted by n and is also referred to as the degree of multiprogramming. We consider n to be a fixed parameter but a discrete search for the optimal value of n can be performed if desired. The throughput of the system is given by U Q / y Q [19; pp. 252-253], [20] where Un = yn G(y;n-1) / G(y;n) and m k • m G(y;n) = z{ H y - J | s k,=n } j=0 J j=0 J Then the throughput T(y;n) is given by T(y;n) = G(y;n-1) / G(y;n) . Instead of maximizing throughput, we will set up our problem to minimize the reciprocal of the throughput, denoted by z(y;n). Now the problem of the optimal selection of the CPU speed, device capacities, and file assignments (CSDCFA) can be stated as: ^0 = W fc is the total CPU time required by a 131 CSDCFA Problem: .in z(y;n) (la) With an appropriate change of variables, the optimization problem (1) above reduces to problem (9) in [1] . Therefore, the following results are easily derived: Subject to ( i=l, 2 , . . . , f ; j=l,2,...,m) THEOREM 1 m F o (w o t o ) + F(K) - C0ST (lb) (lc) (Id) t . f y j = 4 i=i ni X ^ ri 7 Sj - (le) t > ° yj > 0, Kj > X ij > ° (If) dg) (lh) The constraint (lb) assures us that the capacity of each device is not exceeded by the file assignment, and the constraint (lc) guarantees that every byte of every file is placed on some device. The CPU delay t Q , aevice relative utilizations y-(l < j <_ m) , the device capacities K.(l < j < m) , and the file assignment variables X. .(1 < i < f,l < j < m) are the i] — — — J — decision variables for a total of rn*(f+2) + 1 variables. The values of V dj(l < j < m) , s i' n i' r. (1 < i < f ) , W-, n, m, f, and COST are assumed to be fixed parameters for this problem. The CSDCFA problem (1) is a convex programming problem (assuming that c( n > ) , hence any relative minimum is also its global minimum. It should be noted that X i -'s, K-'s, and Sj^'s are now measured in bytes rather than blocks as in [1]. Similarly, C. is the cost per byte of device j. A special case of problem (1) is the file assignment (FA) problem where the device capacities K 1 ,K 2 ,...,K m and the CPU delay t Q are fixed parameters. As a corollary to theorem 1, we obtain the convexity of the FA problem. COROLLARY 1: problem is problem. The file assignment a convex programming Any solution to the CSDCFA problem can be shown to have an ordered characteristic as in [1] . Specifically, files may be ordered by non-increasing n. / S- value (access probability). These ordered files can then be assigned to devices, filling each device in non-increasing order of per-byte cost (C •) . We have shown that every optimum solution necessarily has this form. Since this form involves as decision variables only the device capacities, we are able to reduce the number of decision variables from m*(f +2) + 1 to 2m + 1 . 132 3. An Example We now consider a numerical example illustrating the use of our model. Assume that there are 10 files with the parameters as specified in Table 1. The total number of instructions to be executed per program is one million, the budget is 3 million dollars, and the main memory cost per degree of multiprogramming is assumed to be $50,000. The input data for the I/O devices and the CPU is given in Table 2. The block sizes of the two devices are d, = 3,840 bytes and d„ = 1,920 bytes. This latter, storing one block per f moving one device from the most believe that the cost ov throughput d 1%. extra cost is the cost of additional maximum-size ile, plus the cost of block per file and per the least expensive to expensive device. We , for practical problems, errun is trivial, and the egradation is less than The model reported in this paper was implemented as a computer program by Dr. A.K. von Mayrhauser. The design model was exercised for each degree of multiprogramming starting from 1 up to 8. The resulting optimal values of throughput, device capacities, and CPU speed in MIPS are given in Table 3. Scanning the table we determine the optimal degree of multiprogramming to be 7 . The optimal file assignment, though an output of our model, is not reproduced here. The total CPU time n 370/165 to compute t values above was 1.68 sec many design options, variations in the numb speeds of, I/O devi attempted; the detailed device capacities, CPU sp of multiprogramming placement can be done au at low cost, using this technique. eeded by our he solution ends. Thus, such as er of, and ce can be cnoice of eeds, degree and file tomatically , optimization Note tnat device capacities and file allocation variables are assumeo to be continuous variables in this paper while in practice they are discrete. The effect of this assumption has been analyzed in [21] and will be further explored in a future paper. Briefly, these papers show that discretizing the continuous solution according to our methods may cause a throughput degradation of a factor < 1 + (m-l)max {n./S.j , and a cost i overrun o *(f*C max v max of at most [1] [2] [3] [4] [5] [6] References K. S. Trivedi, R. A. Wagner, and T. M. Sigmon, "Optimal Selection of CPU Speed, Device Capacities, and File Assignments," JACM , July 1980. L. Kleinrock, Queuing Systems Vol . II ; Computer Applications , Wiley-Interscience, New York, 1976. C. V. Ramamoorthy and K. M. Chandy, "Optimization of Memory Hierarchies in Multiprog rammed Systems," JACM , Vol. 17, No. 3 (July 1970), pp. 426-445. C. K. Chow, "On Optimization of Staging Hierarchies," IBM Journal of Research and Development , Vol. 18, No. 3 (May 1974) , pp. 194-203. K. M. Chandy, J. Hogarth, and C. H. Sauer, "Selecting Capacities in Computer Communication Systems," IEEE Trans , on Soft . Eng . , Vol. SE-3, No. 1977) , pp. 290-295. 4 (July < f+m >*( C max- C min>> S. T. Chanson and P. S. Sinha, "Optimization of Memory Hierarchies in Multiprog rammed Computer Systems with Fixed Cost Constraint," Technical Report, Department of Computer Science, University of British Columbia, 1979. [7] S. Mahmoud and J. S. Riordan, 133 [8] [9] [10] [11] [12] "Optimal Allocation of Resources in Distributed Information Networks," ACM Transactions on Database Systems , Vol. 1, No.~T (March 1976), pp. 66-78. W-W. Y. Chiu, "Analysis and Applications of Probabilistic Models of Multiprog rammed Computer Systems," Ph.D. Dissertation, Department of Electrical Engineering, University of California, Santa Barbara, California, 1973. D. Ferrari, Performance Computer Systems Evaluation, Prentice-Hall, Englewood Cliffs, New Jersey, 1978. S. K. Kacnhal and S. R. Arora, "Seeking Conf igurational Optimization for Computer System Configuration Planning," Proceedings ACM Annual Conference , 1975, pp. 96-101. K. S. Trivedi and R. E. Kinicki, "A Mathematical Model for Computer System Configuration Planning," in Performance of Computer Installations , D. Ferrari (ed. ) , North-Holland, Amsterdam, 1978. K. S. Trivedi and T. M. Sigmon, "A Performance Comparison of Optimally Designed Computer Systems with and without Virtual Memory," Proceedings , 6th Annual International Conference on Computer Architecture, 1979. Vanderbilt University, Nashville, Tennessee, 1977. [16] T. G. Price, "Probability Models of Multiprog rammed Computer Systems," Ph.D. dissertation, Department of Electrical Engineering, Stanford University, Palo Alto, California, 1974. [17] S. R. Arora and A. Gallo, "Optimization of Static Loading of Multilevel Memory Systems," J ACM , Vol. 20, No. 2 (April 1973) , pp. 307-319. [18] K. M. Chandy, J. H. Howard, and D. F. Towsley, "Product Form and Local Balance in Queuing Networks," J ACM , Vol. 24, No. 2 (April 1977), pp. 250-263. [19] P. J. Denning and J. P. Buzen, "The Operational Analysis of Queueing Network Models", ACM Computing Surveys , Vol.10, September 1978, pp. 225-261. [20] T. Giammo, "Extensions to Exponential Queueing Network Theory for use in a Planning Environment," Proceedings of the IEEE COMPCON , Fall 1976. [21] R. A Wagner and K. S. Trivedi, "Hardware Configuration Selection Through Discretizing a Continuos Variable Solution," Proceedings of 1980 Int . Symp . on Comp . Per . Mod . Meas . and Evaluation, Toronto, Canada. [13] K. S. Trivedi and R. A. Wagner, "A Decision Model for Closed Queuing Networks," IEEE Trans , on Soft . Eng . , Vol. SE-5, No. 4 TJuly~T979) , pp. 328-332. [14] D. V. Foster and J. C. Browne, "File Assignment in Memory Hierarchies," in Modeling and Performance Evaluation of Computer S"ys tern's^ Beilner and" Gelenbe (eds.), North-Holland, Amsterdam, 1976. [15] D. V. Foster, L. W. Dowdy, and J. E. Ames, "File Assignment in a Star Network," Technical Report 77-3, Systems and Information Science Department, 134 Table 1: File Activity Profile record size No. of records Av . Activity Count r t (bytes) s i/ r i "i file 1 480 84,000 2 320 125,000 3 480 84,000 4 960 84,000 5 960 521,000 6 320 188,000 7 320 500,000 8 320 625,000 9 320 625,000 960 1,125,000 13.86 14.85 9.9 7.92 28.71 4.95 3.96 3.96 3.96 6.93 Table 2: Device Input Data Device #| Label Cost Curve |coeff. exponent Speed 1 fc j Controller | Cost 1 | CPU |$1,147,834 0.55309 1 Variable 1 1 | I/O |0. 0030575 1.0000 | 0.00378 1 $24,496 2 | I/O |0. 000155 1.0000 | 0.02938 |$19,910 Table 3: Output of the Design Model degmul TPUT n jobs/sec Capacity Capacity CPU speed w l "2 (million bytes) (million bytes) 1 0.986 201.28 2203.2 2.599 2 1.293 201.28 2203.2 2.479 3 1.423 212.12 2189.48 2.29 4 1.487 214.68 2186.92 2.161 5 1.518 209.88 2191.72 2.07 6 1.532 205.8 2195.8 1.999 7 1.535 201.28 2203.2 1.92 8 1.528 201.28 2203.2 1.816 135 Performance Prediction Techniques - 2 137 MVS Performance Prediction Using Mechanically - Generated Queuing Models R. J. Feil, New York Telephone, New York, NY B. A. Ketchledge, AT&T, Piscataway, NJ A central issue in capacity planning for IBM MVS environments is the prediction of system performance under varying workload and configuration assumptions. This paper addresses the issue of performance prediction in the context of a general approach to capacity planning for MVS environments. In particular the paper covers : a. Definition of an RMF-based measurement strategy for MVS environments. b. Software for mechanical generation of BEST/1 queueing models of MVS systems. (BEST/1 is a proprietory product of BGS Systems, Inc.) c. Calibration and use of the resulting BEST/1 models for capacity planning studies. An actual case study will be presented. Key Words: Capacity, Planning; queueing models. 1 . Introduction This paper discusses an approach to capacity planning for MVS environments which has grown out of a Bell System Capacity Planning project. Before exploring this approach, it is worthwhile to overview the project itself. The Capacity Planning project has as its goal the definition and implementation of a Capacity Planning Methodology. The current phase of the project involves the development of a Capacity Planning and Management System (CPMS) for use in IBM - compatible mainframe environments employing MVS as the System Control Program. This effort focuses on developing techniques to: 1. Establish Current Status: Characterize current status of the mainframe environment in terms of workload, hardware utilization, and service. 2. Verify previous plan: Isolate and resolve significant deviations from a previously developed capacity plan. 3. Generate Future Workloads: Forecast and characterize new workloads, as well as growth trends in existing workloads. 4. Forecast Processing Performance: Predict future processing capabilities taking into account workload and hardware/software changes. 5. Generate Recommendations: Generate and report capacity planning recommendations to management . 139 During 1979, a project team developed and implemented major portions of an IBM - compatible methodology. Prior to the formal start of the trial, two commercial tools were selected as potential aids for capacity planning. SAS, a proprietary product of the SAS Institute, Incorporated, is a data manipulation, reporting and statistical analysis tool. BEST/1, a proprietary package of BGS, Incorporated, is a performance prediction tool. Evaluation of these tools in the context of the CPMS was a major goal of the effort. The specific objectives of the project were as follows: 1 . Develop a workable approach to capacity planning (CP) in an IBM MVS environment using available tools and packages. 3. Execution of a series of CP studies. These studies allowed the CPMS software to be further defined and the operational characteristics of the tools to be evaluated. In the case of BEST/1, a series of preliminary validation studies were carried out. These studies had the objective of verifying the analytic algorithms used in BEST/1 and its ability to model various data processing situations. The intent of this paper is to discuss two major aspects of the MVS Capacity Planning Methodology developed thus far: 1 . Software and techniques to characterize the current status of the computer system. 2. Evaluate the usability, flexibility, and validity of the data manipulation tool, SAS, and the performance prediction tool, BEST/1. 3. Test the methodology in a series of actual CP studies. The main effort to date has concentrated on systems with these characteristics : 1 . MVS or MVS/System Extensions (MVS/SE) as the System Control Program. 2. IBM or an IBM - compatible mainframe as the CPU. 2. A methodology and supporting software for prediction of processing performance. In particular, Section 2 of this paper discusses a SAS - based, 'Performance Management Subsystem' (PMS) which employs data from the IBM Resource Management Facility (RMF) to report on status of the computer system. Section 3 of this paper extends this discussion into the question of forecasting processing performance. Software to reduce raw measurement data (SMF and RMF) and mechanically generate BEST/1 queueing models is described. Section 4 discusses a case study in which the model generation software was employed. Workloads including batch, TSO, IMS/DL-I, and IMS/DC. 2. Establishing Current Status The approach used in the development and trial of the methodology using the CPMS software has been threefold: 1 . Definition of a workable approach to each of the five functional CP areas listed above. Techniques to characterize the current status of the computer system stand at the foundation of the CP methodology. Beyond the general goal of providing an answer to the question 'Where are we today?', this area of the methodology addresses several specific issues: 2. Integration of the commerical tools (SAS and BEST/1) into these approaches. 1 . What data sources and metrics should be employed to measure workload, utilization, and service? 140 2. How are functional workloads to be defined and measured? 3- How should the raw measurement data be reduced, stored, and displayed so as to allow its use in bottleneck identification and resolution? 4. How can the reduced measurement data be massaged to provide input to the portion of the methodology dealing with performance prediction? The approach to characterizing current status is based on use of a widely-available measurement tool (SMF/RMF) and an available data manipulation tool, (SAS). (Characterization of IMS - related workloads involves additional tools). Highlights of the methodology are: 1 . Definition of a standard MVS Installation Performance Specification (IPS) across all systems in the data center. MVS Performance Groups (PG's) are assigned functionally. The methodology is implemented through use of SAS software written to process raw SMF-RMF data. The PMS software consists of two functional modules: 1 . An SMF-RMF data extraction program which allows the user to process SMF-RMF Record Types 0,4, 8, 34, 40, 70-75 and generate one or more SAS datasets for each record type. 2. A set of SAS - based report generation programs which selectively process the extracted SMF-RMF data and generate both technical and management-oriented reports. The example reports on the following pages illustrate some of the salient aspects of this approach to characterizing current status. The reports were generated using SMF-RMF data collected from an IBM 370/158-AP processor supporting both TSO and batch users. Workloads corresponding to various functionally - related PG's are grouped together into 'Logical Workloads.' This classification is maintained consistently across all systems to be studied. The concept of a PL/I-generated 'Workload Resource Table' which ties together workload demand (by 'Logical Workload') with service and hardware utilization measures provides a vehicle for interpretation of SMF/RMF data into modeling terms ('Predict Processing Performance'). This concept is discussed in Section 3. Existing workloads are characterized (on an ongoing basis) using RMF data only. RMF Service Units play a principal role in this characterization. These units are also used to characterize consumption of hardware resources. SMF workload data is used only in modeling ('Predict Processing Performance*), 141 The first report displays the weekly average CPU utilization of the AP processors by two hour intervals (referred to as 'Durations'). Based on data extracted from the RMF Type 70 record, this report allows the user to easily identify intervals of peak CPU usage . The next report is a SAS 'star chart' (generated using the SAS procedure PROC CHART) which shows the distribution of batch job terminations CTRNSTERM') by two-hour intervals. The source for this data is the RMF Type 72 record. Only those Type 72 record fields corresponding to batch Performance Groups were used in constructing this chart. This selection is accomplished via an internally coded table of RMF Performance Group definitions. The following report shows the distribution of TCB CPU time (RMF Type 72 record CPU Service Units) across the various functional workloads. It is done in SAS 'pie chart' format. The next report is a SAS 'block chart' of CPU-I/0 overlap across two-hour intervals. On the left the various system states are displayed. 'WAIT' refers to CPU Idle; 'OVERLAP' refers CPU and Any Physical Channel Busy; 'CPU BOUND* implies CPU Busy and All Channels Idle; 'I/O BOUND' implies CPU Idle and Any Channel Busy . The final report displays physical channel usage by two-hour interval. It is used to identify potential I/O load imbalances. These and other reports from the PMS are reviewed regularly by both tuning and capacity planning personnel. As such they provide support for day-to-day tuning efforts as well as characterization of current system status. 3. Predicting Processing Performance As discussed earlier, the effort to develop software and a methodology for determining current status of the computer system has been matched with a complimentary effort to define methods to predict processing performance. In this section an approach to predicting performance given a specific workload and hardware configuration is presented. Techniques to develop workload forcasts for input to the performance prediction process are not discussed. The approach developed for predicting performance is based on the use of the BEST/1 performance prediction tool supplemented with software to mechanically generate the models. Major aspects of the methodology are as follows: 1 . Performance prediction is performed using the BEST/1 package. 2. Use of BEST/1 is supported by an extensive set of modeling strategies. MVS batch, TSO, and IMS environments are supported. (IMS/DC models require additional measurement and data reduction tools) . 3. These modeling strategies are implemented in PL/I-based software which further manipulates the SAS-reduced SMF/RMF data and mechanically generates BEST/1 models. These models employ the •Logical Workload* concept discussed earlier. 4. Calibration and validation procedures for refinement of the mechanically-generated models are defined. Selection of BEST/1 for use as the performance prediction tool was motivated by several factors. Among these were the following: 1 . The demonstrated capability of queueing-based analytic models to give reasonably accurate predictions of system performance as evidenced by the performance literature over the past 5-10 years. 1 . The demonstrated capability of queueing-based analytic models to give reasonably accurate predictions of system performance as evidenced by the performance literature over the past 5-10 years. 142 ECOV-1412 3 O 3 i o. •k. x o •« •i *- X K • « ^ « « « • • : t • ■ ■ I ! I! *•« so •»'»•'# • » « « * 'J v -■!•*«» X * • * • *#«««««««*«« t •» 3 • I i i • * e • a * • 3 « • • <• • * « I ! I '' ■ ■ ' ' ' » « * « a • • ». 1 X 1 . «. «. *■ «■ «■ ••«■«• «■ *■ «■ 1 ■<• I ! • i I • i ! 1 • 1 1 ««««««««««« IX 1 «*«««««««*«• • • • • ' • ..l..i..J|.. *«*««««•«• • i ! i * 1 1 • • i : • 1 1 1 1 i • o ] I • ** • X 1 • • ] • - • « > e e < > e e « » e ! • » •* ■m i > • M • • 1 I I -4- 143 ?• o o a « i >- «J » I * o 5 ui *- a. • 3 3 x at => e P . ... i « « • <' • . • * I • 41 • I - • • 1 • « • P • « •\r: « Ml 144 o X o (9 Z « « « l 9 I o ( in m. C I S3 a. x u x »-. 3 ■* _ ■? •: X • I CI i •• 3 ' « 1* ♦ « « « o Ml X • « • 4 • * « 145 - ! o l i ♦ l a a si * MC 1 Bi v£ vlv """""""v a • la b| • via M "• ^JV""" ""' II— 2, . > IB B M Sv .^ ! * i« ■■«■■■! «■ IllSIIBll • m « I | N 'J'? kO VI -Oil ea> iv ! I | I via M • I » • -r— — i in , iv | IV V cm I v • I I v I — *-_ m | »vj -. | ♦ laaaiiBiilBtlii I ivia ■■■■aaaaai ; « I . la via I CM I • — — I *lv"v n I I ♦— -J I | v •via i i - i i ! V m |i**jv "!"" " i I* iBaaa • a b"b ai • I vlaaaaaaaaal.r II o^ v| ! Hi H riv til' — in I } • lal vli v If ■— ri i at i Ml o * via S e ! __ ^ | l V ■ II »»V S »H -r— — -f ir» v iv v at VI • —_ — — — cm • la ■ U-i- H Iv ! v 'I • la a a a a a a a at ! a> I I via aaaaaaaai , cm i .- J"|v"J I IV V 1ft i > I J-.-. 3 I | x v ji ys 4 i |._ — . ___ , n • laaaaaaaaaai »> viaaaaaaaaaai: • ID Iv -4 i . —J. in la ai p- via at • - ■ , i •» » v| I j I I J v ~* IV — — -r— — -T — — — „ -1 S I ih-I— L— li 2 s r.-i ? i "4 ! t i a at • II v ' vi a ai o j v | I il 5}' HI Ll-Ii 3 i «■ ►+ H iv >| f- I v ! 4 ' v I • mil 4 !i | v 9 ( S '-•" s a al • II v S Ivia a al • II T v i i f * i i vi : i -^ji J < IV 'Is V ji . i. "4 la a a a a I x.1 « a n b m i -I W ! Z ' o £ ! . ! «\i I >• ■ I X C I •« z or 7 > X I I I ! "I t „ I I. ■ !! - ■ ? ■ * * t I II !+ 4 t~'t' 147 2. The desire to provide a performance prediction tool with interactive capabilities and a good user interface. 3. The vendor-independent nature of the BEST/1 package, allowing for potential extension of its use into non-IBM DP environments. Development of software to mechanically generate BEST/1 models was likewise motivated by several factors: 1 . The need to make implementation of the CP methodology as cost - effective as possible. This implied that mechanization of appropriate portions of the methodology, such as model generation, should be carried out. 2. Successful use of BEST/1 requires development of suitable modeling strategies in which algorithms to manipulate raw data for creation of the BEST/1 model are defined. The algorithms tend to be specific to both the software environment modeled (e.g., TSO, IMS, CICS) and the type of measurement data used. Mechanization of these modeling strategies and their associated algorithms enables a broader range of CP analysts to employ BEST/1 in a consistent manner. The remainder of this section discusses first, the Model Generation Component (MGC) of CPMS. After discussing this software the process by which it is used to generate calibrated BEST/1 models is overviewed. This will provide background for the Case Study addressed in Section U. 3. A PL/I module ('BESTGEN') which accepts the OS files written by SASBEST and generates a BEST/1 model, among other outputs. The flow chart on the following page illustrates the structure of the model generation system. Raw SMF-RMF Data ( 1 ) from a specific time interval provides the input to the overall system. (The measurement interval is selected based on representativeness criteria outside of the scope of this paper.) The SMF-RMF data is processed by the SAS data extraction program (2) and results in creation of a 'database' (3) in SAS format. This database consists of a number of SAS datasets, each corresponding to a specific SMF or RMF record type and each with a specific name and set of variables. For example, one such dataset is named PMSDB.T70-CPU and contains variables such as CPUBSY0 and CPUBSY1. These variables indicate the CPU utilization on processor and 1 of the mainframe, respectively, at a given point in time. This data is extracted from the RMF Type 70 record. The SAS database of extracted SMF-RMF data provides input to another SAS routine, SASBEST (4), which creates OS files accessible tp PL/I. In additon, these routines perform sorts on the various SAS datasets so as to accomadate the PL/I model generation program. The SASBEST output files (5) are input to the model generation program, BESTGEN (10). The MGC of CPMS presently consists of three functional modules: 1 . An SMF-RMF data extraction program written in SAS. This an enhanced version of the analogous PMS module. 2. A SAS-based interface program ('SASBEST') which massages the extracted SMF-RMF data and generates conventional OS files. 148 1 1 ■Tie- 's -*- W tl 5 c: CL /- > c- UJ O M ! > c or 0; LU H-. > <3' Q U; c w V} vrt Z i a. Q>! o -o o: 2 ©p • • • © o vJ CO <^ r < «o ISSUE Rel-eosc i.i BRAWN TITU £ PMC . d l/£7£ :•'.' .7" «..' 149 no. of t-t; 2. The desire to provide a performance prediction tool with interactive capabilities and a good user interface. 3. The vendor-independent nature of the BEST/1 package, allowing for potential extension of its use into non-IBM DP environments. Development of software to mechanically generate BEST/1 models was likewise motivated by several factors: 1 . The need to make implementation of the CP methodology as cost - effective as possible. This implied that mechanization of appropriate portions of the methodology, such as model generation, should be carried out. 2. Successful use of BEST/1 requires development of suitable modeling strategies in which algorithms to manipulate raw data for creation of the BEST/1 model are defined. The algorithms tend to be specific to both the software environment modeled (e.g., TSO, IMS, CICS) and the type of measurement data used. Mechanization of these modeling strategies and their associated algorithms enables a broader range of CP analysts to employ BEST/1 in a consistent manner. The remainder of this section discusses first, the Model Generation Component (MGC) of CPMS. After discussing this software the process by which it is used to generate calibrated BEST/1 models is overviewed. This will provide background for the Case Study addressed in Section 4. The MGC of CPMS presently consists of three functional modules: 1 . An SMF-RMF data extraction program written in SAS. This an enhanced version of the analogous PMS module. 2. A SAS-based interface program ( • SASBEST ' ) which massages the extracted SMF-RMF data and generates conventional OS files. 3. A PL/I module ('BESTGEN') which accepts the OS files written by SASBEST and generates a BEST/1 model, among other outputs. The flow chart on the following page illustrates the structure of the model generation system. Raw SMF-RMF Data (1) from a specific time interval provides the input to the overall system. (The measurement interval is selected based on representativeness criteria outside of the scope of this paper.) The SMF-RMF data is processed by the SAS data extraction program (2) and results in creation of a 'database' (3) in SAS format. This database consists of a number of SAS datasets, each corresponding to a specific SMF or RMF record type and each with a specific name and set of variables. For example, one such dataset is named PMSDB.T70-CPU and contains variables such as CPUBSY0 and CPUBSY1. These variables indicate the CPU utilization on processor and 1 of the mainframe, respectively, at a given point in time. This data is extracted from the RMF Type 70 record. The SAS database of extracted SMF-RMF data provides input to another SAS routine, SASBEST (4), which creates OS files accessible to PL/I. In additon, these routines perform sorts on the various SAS datasets so as to accomadate the PL/I model generation program. The SASBEST output files (5) are input to the model generation program, BESTGEN (10). This contains a subset of the data items originally contained in SMF-RMF Record Types 4,34,40,70,72,73 and 74. Hence data containing resource demand by the various workloads (Record Types 4,34,40 and 72) as well as utilization and multiprogramming levels on the system (Record Types 70,73 and 74) is available to the BESTGEN program. BESTGEN also accepts a number of manually created files. These files supply control information to the program: 150 1. The ADDRMAP ('Address Map') (6) file contains a mapping from physical device addresses into 'Logical Addresses'. (These Logical Addresses correspond to BEST/1 'Servers' in the mechanically generated model). 2. The WKLDMAP, ('Workload Map') (7) maps actual workloads, as identified by RMF Performance Group Number/Period on the SMF-RMF Type 4 and 72 records, into •Logical Workloads'. Each Logical Workload corresponds to a BEST/1 Workload in the mechanically -generated model. 3. The SYSIN (8) file contains user - assigned values to various control parameters. These parameters control which phases of the BESTGEN program are executed, and let the user overide or modify various internal BESTGEN algorithms. 4. The ASGFILE ('Assignment File') (9) provides a vehicle for the user to give BESTGEN special environmental information. Among other things, the user can assign Capture Ratios to various workloads via this file. Likewise, the user can force BESTGEN to allocate the activity on various devices to one or more workloads via control cards. (It should be noted that the BESTGEN program includes routines to mechanically generate WKLDMAPs and ADDRMAPs suitable for most modeling applications. ) Execution of the BESTGEN program results in the creation of a number of output files and reports: 1. The BEST/1 model created by BESTGEN is written to the MODELA (11) file. It is a complete ready-to-run BEST/1 model. 2. The WRTFILE (1) file receives one 'Workload Resource Table' for each time period analyzed. For example, if the user specified in the SYSIN file that raw measurement data for the interval 5 A.M. to 11 A.M. was to be processed on a one-hour basis, one Workload Resource Table would be generated for each of these intervals: 5-6 A.M., 6-7 A.M., ... 10-11 A.M. The Workload Resource Table contains data derived from all of the various SMF-RMF Record Types processed. It contains all the information necessary to build a BEST/1 model for the interval to which it pertains. 3. The MSGFILE (13) is used to record informatory and warning messages generated by BESTGEN during its execution. 4. The various reports created by the BESTGEN program are written to the REPORT files (14). In its present form the BESTGEN programs generates the following reports: 1 . Workload Resource Table . This report summarizes the essential workload and hardware parameters of the system being modeled. Referring to the example on the next page, note that the columns of the report refer to the various Logical Workloads. The upper part of the report gives various information regarding each workload, such as: a. WKLD TYPE - The 'type' of workload, using BEST/1 terminology. 'BP' workloads are batch; 'TS' are time-sharing; 'TP' workloads are transaction processing. b. WKLDNAME - A descriptive name given the workload by the user in the WKLDMAP input file. c. CPU UTILIZATION - The estimated utilization of the CPU by this workload, as computed by BESTGEN using the SMF-RMF data. 151 DATE 50086 WORKLOAD RESOURCE TABLE TIME INTERVAL 14.50 - 15.50 WORKLOAD-- SERVICE MATRIX UNITS* SEC CPU UTILIZATION 93. 04 8. UKLD NUMBER WKLD TYPE WKLD NAME CPU UTILIZATION ELAP.'RESP TIME. AVE TERMS-'MPl THRUPUT CPU SERVICE TAPE MOUNTS I>0 SERVICE POOL TYPE UTIL 1 c .,;! 4 TS TS TS TS TSOUS TSOUS TSOUS TSO U. 00 0. 00 0. 00 0. 00 0.00 0. 1£. 09 3.66 4.91 5.95 0.41 £0.48 6.13 71.5*6 ££78. 5£ 561.68 30.03 1379.74 435.30 1 3 1 . 1 b I'* TS BP BP TSOEL QUICK BATS 0. 00 1 . 3 8.6 0. 00 67.39 64£. 1 0. 00 . 6£ 0.8 0. 00 33.13 4.8 0. 00 370.73 31£.6 140 DASH £.7 0. 00 0. 00 . 0. 00 0. 0. 0. 0. 00 141 DASD 9.4 0. 00 0. 15.74 £77. 44 0. 0. 46. 66 14£ DASD 8. £ 0. 0. 00 0. 00 0. 00 0. 0. 0. 00 143 DASD 0.1 0. 00 0. 00 0. 00 0. 0. 0. 0. 00 145 DASD £.6 0. 00 0. 00 0. 00 0. 0. 0. 0. 00 146 DASD 3.9 0. 00 0. 00 10£..ll 0. 08 0. 0. 6£ 0. 00 £50 DASD 43.5 0. 00 0. 00 0. 00 0. 00 0. 00 0. 0. 00 £51 DASD £4.7 0. 00 13. 85 6£8.£1 £4£. 18 0. "-■ ■ £7 0. 00 £5£ DASD ££.3 0. oo- £6. 49 598.85 178. 54 0. 0. 00 0. 00 £53 DASD 1.5 0. 00 0. 00 19. 09 0. 00 0. 3£. 09 0. 00 £54 DASD 19.3 0-. 00 0. 00 10. £5 £5. 58 0. 1. C-Z' 65. 03 £55 DASD 13.8 0. 00 0. 00 0. 00 0. 0. 0. 00 0. 00 £56 DASD 1.6 0. 00 0. 00 £.98 0. 0. ££. 88 0. 00 380 DASD 0.9 0. 00 0. £8.98 C m 15 0. 0. 0. 00 38£ DASD 4.9 0. 00 0. 00 0. 00 0. 00 0. 0. 00 0. 00 3S4 DASD •? ■? 0. 00 0. 0. 00 0. 0. 0. 1 j Ij 0. 00 3 9 7 DASD 0.9 0. 00 0. 0. 00 0. 0. 0. Ij 1 j 0. 00 39F DASD £.9 0. 00 0. 0. 0. 0. Ij. 1 j 0. 00 3B0 DASD ■-' *r' ■ o 0. 00 0. 0. 00 0. 0. Ij. 0. 3B1 DASD 18.0 0. 0. 133.46 0. 0. ij t' • "7 "7 0. 00 3B£ DASD 1. 1 0. 0. 0. 67 0. 0. 1. 47 0. Oij 3B3 DASD 0. 1 0. 0. 4.68 0. 0. Ij Ij 0. 0. 00 3B4 DASD 1 . o. 0. 6.45 0. 93 0. Ij Ij 0. 93 195. £0 3B6 DASD 9.6 0. 0. 0. 00 0. 0. 344. 33 0. IS* DASD 15.5 0. i j 0. 0. 00 0. 0. 0. . SSS DASD 36.9 0. 0. 1 J L'l. 00 0. 0. 1 j Ij Ij. 0. 3 $3 TAPE -' i*' ■ C 0. y • 66 9 £3. 40 34. £8 0. ij c ■ 151.84 4SS TAPE 11.7 0. 0. 0. 0. i j 0. ij Ij. ij 1£3.3£ 5** TAPE 4.3 0. 0. 0. 0. ij i'i 0. ij 3 1 . f' c 0. 152 d. I/O SERVICE - The total time (in seconds) spent by this workload at the Logical Address ('POOL') listed at the left. Note that the actual utilization of the physical devices corresponding to this Logical Address is also shown CUTIL'). The major use of the Workload Resource Table is in model calibration, where results from the execution of the generated BEST/1 model are compared to the Workload Resource Table. 2. Multiprogramming Report - This report gives estimates of multiprogramming depth, throughput, and service for each Logical Workload. 3. I/O Service by Pool - This report summarized total EXCPS (I/Os) by Logical Address as well as milliseconds of device busy. A 'milliseconds per EXCP' ratio is also given, which is useful in identifying cases in which EXCPs are a poor measure of relative usage of a device by the various workloads. 4. CPU Time Report - This report displays TCB and SRB CPU time totals by workload type (e.g., 'BP', 'TS', or »TP') as well as total CPU utilization. It provides input to SAS-based Capture Ratio analysis routines. Select the time interval of interest. This choice is based on the questions the model will attempt to address as well as availibility of valid measurement data. CPMS includes a number of SAS routines which are useful for this analysis. Extract and reduce the SMF-RMF data using the CPMS extraction program and SASBEST. Prepare BESTGEN user inputs. These include the ADDRMAP, WKLDMAP, SYSIN and ASGFILE files. A number of issues must be addressed in this phase, such as: a. Strategy for modeling Operating System functions. b. Workload aggregation (WKLDMAP). c. Peripheral modeling approach (ADDRMAP). Execute BESTGEN and create the initial BEST/1 model. Calibrate the BEST/1 model by executing it and matching the results of the BESTGEN output reports (especially the Workload Resource Table). This is in general an iterative process. Address the 'what-if* questions laid down during the scope/objectives phase. A typical modeling study using this software might proceed as follows: 1 . Determine the scope and objectives of the study. Most importantly, decide what question(s) will the proposed models answer? 2. Identify and obtain relevant data. SMF-RMF is the minimum data required. Operating system parameters (IPS and VATLST especially) are also used. To date the CPMS Model Generation Component has been used in several studies. The next section discusses one such application of the system. 4. Case Study In this section calibration and use of typical BEST/1 model is described. The system modeled was an IBM 370/158-AP configuration supporting approximately 20 to 25 TSO users and U to 6 active batch initiators during prime-shift hours. Typical CPU utilizations varied between 60-75$ during these periods. 153 Construction of the model was facilitated by use of the mechanical model generation software incorporated into the CPMS. One important goal of the study was to test that software in the context of modeling a live operational environment. Another goal was the development and validation of modeling strategies for MVS/TSO. The modeling and validation approach employed was divided into these phases: a. Mechanical generation of a base BEST/1 model of a prime-shift hour with heavy load. b. Calibration of the model to reflect operational consideration, e.g., tape mounts, system pack usage, and varying TSO MPL levels. c. Modification of the calibrated model of heavy load to reflect lower multiprogramming (MPL) levels occurring at periods of reduced loads. For the purpose of creating a base BEST/1 model of the period of heaviest load, a particular hour was selected. Four workloads identified as active during this interval were modeled. They are listed in Table 1 at the end of the paper. The approach used in modeling the hardware configuration was highlighted by: a. Representation of each active DASD device and tape channel by an individual BEST/1 Server. b. Approximation of the 370/158-AP CPU by a 370/158 uniprocessor with 1.8 times the internal speed as a normal UP. c. Use of SAS-based linear regressions to apportion device busy times across workloads where necessary. This technique was applied to system-oriented packs such as SWAP, CVOL, SPOOL, and SYSRES volumes . Calibration of the base model involved providing estimates of tape mount time (for batch workloads) and assuring a correct distribution of CPU time across workloads. The distribution of CPU time was accomplished via SAS analysis of SMF capture ratios (ratio of SMF TCB CPU time to actual CPU time). Additionally, estimates of user think time were calculated for each of the two TSO workloads. Results Calibration of the base model resulted in good overall agreement between predicted and actual values; as shown in Table 2 at the end of the paper. Validation of the model was accomplished by varying the model parameters to reflect a different interval on the same system. Table 3 compares this interval to the original interval represented in the base model. Adjustments made to the base model of heavy load to reflect the interval of lighter load were as follows: a. The number of TSO terminals in the two TSO workloads was reduced from 13 and 6 to 8 and 4 respectively. b. The batch MPLs were all scaled by the ration of 1.80/3.50, i.e., the ration of Batch MPLs for the two intervals. Note that the mix of batch workloads (i.e., the relative MPLs of the various batch workloads) was not adjusted. Since the primary validation criteria was TSO response time and total batch throughput, exact representation of the MPLs of the various batch workloads occurring in the interval of lighter load was not considered necessary. Results of the model are shown in Table 4. Based on the relatively good results obtained, it seems that the base model is valid and quite robust. 154 Several conclusions were based on this study: a. BEST/1 is applicable to modeling of live operational environments provided that data concerning operational factors such as tape mount time is available. b. SAS can be used as an aid in statistical analysis of measurement data during model construction and calibration. SAS-based linear regressions were used in both determining appropriate capture ratios and partitioning system pack usage across workloads. c. Use of the BEST/1 priority feature (PRIORITY =) is essential for valid modeling of mixed batch-time sharing MVS environments. TABLE 1 Workload Functional Name Description TSOUSER-App.A TSO users involved in application 'A development TSOUSER-App.B TSO users involved in application •B development BATTEST Batch users doing application testing. BATTEST-DEV Batch users involved in application development. d. The mechanical model generation software incorporated into the CPMS is adequate for generation of BEST/1 models. Quantity CPU Utilization (?) TSO Response Time - App. A. i TSO Response Time - App. B. I Batch Throughput *(Jobsteps/hr. ) TSO Command Throughput **(Cmds/hr.) TABLE 2 Relative Actual Predicted Error 69.5 68.5 -1.4$ !.) 2.40 2.65 +10.4? ».) 2.93 2.83 -3.5? I 72.2 63.0 -12.7? ir.) 2308.0 2302.0 -0.3? TABLE 3 Interval Comparison (Actual Values) CPU Utilization (?) TSO Response Time - App. A. (sec.) TSO Response Time - App. B. (sec.) Batch Throughput ( Jobsteps/hr. ) Logged on TSO Users Batch MPL Interval Light Load Heavy Load 46.80 69.50 2.08 2.40 2.80 2.93 82.30 72.20 12.00 19.00 1.80 3.50 155 TABLE 4 Quantity CPU Utilization ($) TSO Response Time - App. A. (sec.) TSO Response Time - App . B . ( sec . ) Batch Throughput *(Jobsteps/hr. ) TSO Command Throughput »»(Cmds/hr. ) Relative Actual Predicted Error 69.50 68.50 -1.4* 2. MO 2.65 +10.4$ 2.93 2.83 -3.5$ 72.20 63.00 -12.7$ 2308.00 2302.00 -0.3$ 156 An I/O System Model for 303X Processors Sushil Bhatia Phillip Carroll International Business Machines Corporation P.O.Box 950 Poughkeepsie, New York 12602 In the 303x input/output system certain jobs are critical in the sense that the system pays a penalty in reduced performance if they are not finished in some maximum allowable time. The purpose of the model is to determine the extent of the penalty a system will pay in a heavily loaded environment. The model of the input/output system of the 303x processors decomposes the system into a three level hierarchy of server/requestors. The levels are: 1. (highest) the teleprocessing controllers, 2. the byte channel interface, and 3. (lowest) the director of the 303x channel group. Each server services requests from the next higher level of the hierarchy (in a strict priority order defined by the system design) and makes requests to the next lower level of the hierarchy. In a typically heavily loaded situation, the rare situations when the queues for the shared resources become excessive, some jobs may pay a performance penalty. The analytic techniques used to determine the probability of these rare situations of excessive queue length and the model validation results will be described. Introduction Evaluation of channel performance for channels designed with shared resources (e.g., 303X directors) is more complex than those with nonshared resources (for example stand-alone channels of the S/370 Model 168). Queueing theory is applicable to the evaluation of systems with shared resources and has been used quite successfully in the evaluation of computer system performance. However, it has not been used to evaluate the detailed hardware designs of the systems with shared resources, or to predict the occurrence of rare events. The paper deals with the first successful application of queueing theory to the evaluation of performance of channel hardware and in predicting the occurrence of rare events, which in some cases have channel performance implications. Model Need Greater care needs to be exercised in configuring, and in 157 designing, channels using shared resources than is required in configuring and designing stand-alone channels. The increased sensitivity of the I/O system performance to details of configuration, combined with the greater complexity of the I/O system, generated the need for a model to assist the systems engineer and the channel designers in the analysis of the various interacting resources. Such a model should be easy to use, have a reasonably fast execution time, and should be able to handle the wide variety of I/O devices attachable to the system. It should also have an acceptable level of accuracy. The model was created to meet these objectives. System Overview An I/O device experiences performance degradation if its time -dependent function requests are not completed by the I/O system within the allowable time. The phenomenon is labelled an overrun. On some devices, the degradation due to an overrun is negligible and transparent to the user; on some it is marginal and barely noticeable whereas on other devices it results in human intervention and hence is visible. As examples, IBM 3705 Communications Controller on WRITE operation, sends idle characters on the transmission line if the data has not been received from the channel. This results in lower teleprocessing (TP) line utilization but the user view of system performance is not affectd. On some mechanical devices, overruns create a fixed time delay until mechanical synchronization can be reestablished, for example, DASDs and line printers. While on unbuffered devices like IBM 2501 , card reader and IBM 2701 data adapter unit, human intervention is required on the occurrence of an overrun. The model estimates the overrun rates for each device in the specified configuration. It does not deal with or analyze the performance effects just described. The contention for shared resources is the primary cause of overruns on I/O devices. These critical resources utilized by different devices, the contention for which may cause a device overrun, are shown in Table 1 . Table 1 Critical Resources of Devices Device Critical Resources 1 . Channel Group Processor 2 . Main Memory Devices on Block Channel Devices on Byte Channel 1 . Byte Channel I/O Interface 2. Channel Group Processor 3 . Main Memory Teleprocessing 1 TP Control- ler's Interface Adaptor Byte Channel I/O Interface Channel Group Processor Main Memory For all the devices listed in Table I, when a higher-numbered critical resource is in use by a device, all the lower-numbered critical resources are also being used by that device, e.g., when a TP line is using the channel group processor then it is also using the byte channel I/O interface and the TP controller * s interface adapter (TPCI adapter) . Thus, multiple critical resources may be used simultaneously by a device. The channel group processor and byte channel interface service the requests by a predetermined nonpreemptive priority scheme. The service mechanism used by main memory and the TPCI adapter is very complex and its description is beyond the scope of this paper. Model Overview As mentioned earlier, the current version of the model is limited to the prediction of the overrun rates on each device in the system. An overrun occurs if a certain sequence of work (labelled critical sequence) is not completed by the channel in the time allowed by the device. This allowable time 158 is labelled critical time. If the work involved in the critical sequence is not completed within the critical time, an overrun occurs. Hence, to determine the occurrence of an overrun, the critical sequence response time (CSRT) is calculated. It is then compared with the critical time. If CSRT exceeds the critical time, an overrun occurs. This is illustrated in Figure 1 . -CRITICAL SEQUENCE RESPONSE TIME CRITICAL TIME FIGURE 1. CRITICAL SEQUENCE RESPONSE TIME DISTRIBUTION An analysis of the critical sequence work indicates that for some portions of the work one shared resource is required. For some other portions, more than one shared resource is required, and for some portions of work dedicated resources are used. As an example, consider the diagram shown in Figure 2. During intervals t 2 , t-j and t 5 , only shared resource #1 is Being used. During the intervals W3 and ty, when the request is waiting for and receiving service from the shared resource #2, the #1 shared resource is not released. Hence the effective service time of the shared resource #1 is tg, which is the total time the next request for the shared resource #1 would see it busy. During t«, only dedicated resources are being used and since these are not shared with other requests, no time is spent waiting for these resources to become available. The service times of the requests, when a single resource is being utilized (that is, times t-| , t 2 , t3, t 4 , and t 5 ) , can be determined from the analysis of the system design. But the waiting times, (W y W 2 and WJ depend on the load on the shared resources and the service discipline of the resource (e.g., FCFS or priority). Analytic techniques requiring use of queueing theory are used in the model to determine these waiting times. When the queue forms for the use of a single shared resource, as in W ^ and W 3 in Figure 2, the load on that resource only is required for the analysis. But when multiple shared resources are used simultaneously, hierarchical techniques are used for analysis, for example, W 3 is first calculated from the analysis of load on the shared resource #2. This enables one to determine the effective service time tg of the shared resource #1. W 2 , the waiting DEDICATED RESOURCE I REQUEST ARRIVAL SHARED RESOURCE NO. 1 SHARED RESOURCE NO. 2 K W 1 H« X 2 H< W 2 > \ < t 3- REQUEST DEPARTURE -W, -WAITING TIME FOR A SHARED RESOURCE • SERVICE TIME FOR A RESOURCE FIGURE 2. CRITICAL SEQUENCE TIMING DIAGRAM 159 time for shared resource #1 can then be calculated by the analysis of the load on this resource. This hierarchical technique is repeated as many times as necessary to evaluate the situation when two or more shared resources are held simultaneously . An analysis of the main memory response time behavior, which was later supported by the measurements on the system, indicated that the variation in response time was very small. Hence, for the sake of simplicity, it is represented in the model by a constant value. The value chosen is on the conservative side based on the laboratory measurements of the system. The byte channel I/O interface and the channel group processor service the requests by a strictly predetermined nonpreemptive priority discipline. For the channel group processor, the priority is based on the channel number of the channel making the request and the type of service required. For the byte channel I/O interface, the priority is based on the position of the device on the SELECT-OUT line and whether the device is wired for selection at SELECT-OUT or SELECT-IN time. Both the channel group processor and the byte channel I/O interface service one request at a time. The service time distribution of the requests at each priority level is determined from the analysis of system design and the use of hierarchical modeling where necessary. Successive arrivals are assumed to be independent of each other and a Markovian arrival pattern is assumed. Under these conditions, the M/G/1 nonpreemptive priority queueing theory developed by Takacs [1 ] is used to calculate the moments of waiting time distribution at each priority level. The service discipline in the TP controller for the use of an interface adapter by the TP lines is very complex. It uses a dynamic priority algorithm and services one TP line at a time. For the sake of simplicity the dynamics of the priority algorithm are not modeled but are represented by a fixed nonpreemptive priority scheme. The priority of ,the TP lines are specified by the model user. The service time of the interface adapter for the different TP lines were obtained from the TP controller's designers. As with other shared resources, the successive arrivals to TP line are assumed to be Markovian in nature although it is known that they are not. With these simplifications and assumptions, the M/G/1 nonpreemptive priority queueing theory mentioned earlier is also applicable for the analysis of waiting times for the interface adapter. Again, hierarchical modeling technique, described earlier, is used as many times as required. The heart of the model is the determination of the critical sequence response time distribution. The analytic queueing technique mentioned earlier enables us to determine n moments of waiting time distribution when the arrival rate and n+1 moments of the service time distribution of the priority server are known. The implications of this loss of one moment from queueing model input to output are demonstrated with the aid of example shown in Figure 2 . Let us say that we wish to generate n moments of waiting time, W , of the shared resource #1 . Thus we require n+1 moments of the service time of this resource. But this service time consists of the waiting time of resource #2 and thus we require n+1 moments of the waiting time of resource #2. This will require n+2 moments of the service time of resource #2. Thus we see that each application of the hierarchical technique results in a reduction of one moment. From the simplified critical sequence work diagram for TP line shown in Figure 3, and the previous discussion on hierarchical modeling, it may be seen that three levels of hierarchical representation are required to evaluate the waiting time for TP controller's interface adapter. It will be shown later that three moments of response time are used to obtain the approximated probability density function of the response 160 REQUEST ARRIVAL REQUEST DEPARTURE T.P. INTERFACE ADAPTER t BYTE CHANNEL I/O INTERFACE CHANNEL GROUP PROCESSOR t FIGURE 3. SIMPLIFIED CRITICAL SEQUENCE WORK A T.P. LINE SERVICE time distribution. Hence, to determine the approximate response time distribution of TP interface adapter, we have to start with six moments of service time distribution of the channel group processor. This process leads us to have five moments of waiting time distribution for the channel group processor, and four moments of waiting time distribution for the byte channel interface. Thus more moments are available than are actually used in the model for the calculation of overruns on the block and byte channel devices directly. But these six moments are required to calculate the overruns on TP lines. The principal output of the model is an estimate of the expected overrun rate for the specified input configuration and load. We have described how moments of response time are calculated for various overrunable sequences of work which we have called critical sequences. Having obtained these moments, and having determined the critical response time values for each case which, if exceeded, will result in overrun; the next step is to determine from this information the probability that the critical response will be exceeded. This process has two conceptual stages. First, a set of moments are used to derive an equation which describes a probability density function consistent with the moments. Second, some approximation method is used to evaluate the integral of this curve beyond the critical value. A typical approach used is to assume a normally distributed variable and to use the calculated mean and variance as parameters for transforming the value of interest into standard deviations of an equivalent unit normal distribution. Standard formulas are available for computing highly accurate approximations of this cumulative function [2]. This approach may be reasonable for roughly estimating areas that are typically greater than, say, 2055 of the total cumulative distribution; particularly when the approximated distribution is roughly symmetrical. However, in the cases considered by the model, an area to be calculated is typically less than 1/10000 of the total area. Furthermore, typical response times are highly skewed toward the smaller values rather than being symmetrical like the normal. Therefore, this technique would generally result in estimates too grossly inflated to be of practical value. At another extreme, one might consider assuming exponential service times. This could result in gross underestimates where the actual distribution is more balanced. The method used in the model is to assume that the response time is distributed according to one of the family of lognormal distributions. Lognormal curves begin at a finite lower bound, terminate at infinity, and possess a continuous range of skewness. With knowledge of at least 161 three moments, the coefficient of skewness can be calculated and a lognormal distribution of the skewness can be fit. Any lognormal distribution can be transformed into an equivalent unit normal distribution, and the critical time can then be stated as standard deviations of the unit normal. Having calculated this transformation, the model uses a standard approximation formula for the normal cumulative which is accurate to five decimal digits. Finally, this probability estimate is converted to overruns per minute at the I/O load described to the model. Model Validation The model has had extensive use in the past year. In addition, its output was compared with measurements made using an extensive set of controlled test workloads. In one group of tests, I/O loading details were varied while holding total I/O loads at a relatively constant high level. In another group, a fixed set of I/O programs were operated at several load levels. This collection of measurements permitted extensive study of variations in workload and the corresponding model estimates. Included were tests involving solely the byte interface, solely block channels, and combinations of both. Model Accuracy On the byte interface, data overruns are typically of the type which cause some user inconvenience such as an aborted job. It is desirable that the total absolute number of such overruns be held below some specified rate in terms of time rather than throughput. Therefore, an important performance question with respect to a given configuration is what level of throughput can be achieved at a given overrun rate. In this context, a model should ideally give conservative overrun estimates, but less conservative than any simpler alternative estimating technique. We have found that the model consistently overestimates byte channel overruns but has been able to predict substantially higher safely achievable throughput rates than those predicted by the available alternatives. On the other hand, DASD overruns merely impose a minor response time penalty on the work involved. Therefore, a more meaningful performance criterion is the percentage of successful operations, that is, those without overrun. We have found that the model consistently yields estimates with +0.1% of the percentage of successful operations. A primary intended use of the model is the evaluation of alternative device arrangements on channel, or the amount of overrun increase to be expected due to increases in load. The model has been consistently accurate in predicting the magnitudes of such changes whenever the changes are substantial. In addition to the above mentioned uses, the model has had the following uses: 1 . A special-case version was developed to predict channel response times for a unique proposed customer application involving customer-written real-time channel programs. 2. Other special versions have been used to assist in evaluation of engineering design alternatives. Summary and Conclusions The available literature on system models, based on analytic queueing techniques, suggests that the current state-of-the-art is the derivation of the average values of response times for the shared resources. In our situation, the average response time is of no particular interest, and only those rare situations when the response times exceed a prespecified value are of concern. Hence, in our situation, a model which accurately calculates the tail of the response time distribution is required. Comparison of the model to lab measurements at known points on the 162 tail of the response time distribution, have shown that the model has the required accuracy. Hence, the model has been successful in predicting the occurrence of rare events arising in queueing situations. The model was initially intended to be used by the IBM Systems Engineers for performing configuration analysis of IBM 303X I/O systems. Its use has extended beyond this requirement to include engineering design tradeoffs and to perform special systems analysis, (for example, I/O throughput analysis) . We feel that the techniques developed within the model can be further applied to other queueing systems in predicting the occurrence of rare situations of large queue lengths. References [1] Lajos Takacs, "PRIORITY QUEUES," Operations Research, Vol. 12, pp. 63 (1964). [2] Handbook of Mathematical Functions, National Bureau of Standards, 1964. 163 Configuration and Capacity Planning ii a Distributed Procossing System K.C. Sevcik G.S. Graham J. Zahorjan Quantitative System Performance Toronto, Ontario and Seattle, Washington and Computer Systems Research Group University of Toronto The distributed health claims processing system of a major insurance company was based on hardware that soon proved inadequate for the processing load. A decision was made to replace the entire system with new hardware and software through an acquisition and development process scheduled to take three years. We were asked to undertake a study with the goals of (l) comparing alternative proposed configurations for the replacement system, and (2) assess- ing the adequacy of alternative transitional systems based on the existing software intended for use during the development of the replacement system. In this paper, we describe the latter aspects of the study. Using analytic modelling and simple asymptotic bound analysis, we were able to show that the performance of a proposed transi- tional system was likely to deviate significantly from the expectations based on intuition and information from the hardware vendor. Subsequent benchmark tests, motivated by the results of our study, conformed closely to our predic- tions. Without the information that resulted from our brief study, the company might have purchased hardware inade- quate for its intended purpose. Key words: Asymptotic bound analysis, benchmark tests, capacity planning, configuration analysis, queueing network modeling. 165 1. Problem Context A major insurance company has decentral- ized its health claim processing, which is dom- inated by claims under group health insurance policies, by establishing mini-computer systems at approximately twenty geographically distri- buted sites. At each site, five to forty clerks with terminals enter and process claims. Each clerk can process about BO claims per day. Telecom- munications is used to provide data interchange between the head office and the remote sites. A few months ago, it was determined that the systems in place at the sites were not providing adequate response, and that it was necessary to acquire more powerful hardware. The aquisition cycle plus conversion of applications systems was planned to take place in the next three years. Seven different mini-computer systems were pro- posed to serve as the basis of computing at each site. The company wished to reduce the field of candidates down to two or three before proceed- ing with benchmarking studies. In addition they wished to investigate several alternatives for tran- sitional systems. Using the existing software applications systems, hardware upgrades were necessary to handle the workload between 1980 and 1983. Two analytic modelling studies were carried out to support the company's decisions. In one, the simple technique of asymptotic bound analysis of a single class queueing network model provided insight into the capabilities of alterna- tive proposed transitional systems. In the second study, a single class queueing network model was used to predict the perfor- mance of each of the candidate configurations for the long-term system. We produced predictions of throughput and mean response time as functions of the number of active terminals, and of the number of terminals leading to the highest sys- tem throughput. The remainder of this paper describes the modelling techniques and assumptions and the results of the first phase of this study, the evalua- tion of the alternative transitional systems. 2. Method At the time our study started, the existing system was overloaded at most of the sites. Even with only a moderate number of active terminals mean response time was in the 30 to 60 second range. The productivity of the claims processing clerks was impaired by the slow response from the system. The vendor of the existing hardware had two newer systems capable of executing the same software as the existing system. The insurance company realized that a transitional upgrade to the existing system was needed because the workload was expected to increase steadily over the three year period during which the equipment selection and application system conversion for the long-term system was to take place. Discussions with the vendor gave the insurance company the impression that replacing the existing system (EX) by either of the two newer systems (TR1 and TR2) would improve per- formance by estimated factors of 1.5 to 2 for TR1, and 2 to 3.5 for TR2 Of the twenty sites, the company wished to know which sites could be upgraded to the less expen- sive TR1 system and which would require the TR2 system to provide acceptable performance during the transitional period. The information we were given to support our analytic modelling study included: (1) system measurements of the existing system taken at several sites under "live" workloads, (2) system measurements of the existing system taken during "benchmark" tri- als in which varying numbers of clerks entered transactions from fixed scripts, (3) information from the vendor on some aspects of the relative performance of systems EX, TR1, and TR2. From the live system measurements over 15 minute intervals at several sites, we were able to determine that each health claim processing tran- saction involved four to eight terminal interac- tions, with an average of five. Although there was some variation from site to site, the overall aver- age service requirement per interaction was 4.6 seconds at the cpu and 4.0 seconds at the disk on the existing hardware. These numbers were obtained by dividing respectively the cpu busy time and the disk controller busy time over fifteen minute intervals by the total number of terminal interactions during the interval. The dis- tribution of think time was highly skewed and could not be measured directly under live condi- tions. We used a rough estimate of mean think time of 60 seconds. The "benchmark" tests were done on both the existing hardware and the more expensive of the two proposed transitional systems (TR2). A script designed to closely resemble health claims tran- saction processing was followed repeatedly by 166 clerks at terminals. On each of the two types of hardware, three 15 minute trials were done, in which one, four, and eleven clerks participated respectively. From these trials, we were able to (1) verify the speed ratios of both the cpus and disks between the EX and TR2 systems, and (2) quantify the growth of memory contention over- head with the number of terminals. No bench- mark tests were done on the TR1 system because none was available in Toronto. The vendor-provided information on the rela- tive speeds of the three systems included the fol- lowing. The cpu instruction rate of the TR2 sys- tem is 1.5 times that of the EX system, while the TR1 cpu is only .9 as fast as that of EX. Both the TR1 and TR2 systems use the same disk, which has an average service time approximately half that of the EX system disk. The rotational speeds of the two disks are the same, but the seek times are only half as long with the newer disk, and some io is avoided entirely in the newer system relative to the Ex system. The benchmark tests substan- tiated the 1.5 cpu speed ratio between the TR2 and EX systems, and indicated that the average disk service time was reduced by slightly more than a factor of two. With the above information, we were able to estimate the service time requirements (without overhead) that would exist under live loads on the TR1 and TR2 systems: The parameters required queueing network model are: to specify the (1) (2) (3) (4) the number of active terminals, the mean think time (including time to enter input and receive output), the mean cpu time per interaction, and disk service time the mean interaction per This queueing network model was analyzed in two ways for each of the three sets of parameter values representing the EX. TR1 and TR2 systems respectively. First, we carried out "asymptotic bound analysis" as described by Denning and Buzen [DB]. Next, we did a single-class analysis based on product-form queueing network analysis [DB]. In both cases we used the queueing network solution package, THEsolver [GZ]. Both analyses are described in the following sections. 4. Asymptotic Bound Analysis Asymptotic bound analysis is a simple tech- nique for quickly determining upper bounds on throughput and corresponding lower bounds on mean response time as a function of system load. EX (observed) CPU 4.60 sees. DISK 400 sees. TR1 (estimated) TR2 (estimated) 5.12 sees. 1.88 sees. 3.10 sees. 1.88 sees. 3. The Model Each terminal interaction cycle consists of (1) think time, (2) data entry, (3) processing, and (4) data output. The processing involves bursts of cpu activity interleaved between io operations. In representing the system as a queueing network model, we represent the terminals, the cpu, and the disk as the three congestion points. (Although some sites had two physical disk drives, the disk controller did not permit them to be active simul- taneously. For this reason, having only a single disk service center in the queueing network model is appropriate.) Each clerk corresponds to a token that moves from service center to service center in the model. While the clerk is thinking and entering data, the token resides at the termi- nal service center. During processing, the token alternately visits the cpu and disk service centers. At the start of terminal output, the token moves back to the terminals service center. For more detail about the theory and applications of queue- ing network models, see [Gr]. It is based on the service requirements of transac- tions at each of the system resources. With the definitions Li = total service requirement per interaction at device i, Z = mean think time, M - number of active terminals, X = throughput, and R — mean response time, asymptotic bound analysis consists of the follow- ing relationships [DB]: X * 1/L b RZY.Li. X*M/(Z+Y,I is required to serve each user at the bottleneck dev- ice, so a user's time for a complete terminal interaction (which is T + Z) can be no smaller than this on average. Inserting the service requirements for each of the three models into the four equations leads to the graphs shown in figures 2 and 3. A brief glance reveals that, at high loads, system TRl is in fact inferior to the system EX. This, of course, is a consequence of the fact that system TRl has a slower cpu in a cpu bound environment. Thus, rather than a performance gain factor of 1.5 to 2, performance degradation will be seen in moving from system EX to system TRl whenever the number of active terminals exceeds some thres- hold (probably in the vicinity of fifteen terminals). Figures 2 and 3 indicate a substantial perfor- mance gain in moving from system EX to system TR2, although the gain in throughput cannot quite be a factor of two or more as originally expected. 5. Single-Class Analysis In the previous section, we developed upper bounds on throughput and lower bounds on mean response times. By making a few additional modelling assumptions that have proven viable in many computing environments, we can obtain point estimates of throughput and mean response time. These point estimates coincide with the bounds at M=l and asymptotically as M becomes very large, and form a smooth curve in the mid- dle. The additional assumptions made to obtain the point estimates were: (1) Homogeneous Routing - The probabil- ity that a job completes after a partic- ular cpu visit is independent of other jobs and of the number of visits that the job has made to the cpu, and (2) Homogeneous Service Times - The time before a job's service completes at a device depends only on the job and the device. With these assumptions, an efficient compu- tational algorithm yields the throughputs and mean response times with various numbers of users [DB]. One queueing network solution pack- age that incorporates this algorithm is THEsolver [GZ]. The appendix shows the THEsolver input to do both the asymptotic bound and single class analysis. Figures 4 and 5 show both the asymp- totic bounds and the point estimates obtained from single-class analysis for throughput and mean response time respectively. 6. Conclusions On the basis of our study additional bench- mark tests were done in order to re-assess the advisability of involving system TRl in the transi- tional plan. The benchmark studies of the TRl system confirmed our prediction that the perfor- mance of TRl became worse than EX once the number of terminals reached 10 or so, and the performance gain of TRl over EX at lower loads was negligible. Consequently, there was no perfor- mance reason to invest in TRl systems for any sites. The results of the benchmark tests are included in figure 5. In this study, very simple queueing network modelling techniques proved valuable in evaluat- ing the proposed hardware. Without the simple modelling study that we carried out, the company might have ordered TRl systems without doing benchmark tests on them. In that case, they would have been destined for a great disappoint- ment with the arrival of the new equipment. Instead, they are now in the process of installing TR2 systems at all processing sites. 7. References [DB] Denning, P.J. and J.P. Buzen, The Operational Analysis of Queueing Network Models, Com- puting Surveys 10 (3). September (1978), 225-261. [Gr] Graham, G.S., editor, Queueing Network Models of Computer System Performance, special issue of Computing Surveys, Sep- tember (1978). [GZ] Graham, G.S.. and J. Zahorjan, THEsolver User Guide, Computer Systems Research Group, University of Toronto (1980). 168 8, Appendix Below are input statements to THEsolver that produce the information about the TR2 system in figures 2 to 5. CO •SYSTEM TR2 ANALYSIS' CL CLAIMS END_CL DV CPU DISK END_DV LD CLAIMS CPU 3.10 DISK 1.88 *»* END_LD AB SN CLAIMS 1 TO 32 END_SN TH 60 LF 1 TO 32 END_LF QU The purpose of the commands are as follows: CO -comment CL -there is one class called "claims" DV -there are two devices called "cpu" and "disk" respectively LD -the service time requirements per terminal interaction for health claims are 3.10 seconds at the cpu and 1.88 seconds at the disk AB -determine upper bounds on throughput and lower bounds on mean response time SN -solve for throughput at multiprogramming level from 1 to 32 TH -mean think time is 60 seconds LF -solve for system throughputs and mean response times for numbers of active terminals from 1 to 32. TERMINALS Figure 1. The Queueing network model 169 16 24 Number of Active Terminals Figure 2. Upper bounds on throughput. 16 24 Number of Active Terminals Figure 3. Lower bounds on mean response time. 170 16 24 Number of Active Terminals Figure 4. Estimates of throughput. / J / 40- If I 1 / ' f-> IT •' If) -"v/ I , o a> * * 30- TRl" Zf / / Sv 1 ' / / A? i ! cc s / c s / / y / cfl s l *r / ' L^ / 10" i^r / •■^ j^^^e: — — — -~^<^r / \ ~ ~i~ _ ._. __ _ _ __ / / 1 1 1 1 8 16 1 1 24 1 32 Number of Active Terminals Figure 5. Estimates of mean response time. 171 Capacity Planning 173 File Allocation Methodology for Performance Enhancement Sujit R. Kumar Digital Equipment Corporation Maynard , Massachusetts Robin B. Lake Case Western Reserve University Cleveland, Ohio C. Tom Nute General Dynamics Corporation Fort Worth, Texas Abstract Storage device configuration is an important issue in the performance of computer systems. The designer or installation manager has to map user and system demands for storage space across the range of I/O devices available on the system. We present here a methodology for optimizing such mapping. A generalized performance model for interactive systems is developed and validated. The model incorporates the device organization, and given the descriptors for storage devices, predicts performance figures for a broad range of system configurations and workloads. Alternately, given the target performance figures for the installation, it helps work out alternative system configuration for improved performance. We present cost benefit tradeoff studies based on hypothetical scenarios to determine optimal file storage strategies from the point of view of the user as well as the installation. Key words: I/O Resource Allocation; Systems Performance Modeling; Logical Storage to Physical Device Mapping; O.S. Performance Prediction; File Binding; Systems Storage restructuring; Storage Partitioning; O.S. Tuning. 1. Introduction the hardware and deployed by the software to perform specific tasks for the user of A modern computer system is best the system, considered as a collection of resources. The various funtional components of the Performance aspects -- namely computer system -- central procesor, performance measurements, analysis, memory, peripheral devices, system modeling, tuning, and prediction — have utilities — are the resources of the been the focus of increasing interest system, and the system's performance lately (c.f. bibliographies compiled by depends on the effectiveness with which [Agajanian-76] , [Mohan-78]). The roots of these resources are brought together by this trend can be traced to the early 175 annals of Computer Engineering. Rudimentary operating systems were first evolved in an effort to improve the utilization of the increasingly costly computer hardware. These operating systems were just job schedulers for the CPU and I/O channels trying to keep these resources working at capacity as much as possible. The benefits of multiprogramming in a computer installation were then practically realizable. Thus, from the very beginnings of the computer, performance issues have motivated, guided, and decided basic trends in its evolution. 2. Storage Allocation The peripheral storage devices in a •computer system cater for permanent system and user files; they also provide space for swapping, spooling, temporary system and user files necessry for compilation, inter-process communication ('pipes' and 'filters' in UNIX ), assemblies, and so on. Because the performance of any computer system is dependent on the speed at which information in these areas can be accessed, the distribution of these accessing demands across the various classes of devices is an important issue in the analysis of system performance, and has been the goal of our research [Kumar-80]. overall model before we can express the inter-relationship between storage characteristics and the performance measures we are interested in. We will term such a model an 'integrated' one. Providing the ability to express different system policies and device configurations makes the model a generalized one. We present next such an integrated, generalized, analytical (IGA) performance model for interactive computer systems. 3.1 Th" TRA Model Figure 1 presents the generalized computer system model. The model is a closed network one as we are mainly interested in interactive computer systems where a fixed number of terminals, proceeding in think-wait cycles, submit !jobs to the system. Devices are partitioned into numbered resource classes. The numberical subscript, if any, referes to the particular device within the class. Thus device 1 is the cpu; refers to the outside world (user terminals); U represents the third cpu in a multiprocessing environment . 3. Analytic Modelling Analytic modeling seeks a set of equations relating the chosen performance measures to the parameters of the system. A simulation approach is often used when these equations are not soluble, and cannot be expressed in a closed form, or when it is not obvious what these equations are. An analytic model of a computer system usually consists of a graphical, job-flow representation of the system with a set of mathematical relationships of its various parameters. Our goal is a model which describes storage usage and can be used to investigate the performance impacts of alternative peripheral storage configurations in a computer system. The point we wish to emphasize is that a variety of resource allocation usage policies must be interpreted in tractable terms and factored into the Jobs submitted to the system are loaded into main memory when space is available according to the job scheduling policies for the particular system. Jobs resident in memory and waiting to be processed are given control of a cpu according to the system's cpu scheduling policies. In the UNIX Operating System (OS), this cpu switching policy is a preemptive round-robin within priority classes, with a one-second time slice limit. The mechanism outlined above describes the situation where a job is assigned to s single cpu; note, however, that parallel processing is not precluded by the model; parallel processing may be modelled, without loss of generality, by assigning multiple cpu's to a single job. Alternatively, each job may be assumed to Trademark of Bell Laboratories [Ritchie-7 1 *] 176 £ e U n > V) w HS swap -> DV /tmp -> DV /usr -> DV In the UNIX system "/" refers to the root directory of the tree structured file system. User directions reside in /usr; so /usr /Kumar would be the pathname for a particular user's area. All devices are assigned logical files. Thus writing to /dev/tty3 would accomplish the logically expected result system priveleges permitting. /tmp is used by the OS for scratch files in compiles, edits and other system utilities. The twin DIVA drives are logically partitioned into 6 different subdevices (dvO through dv5); our system instrumentation traces activity at the logical device level, enabling us to differentiate between, say, /tmp and /usr activity even though they are mapped to the same physical device. Note, this would be difficult to implement with hardware system instrumentation. Figure 2 tables parameters for some common storage devices. The normal configuration CO has 124K words of memory of which the lower 48K is 1 sec core and the rest is 750 sec MOS. The usual complement of card readers, lineprinter, magtape, crts and remote dial-in ports are also present. The configuration was exercised by running benchmark scripts built through analysis of the actual user load on the system, to simulate typical user demands on the installation for controlled and reproducible experimentation. Runs were made simulating different degrees of user loading up to the maximum possible loading. The limit was set by system process table size which decides the maximum number of current process that may be accomodated. Different think times were set in the user scripts run to evaluate response changes with different user characteristics. The analytic performance model was then solved for the particular system configuration, with parameters determined by the system instrumentation, to get response time and system throughput measures under different user loading conditions. As will be demonstrated, these measures match the actual system performance closely, validating the analysis of the model. Figure 3 shows the performance model for the normal configuration CO. The synopsis of data from a benchmark run for this parameter configuration is given in Figure 4. The parameters for the model 'have been obtained from system, .instrumentation data analysis utilities developed. Figure 5 plots the predicted and actual measured response for different loading conditions. At lower values of system loading, the model's predictions are consistently found to be conservative for this and other configurations analyzed. That is, the measured response is better (lower) than that predicted. At higher levels of loading, the model predicts slightly better response than that actually measured by benchmark runs. This may be due to any of the following factors: 179 •4. UJ UJ* CO 4 . U.4J 1 DEVICE TIME IN MSEC. SEEK LATENCY TRANSFER OF 32K AREA DIVA (DD25) i 10-55 12.5 255 RF11 17.0 550 RK05 50 20.0 835* RP03 29 12.5 325 RS04 8.5 70 EMU 6 us set up time 40 *2.84 msec/256 word transfer time Figure 2 • I/O Device Parameters ACTUAL i'iWU PREDICTED RESPONSE ON CONK lOUK'UT ION CO * : rtHSl*t"l' kEifOHSE a : HkL'PIirei' RESPONSE 1 1 — 1 1 1 1 — 1.0 c.O a.tf U.»» S.i) b.*> 7. a USER LOADING •J.O — 1 U'.tf Figure 5 180 i-4 90 tfilfl 09 (A 1 « 'CI 5 4 £ r*. in ad e- C * (9 O 00 N 00 r-( y u y y 4) 4) 09 y 41 CD .» in o* as y v « a CO y 41 or *l II D I I > > > > H en X s u H H en X a § en h n m •» en en c/j en H I en at 3 o z CN CO I en -a > a y 4) CO HI O r* g r*. eoeo w •»* • m »N -» "s» «9 a oi - a I ** 4> 03 e o a 09 4) U b 4) > o OO e e o ■a e o A b 4) » O 41 a o 1 i u 4) a o. s « I y -» r-t > > + i-H en o > o + l-» M F o CN > > > > vO en r^ o + r~ CN o It o • • • <-* > O O o cm CM >» fn CI 00 00 lit o CN en «Cf en > > n n -j > > P- > CO b CN vO* OO vO 00 ! 00 1 b 4) a §■3 1-4 ^ H o G 4) u 3 00 o h n n ^ > > > > > CN I 5 00 181 © o in nO 1 I ^ u CO o 41 U OS Hi 00 in N -J i a ii i I O fl » > > > > > a e c- a 3 (0 w m cm vO CN VI at V! oo CM CM cm to U 41 co m •* CM o> sc V/ oo cm cm CM > 00 o> O en f4 r« <»» NO r>» oo vO oo nO u-i CTv 00 O N Si fl ei O 00 n» oo r~» II II V CM 'Ct ^ CO CO CO Pn| CO •* > > > > CO o cm I CO > CO > Oil •Hll SI COR CN fH o O CM CM vO «* in nO 00 tn • • • • ro f*> CM m *T vO o> ei i-l fi N fl -» > > > co CM r*> >» (0 CO CO CM /II o X o u in a in r^ ■»-. a. A oo V fj ON 3 • u | o H J 00 o X> 01 Cfl a o a. co U 4) CO o o (0 CJ 0) CO XI 00 CO Cn| -O • > CM £ I CO > O cd J3 co 3 o u CO o CO N •a e 3 o -a 41 D. a. 3 l-i 41 u 0) I* u V en -O o to y o u Cb CO >N a 1 o o 41 3 00 182 IScrlpt 5Script Total Duration CPU Idle Scheduler Idle Process Swaps Swap 387.30 sees 13.71% (User - 24.37% Sysint - 2.21%) 99.13 112, .07 sees/swap Size - 1709.15 Active - 1.91% Overlap 0.42% No queuing for swap Total Disk Interrupts 7364 DV's Busy Total Transfers Overlap with CPU HS Busy Total Transfers Overlap with CPU HS-DV overlap 88.25 sees 5676 (2018016 wds) 10.96% 15.75 sees 1688 (432128 wds) 1.98% 0.56% 1861.32 sees 1.63% 97.03% Blocked - 0.27% 530, 0.14 sec/swap Size - 2234.51 Active - 3.97% Overlap - 3.53% Queuing 0.27% 53501 707.69 sees 44221 (23537600 wds) 36.5% 89.05 sees 9280 (2375680 wds) 4.63% 2.72% Figure 4 Configuration - CO: Data Synposis 183 a) Increased contentions for data base accesses , memory , and cpu at higher degrees of loading and increased process management overheads on the part of the operating system, which were being neglected, become noticeable now. b) To reduce the interaction of concurrent scripts (prevent them from falling into step) activation are delayed (by 10 or 20 seconds) in the benchmarks when the run is started. Thus the level of interference due to other jobs varies throughout the benchmark run and the actual measured values reflect this. 5. Cost/Performance Benefit Tradeoffs in I/O Mapping Strategies In this section we shall examine allocation policies for using a fast, solid state device . the Extended Memory Unit (EMU). The EMU is a plug compatible replacement for head-per- track disks and its salient parameters are shown in Table 2% It features a zero seek time and very fast data trasfer rates. From the System Instrumentation we observed that data transfers between processes and peripheral devices occurred in 256 word blocks. The swap sizes however, were considerably higher. For data accesses, the seek and latency times constitute a large proportion of the total activity, as the transfer time for a single block (512 bytes) is 2 or 3 msecs. only, even for a slow device. In the case of swapping, fast accesses (seek and latency) are not as important because the transfer time accounts for a major portion of the peripheral activity. Devices like the EMU which are non-rotational and have no seek or latency times (requiring a set-up time of a few^isecs. only) are thus more attractive for these short data accesses. We observed that there is a high degree of contention in data accesses. Besides mutual interference in accesses, potential overlap with the processor is also denied by his physical device contention on independent logical I/O accesses. The best response time improvement would come from mapping the /usr files on to the EMU. However, this is not feasible due to space constraints. EMU is a trademark of Monolithic Systems Corp., Englewood, Colorado. An alternative proposal is to have a part ion on EMU reserved for use as a user assignable volume.' We shall investigate the merits of this proposal next and establish optimal strategies for the user applicable in any interactive system. Performance improvements are realizable with this proposal, because of 2 factors: 1) Faster data access with the EMU, reducing processor idle times in waiting for data. 2) Higher degree of overlap from splitting the user's logical files over different devices, and reduced intereference between different accesses to the same device. We have observed that mapping some accesses even to a slower device may be benefical. Factor 2, outlined above, explains why this happens. Assume that t is the access speed (blocks/sec) for the new device and t is the access speed for the old device. Let the total user area be b, such that b blocks in the user area may be partitioned on to the new device, leaving b blocks behind in its original position. Assume also that the capacity of the new device is B blocks and 8 2. b . Let T be the run duration on the older configuration. Assume that P was the degree of device-cpu overlap in the original conf guration . During this a total of P t T blocks were accessed on the old .0 Q O . .. device. Assume that accesses were uniformly distributed over the user area. Then after reconfiguration PoToto bn (_J ]_) bo+bn tn to represents the time savings on device-cpu overlapped accesses to the fraction of user space allocated on the faster device. Then, the performance gain realizable, in terms of savings in access and overlap times is represented by: bn+bo _ Jin _ Jia + PoToto JjqX-J 1_ ) (2; to tn to b tn to There is also an additional savings realized by overlapped accesses of the two user areas. This is a second order effect and if Pon represents the degree of overlap between the old and new user 184 areas, the total reconfiguration gain in terms of time, is: ToPon + bn [ (1 _ l).l.(bo+bn-PoToto)] (3) to tn b Expression (3) above is a polynomial of the form: A + Bbn + Cb. (4) The installation's cost accounting policy decides the additional charges levied with EMU usage and thus has to be balanced against the benefits realizable. In general, there are two ways of assigning charges: 1 ) A flat usage fee when the volume is assigned to the user. 2) Prorated usage cost function. The proration may be a linear one based on space and time usages. Often, there is an initial assignment cost built-in also ( a minimum charge). Alternatively, an exponential size or time proration algorithm may be used at the installation for load levelling and for promoting equitable usage (discourage hogging the resource) . Figure 6 shows the Cost/Benefit tradeoff for a family of normalized benefit polyomials and different usage charge policies enforced at the installation. The curves A, B, C, D are the savings realized by allocating different proportions of the user's logical file b to the EMU. We have illustrated 3 different charge policies — a flat fee EMU assignment cost, a size prorated linear cost function, and a non-linear combination of the two to promote equitable sharing of the EMU amoung the user population. We see that B just breaks even marginally when 100$ of b is mapped to the EMU. A is always a losing proposition unless there is additional file area space partitioning. In fact, for the system policies shown, any benefit curve lying in the area below B is not cost justifiable. With the benefit characteristic C, the flat charge policy dictates that EMU allocation greater than 70$ of b only are justifiable, whereas with the linear usage cost prorated function, only allocations under 70$ are justifiable. An interesting threshold effect is observed for D, when there is a exponential usage charge function. Here the strategy should be to allocate between 45$ and 65$ of normalized b n oh to the' EMU or simular device. The thresholds above determine break-even points for operation, when the response time benefits achieved through reconfiguration evenly balance the costs incurred in such an operation. The user may derive optimal operating strategies from the preceeding analysis. Thus, for the last case an allocation around 60$ gives the best performance gain. 6. Conclusions The use of analytic models for performance studies is well accepted ([McKinney-69], [Browne-77] ) , we have demonstrated its use as a tool to study and reconfigure I/O accessing demands, for system performance enhancement. The methodology for this is based on the simple premise that during the execution of a job, various system and user areas residing in peripheral storage devices have to be fetched or updated, and the rate at which this information can be accessed is an important factor affecting the performance of the computer system. In addition to the raw speed limitations of the peripheral devices, interference and contention between accesses initiated by the same job, or by other concurrently executing jobs, introduce processing delays. We have, then, certain accessing demands for peripheral storage, and a specified set of devices available, the problem being the determination of an optimal mapping strategy between demands and devices. This issue has not been studied to date, and our generalized performance model served as an excellent tool for such activities. The performance model developed is simple to use and has proved to be an accurate means of performance prediction and evaluation. It was formulated for flexibility and generality, to serve as an analytic tool in performance studies, and is sufficiently general to be set up to represent any interactive system. The I/O mapping methodology developed is applicable to any computer system, the only requirement is that some means of reconfiguring the system storage areas be available. On systems with tree-structured files (e.g., UNIX) the mapping between storage area accesses and physical devices is an explicit one and is easier to restructure as the logical partitioning is distinct. In this 185 v u 3 SO 3 .. » .. . eo i«- <* tf> *. '16 ea ($) s^Tun xiianaa/isoo 186 context, we may point out that it is not necessary to map a storage area to a faster device to obtain improved performance. We observed some cases where reassigning a storage area to a slower device results in response time improvements because of reduced contention delays, and improved overlap. We observed that using the EMU as a swap device gives only marginal response time improvements, and that too at higher degrees of system loading when there is abnormal swap activity. The best strategy for performance enhancement would be to allocate the short 'bursty* transfers to devices like the EMU. Thus, on the UNIX system response time improvement is expected if the user areas are mapped on to the EMU or similar peripheral devices. As this may not always be feasible, because of size constraints, we suggest a user-mountable volume strategy, outlined and analyzed before, as a feasible technique for response time improvement. Allocating the root directory (/) of the tree-structured file system on the EMU would also produce a significant response time improvement for the entire user community. From the point of view of system throughput, this may be a better strategy, especially when the mount-volume approach does not provide enough space for multi-user assignments. Accesses to the paging device in a paged memory environment is another example of frequent , short accesses . The EMU (or similar device) should yield significant performance improvements when used in such an environment; however further research and experimentation is required to substantiate and quantify the degree of performance improvement. In full swapping environments, when space is not any constraint, the configuration strategy suggested is to partition the EMU or similar peripheral, into two individually ported (dual-ports) file subsystems, one for global access areas (the root in UNIX) and the other for user-assigned file allocation. For systems with small memory configurations, where heavy swap-loads occur at moderate user loading, using the second partion as a swap device would improve the response times for all users currently on the system, instead of favouring ony those assigned to the mountable partition. If dynamic reconfiguration were feasible, an optimum strategy would be to assign the second partition for system use — swapping — whenever a predetermined load threshold is exceeded. 7. Acknowledgments The authors thank the Computer Science and Biometry Departments at Case Western Reserve University, Cleveland, Ohio for the cooperation and facilities extended. Funding and equipment for the experimental portions of this research were provided by the Department of Biometry. 8. References [Agajanian-76] [Brown-77] [Browne-75] [Buzen-78] [Courtois-75] [Courtois-77] Agajanian, A.H., "A Bibliography on System Performance Evaluation" , ACM Sigmetrica. January 1976. Brown, R.M., Browne, J.C., Chandy, K.M. , "Memory Management and Response Time" , Communictions of the ACM, Vol. 20, No. 3, March 1977, pp. 153-165. Browne, J.C., Chandy, K.M. , et . al . , "Hierarchical Techniques for the Development of Realistic Models of Complex Computer System" , Prog. IEEE., Vol. 63, No. 6, June 1975, pp. 966-976. Buzen, J. P., "Operational Analysis: An Alternative to Stochastic Modeling" , Ekqs* Int^-Conf . on Performance of Computer 1a a iiii Aiiflaa , North-Holland Publishing Co., 1978, pp. 175-194. Courtois , P. J. , " Decomposability , Instabilities, and Saturation in Multiprogramming Systems" , Communiations of the ACM Vol. 18, No. 7, July 1975, pp. 371-376. Courtois , P. J. , Deoompoaabilitv: Queueing &JiiL-£jim.iil£r_3,xst.j£m. 187 &PPUtt?U1ona, Academic Press, 1977. [Denning-78] Denning, P.J., B uzen, J- P., "The Operational Analysis of Queueing Network Models", CflBBUUBg Sumya, Vol. 10, No. 3 Sept. 1978 pp. 225-262. ' [Kumar-80] Kumar, S.R., n An 1/Q Resource Allocation Methodology for Performance Evaluation and Enhancement Studies of Interactive Computer Systems", Ph . D> Dissertation January 1980 Case Western Reservce University, Cleveland, Ohio. ' [McKinney-69] McKinney, J.M., « A survey of analytic time sharing-models", Computing SurVfiYft, Vol. 1, No. 2 June 1969, pp. 105-116. [Mohan-78] Mohan, C. , "Survey of Recent Operating System Research, Design and Implementation", Operating Syatsmfl RfiYlfiH January [Ritchie-74] R itchie> D . M>> K. , "The UNIX Time-Sharing System", £ojMULi£aJLJ£ns_of $ te ACM , Vol. 17, No. 77 July 1974, pp. 356-375. 188 Capacity Analysis of Shared DASD Control Units Floyd L. Pedriana County of Los Angeles Data Processing Department Downey, CA 902 k2 Abstract The IBM DASD control unit is an expensive and important component in an IBM 370 processing system. As is the case with most computer equipment, the data handling capacity is not well defined in terms of its impact on system throughput. Defining this is an especially difficult task where the control unit is connected to several CPU's. A queuing model can ad- dress this situation, however, the standard modeling equations do not account for a multipath environment. In addition, the collection of model input data is not a simple process. The data collection and model verification procedures are discussed in detail in this paper. In addition, some significant observations are made regarding the opera- tion of the control units. This study resulted in a recommendation of limiting the control unit load to less than 50 SIO's per second. 1. Findings The control unit performance data calculated using the queuing model described in this paper compared very well with actual measurement data. Therefore, it appears that the queuing model used here is a valid mechanism to use in the evaluation of the capacity of control units that are shared between central processors. can be verified using the average block size and data transfer rates. The "Control Unit Busy" flag in the Unit Control Block is an indication of con- trol unit queuing (not utilization). The SIO records stored in GTF files are a good source to use in determining the relative number of occurrences of this "Control Unit Busy" flag. If a hardware monitor is not available, the percent of time that a shared control unit is busy can be determined using a device and channel activity distribution table . Using the method above to determine control unit utilization, the service time per SIO is easily determined by noting the total number of SIO's in a specific time interval. This service time calculation The following two factors may contribute significantly to disproportionate DASD service to the central processors sharing control units. a. The faster CPU will experience a smaller percentage of "Control Unit Busy" indications. 189 b. The 3830 control unit has a "built-in logic mechanism that , in certain cases, will process SIO's according to a priority determined by the arrangement of channel cable connections. At the installation involved in this study, it was recommended to limit SIO activity to less than 50 per second for each 3830 control unit. This means that the guideline of one control unit for each string of eight DASD units used at. this installation, provides much more control unit capacity than is actually needed. 2 . Background The DASD Control Unit (3830) is a fairly large and expensive piece of computer equipment and since there is considerable flexibility in the way that it may be incorporated into the system configuration, it was decided to perform a detailed analy- sis of the control unit operation. The objective of this study was to determine techniques and guidelines for use in the capacity analysis of IBM control unit configuration and loading. The software monitors, GTF, RMF, and CMF were all used for parts of this study. These monitors have certain limitations in a study of this type, however, a hardware monitor was not immediately available and it was decided to proceed as best as possible without one. The general approach to this problem was as follows . First , some general DASD performance data was collected and reviewed. Then a mathematical model was proposed and some specific data was collected and analyzed. At several points in this process, calculated values were compared to measured values in order to verify the validity of the model, the calculation, and the general understanding of the situation. Finally, all of the data was reorganized and consolidated into meaningful values which have been summarized in this report. In the entire process, a conservative approach was used in order to provide for a margin of error in data collection and analysis. For example, all of the data used was obtained during brief periods of highest activity on all systems . These periods were mid-morning or mid-afternoon. This, therefore, assures that the analysis will account for the capacity requirements during the peak demand periods rather than for the average requirements. The objectives of this paper are to define the control unit mechanization and to describe procedures to help evaluate the con- trol unit capacity. This paper is not intended to be a tutorial on queuing model theory. The model used is a very basic one for which all of the equations are readily available in many books. 3. DASD Control Unit Operation and Model The main function of the 3830 control unit is to select and transmit data and control information between the disk devices and the I/O channel at the central processor. The actual data transmission time is generally very small compared to the total I/O time. This is due to the delays encoun- tered in the channels, the disk device, and also in the I/O Supervisor (IOS) software. The majority of the I/O delay time is usually due to mechanical device time or file contention. Figure 1 shows the interconnection between the various channels and devices . This figure shows that any CPU can get to any device through either of two channels and through either of two control units. This provides for backup in case of hard- ware failure and provides for improved performance when a single channel or control unit becomes overloaded and cannot handle all of the traffic by itself. When a control unit detects a request to transmit data by the channel (indicated when the "Select Out" bit is set by the channel interface ) , the control unit responds by setting either the "Operational In" tag or the "Status In" tag on the channel inter- face Cl]. In the case of the former, I/O processing continues. In the case of the latter, the control unit sets a combination of bus bits that indicate some difficulty in completing the I/O. The combination of "Status Modifier" (bus bit l) and "Busy" (bus bit 3) indicate that the I/O must be delayed due to the control unit being pre- occupied handling another I/O request. The control unit also sets an internal flag called "Request Pending". This flag is used later by the control unit to initiate reconnection to the central processor. Meanwhile, IOS places this request on a queue and sets the "Control Unit Busy" bit in the Unit Control Block (UCB). This process has led to some misunderstanding regarding the significance of this bit. This bit is incorrectly labeled since it is not always set when the control unit is busy but rather only when an I/O request has been delayed due to a busy control unit. The meaning of this bit would be clarified if its name were 190 Figure . 1 Logical Congiguration 191 changed to "Control Unit Delay". When the control unit becomes free it sends an I/O interrupt (Device End - status bit 5) to the central processor. IOS then tries to com- plete the original I/O which has been queued by starting the process all over. In order to define a mathematical model of the system, it is necessary to understand the interrelations between the various com- ponents of the system as discussed above. Although there are many types of models defined in the literature, none of them exactly fit the shared DASD - multipath configuration of this problem. However, after making some simplifying assumptions and examining various models in detail, it was found that the class of models which are described in Kendall notation as M/M/c could be used quite successfully for this application. In order to fit this simple model to the actual configuration, the control units are considered to be servers that are independent of the rest of the system configuration and interaction. It is beyound the scope of this report to explain this class of models and the interested reader is directed to reference [2~\- In selecting this model it is assumed that the interarrival times of I/O requests and control unit service times are exponen- tially distributed. This is not an unusual assumption and is generally a good approxi- mation even if the data is not exponentially distributed. The key items that must be defined in order to use this model are the expected control unit service time for each I/O request and the expected arrival rates for each I/O request. In this discussion an I/O request is defined as a Start I/O machine instruction (SIO - op code of 9C). h. Control Unit Service Time The expected service time for an I/O operation was determined by two different methods in order to have a high degree of confidence in the calculations. The first method was based on actual measurements, while the second is based on calculations using engineering hardware specifications. Determining service time from perform- ance measurements is a fairly straight forward process : The amount of time spent servicing all SIO's is simply divided by the number of SIO's serviced. The time spent servicing all SIO's is the percent of time that the control unit is busy, multiplied by the total elapsed time of the data col- lection period. Control unit busy time can be determined from channel busy time and the distribution of device activity. Since each control unit is connected to three systems, the data had to be collected on all three of these systems simultaneously. The control unit time is the same as channel time except for time spent in error recovery and in format write erase. Therefore, the percent of time that the control units are busy is determined by the total channel and device time as measured on all three systems. The difference in speed between the 3350 and the 3330-11 must also be taken into account. Figure 2 is a matrix showing the distribution of channel and control unit time. Figure 2 also has a comparison of channel time as calculated using the above process, and as measured using the software monitor. Since these values compare very well, it is safe to assume that this method of determining the control unit busy time is valid. At the same time that this control unit data was collected, the quantity of SIO's being processed was also collected. This data was averaged over several hours and used to find the average control unit service time for each I/O, This value was 8.5 ms. Using the data transfer rate of 1,198,000 bytes per second and an average block size of 7000 bytes, the service time is found to be 5-8 ms . These values are in the 5 to 10 ms range and, therefore, establish a high level of confidence in the approximate value for control unit service time. The more conservative 8.5 ms was used in this analysis. 5. Model Validation The M/M/c model itself was validated by comparing three separate calculations with measured values. These three values are: (l) the probability of an 1/0 request encountering a busy control unit, (2) the amount of time that an 1/0 request can expect to wait due to a busy control unit, and (3) the level of control unit utiliza- tion. The symbology used in these calcula- tions is explained in detail in reference 2 and is summarized in Figure 3. The items above are determined by the following relationships . c<=,»> = (£)„{£♦ i^X (sf Wq = C(c.u) E(s) c(l-/>) /° = XE(s)/c Where u = X E(s) 192 »Y»T»M a . • fr ii A If 1 l« * it 4 IS • T*tM. 1 .•*. J. 59 .of /X.70 »»te Ik. 1 I (.191 .17 .»« M7 1.4 3. Ma 7.17 .OX 84,1 0.* sr»rtH A 3t*» i I/.7* c.± |?.W H.7 7 H.7Y fin |7.*' «;* 3 10.15 (a.77 ».« j j- a 1 10.15 U.77 AV*1 ai.A i»»»* Mtt SYfTEM c ■ 1 l.»1 I.U C.I3 .•1 f.7« i.3 2 .•3 ••» &I3 .01 eta (.3 3 .03 ••1 M1 Jt.M- 3.(0 M MH /V31 IM1 f.C» IMF »l.*l (4.&I M3 II.SI 17.06 I7.C* M.V8 l*.7i 177.1* 174.3 Figure 2 Channel - Control Unit Activity Matrix 193 Figure k is a tabulation of the values calculated using the preceding equations and the same values as recorded by GTF and RMF. These validation calculations were made using the raw RMF data (see figure 5) accum- ulated for control units 2 and 15. These control units which control 3350 drives, had the highest traffic rate (51.02 SIO's per second) during the period of measure- ment. For general modeling purposes the degree of agreement between the calculated and measured values is considered very high, and indicates that the model can be used with a relatively high level of confidence. N Number of customers in steady state system. C(c,u) P(Nic) ; probability . all c servers are busy ; Erlang ' s C formula u Traffic intensity measured in Erlang' s; Minimum vlaue for steady state system. Wq E(s) X Probability that any particular server is busy; server utilization. Expected (mean) time in the queue, excluding service time for steady state system. Expected (mean) service time for one customer. Mean arrival rate of customers into queuing systems. Number of Servers. Figure 3. Symbology Summary M/M/c Model RMF or GTF Calculations Measurements Wq .1+23 ms .515 ms C(c,u) .077 .066 /° .217 .2 (appr) Figure k. Model Validation Data Control % of Time Unit Central SI0 "CU busy" bit Number Processor Count is best 5, 10 A 19701+1 .06 B 20562 .12 C 3007 Total 220610 .18 6, 11 A 101775 .03 B 105216 .03 C 87 Total 207078 .06 2, 15 A 335775 .38 B 396 .03 C 31211 2.22 Total 367362 2.63 1, Ik A 90597 .11 B 69238 .17 C 71206 .11 Total 23101+1 .39 3, 12 A 1031+00 1.01+ B 169752 .11+ C 1150 .57 Total 271+302 1.75 *, 13 A 121996 .21 B 2003 .1*3 C 172050 .06 Total 29601+9 .70 Note : RMF data was collected between ll+:00 and l6:00. Figure 5- RMF Data Summary 6. Control Unit Capacity There are two distinct advantages to developing a system model in an analysis of this type. First of all, the process of putting together the pieces of the model requires the analyst to become familiar with the detail operation and loading of the sub- system in question, thereby, making an intuitive appraisal more realistic. Secondly, the model itself can be used to answer questions regarding the subsystem which cannot be answered directly from measured values. The parameters of the model can be modified to reflect changes in the system, the Workload, or the desired response time and the model will then yield specific quantitative values for evaluation. One area of concern regarding the con- trol unit capacity is the significance of the percentage of 1/0 requests that encounter a control unit busy condition. Figure 6 194 shows that this percentage is 7-1+9 for the 3350 control units and 9.k6 for the 3330 control units. Although these percentages are not particularly high, the ultimate concern of course is how these values relate to user service levels. Control Unit SIO Control Unit Control Unit Busy Average Number Count Busy Count Percent of SIO's Busy Percentage 5 6,212 350 5.63 10 6,1+61+ 1+17 6.1+5 6 10,278 61+6 6.28 11 11,188 685 6.12 2 10,180 681 6.69 15 9,501 621+ 6.57 1 13,51+9 l,52l+ 11.25 11+ 12,256 1,31+8 11.00 7.1+9 (3350) 3 28,777 3,1+86 12.11 12 29,031 3,166 10.91 1+ 16,761 1,223 7.30 13 15,599 1,172 7.51 9.U6 (3330) Total 169,796 15,322 9.02 Data collected during peak period duration of 20 minutes. Figure 6. GTF Control Unit Data Summary The M/M/c control unit model was used to produce Figure 7 which shows the rela- tionship between control unit loading and the increase in service time to the user. Figure 7 shows that this relationship is non-linear and that it starts to increase rapidly as the number of SIO's per second increase beyound 100 per pair of control units. However, even at a loading level of 100 SIO's per second the delay time is in the millisecond range (l millisecond = .001 second). If a very conservative estimate of 2 ms is made regarding the amount of time during peak loading periods that each SIO can spend at the control unit without affecting the user, then it follows that the control units have the capacity to safely process about 100 SIO's per second (50 SlO's/sec. for each unit in a pair of 3830's). 7. Observations In reviewing the GTF data, an inter- esting phenomenon was observed. Figure 8 shows that there is a definite relationship between the number of SIO's that encounter a busy control unit and the central pro- cessor that issued the SIO. The data clearly indicates that SIO's originating on System A (TS0 and batch) have a very high probability of being executed immediately. While those originating on System C (Health Care Systems) have a much higher probability 195 Control Unit Time Per no (milliseconds 40 60 120 80 100 SlO's/Second FIGURE 7 Control Unit Load Effects on User Service 140 160 180 196 of being delayed due to a busy control unit. There is a definite priority mech- anism in effect here but it is not clear what the nature of this mechanism is. DASD Central SIO Control Unit Control Unit Busy Model Processor Count Busy Count Percent of SIO A 27583 U89 1. ,77 3350 B 29080 2737 9. ,in C 22965 3050 13, ,28 Total 79628 6276 7, ,88 A 23378 1501 6. ,k2 3330 B 30055 2222 7. ,39 C 36735 532U Ik. .k9 Total 90168 90**7 10, ,03 Figure 8. GTF Central Processor Data 1 Summary Part of the reason for this shift is accounted for by the speed of the 3033 pro- cessor. If two systems are notified at about the same time that an I/O path has become available, the faster of the two will be the one that gains control first while the slower one gets rejected when it finally attempts to use that path. This same effect can occur if CPU's of equal speed are notified in sequence rather than simultaneously. The first one notified will probably be the one to gain control. Another explanation for this effect was found in the electronic circuit diagrams for the 3830 control unit (see reference 3). The 3830 circuits are designed to give priority to the input ports at 3830 con- nectors in a certain fashion. The input ports are designated as "A", "B" , "C", and "D" on the hardware itself and in the hard- ware documentation. A spot check of the system cables indicated that system A was connected to input port "A" on all control units, and system B and C are similarly connected to ports "B" and "C" on all control units. The 3830 circuit diagram (Figure 9) shows that the hardware priority for the input ports is in the order ABC D, with A being the highest priority. Although this arrangement of cables may simplify the documentation of the control unit cabling layout , it appeared to be causing a performance bias in favor of the system which processes the batch workload. Therefore, it was necessary to recable the control units to properly balance the selecting priority defined in the hardware. References [l] IBM System/360 and System/370 1/0 Interface Channel to Control Unit Original Equipment Manufacturer ' s Information; GA22-697 1 +-1. [2] Arnold 0. Allen, Probability , Statistice, and Queuing Theory with Computer Science Applications . Academic Press, N^w York, 1978. [3] IBM Control Unit Model 3830, Automated Logic Diagrams (ALD) Volume 1, page CS101. 197 Select Out" Port "A" "Select Out" Port "B" "Select Out" Port "C" "Select Out" Port "D" "Enable OP in" Port "A" "Enable OP in" Vort "B" "Enable OP in" Port "C" "Enable OP in" Port "D" Figure 9 3830 Priority Circuit 198 A Note ii Ciapitir Systm Capacity Plmiii Thrinh Material Riiilriaiita Planlii K. Q. Salawu bell Laboratories Piscataway, New Jersey 0605^ A computer installation is likened to an industrial production factory. Demands for goods and services are either not met (shortage, disservice) or oversuppliea (inventory or waste). For given shortage and wastage costs, a smoothened, short-term production/service plan can be drawn from predicted demand or workload. This brief note employs simple graphical methods used by production engineers for aggregate production scheduling and the arithmetic of materials requirement planning to impute measures of disservice to users and poor utilization of computer resources. Noting that production/service objectives are hardly unique, the methods of goal programming that incorporate the satisfaction of multiple objectives are judged to be more appropriate for formulating production/ service plans. 1. Introduction : Computer Capacity Planning Objectives In manufacturing systems, sales forecasts guide maximum investment and minimum profit objectives as well as decisions on the total work-force levels, the production rate, net inventory levels and the stability of tnese rates and levels. here, our working concept of computer capacity planning is the efficient allocation of data processing resources to satisfactorily execute the offered workload hence we note that the characterization of an installation's workload directly influences the data processing objectives of the installation's management. As the speed and flexibility per unit cost of computer hardware decreases rapidly, and the unit cost of producing and maintaining software rises, we observe that a unit of the user's time is now more expensive than that of the computer, the tool used. It seems rational-economic then that the provision of high overall levels of service to the user, which conserves his time and effort, should now assume a much higher priority than cost and performance (utilization) considerations. in fact, the mechanistic performance/cost criterion which emphasized the evaluation of the 199 performance of the computer subsystem, is giving way to a systemic or holistic view of the performance of the computer and its environment iFerrari, 197o]. Notwithstanding the fact that workload characterization remains one of the most formidable problems in performance evaluation for capacity planning, it is now widely accepted that the most important input for computer capacity planning is a clear definition of user service objectives, and to a smaller extent, estimates of the availability and utilization of the total computer system. Furthermore, these overall objectives vary in their degrees of importance, depending on the utility preferences of the installation's management iGrochow, 1972}. ritu. I-l i u..o„_ .« cm t»Kl. b, tniUMom u .o lauuuu, HISCEUMCaB O 0»T» PMESSIK COIH ■ ACcourriK □ wa WEMTIK SKIP tiie cm aasuFTiai iv m\uum. T«ble £.1: » CPli Consumption by * Representative application During Window 2. Short-term Capacity Planning Figure 2-1 is a histogram of the average CPU consumption by all applications at a computer installation [from Bronner, 1979}. A similar histogram may, for instance, have been constructed for the average daily printer utilization by the various batch job categories at an installation. Furthermore, the histogram in Figure 2-1 could be differentiated into its component histograms, i.e. one histogram per application. These latter histograms will show the extent and time of service receivea from the CPU resource by each application. Suppose our window is between 9 a.m. and 5 p.m. every day, and some representative application usually consumed hourly rates of CPU service as shown in line 2 of Table 2-1. Figure 2-2 shows the histogram of CPU utilization by this application. Employing simple graphical methods used by production engineers for planning for uniform production rates, we first plot the cumulative service requirements as curve OPQfi in Figure 2-3. The slope of the straight line OH would give the average requirements as well as the service rate per nour that this application would have. Tin of Day 9-10 -11 > CPw Consuaeo 19 9 i CPU Dedicated 10 40 Surplus (inventory) or SDortage win no Setup 21 52 -IS 23 10 -1 -2 57 T2 40 40 -3 90 40 -5 P. 12 40 69 52 20 -30 -28 As a feasible production plan must cover the aemand in every period, the broken line ST will be drawn parallel to OH to pass through the point it on the curve which is perpendicularly farthest from Oh Lbuffa and Miller, 1979}. however, the intercept OS that this line makes on the utilization axis is equivalent to "inventory" carried into the window at 9 a.m.. In practice, this would mean that CPU consumption was anticipated, provided and consumed before the demand for such service was made. in the case of batch processing, this scenario might be workable through workload shifting operations and for transaction processing, it may be necessary to stagger the work hours of some terminal operators themselves. In either event, one may assess tnese rescheduling operations as "setup costs" at the beginning of every window, however, no buffer inventories of 3» CPU time will be offered hour by hour in audition to the constant average hourly service "dedicated" to each application. While an attempt to smoothen service is good, penalties must be borne for our inability to shift the workload backward (inventory) or forward (backlogging) , at will, to fit the constant service rate. 200 100 U|ur« 2-21 Sialogram of CP0 Consumption bjrr tapraaantatlva Application So Snrpluaas ' I (Poor utilisation) Shortagaa (dlssarvlca) 10 11 12 1 2 Hm of Day, (Prima Ihlit) Flfora 2- Si Cumulative Consumption of CPO lumrci by On* Application T / / / / / / / / / / ? i/f At i •H a-f 41 e § V M /// *M 1 If a / // i / / £ / // / // £ III g / / / III ■ q III ' / 1 o. * III II 1 Is •a If I 1 I / 1 sVI f II 1 o 1 / 1 m If 1 t 41 If 1 m If 1 I if 1 If f If / If / if f i 1 f Jl 9a.m. Tin* of Da? (Prima Shift) 5p.m. 201 The last line of Table 2-1 gives the surplus or shortage of service for each hour in the window without using any beginning inventory at 9 a.m.. The entries on this line are simply the differences between the cumulative > CPU dedicated ana the cumulative > CPU consumea by the application by the end of each hour. The provision of any amount of beginning inventory will increase every entry on this line by exactly that amount. In our example, it would be wasteful to start with an inventory of more than '60> CPU, the most negative inventory balance for this application. This value of 30 is the same amount indicated in figure 2.3 as the intercept OS. We propose that shortages be related to measures of aisservice while surpluses should be related to poor utilization of resources. Aggregate computer Capacity Planning Using iisai Programming Suppose x . , is it, . consumption k it the amount of resource consumption requested by the application with priority i, (i = 1,2, . ..,k) in period (hour) t, during a window of T hours. _„Let 2 t=i X i ~ T ' be the average hourly > CPU dedicated to the application. Using requirements planning formulation, let S. be the amount of > CPU usea as setup or beginning inventory for this application and let net inventory in hour z be represented by _ t=z l iz = S i + z x i - 2 x ifc for z=1, 2, ...,T. The average inventory per window will then be given by z=T 2 1 . a=JL iz l" = - 1. iff 1. < and otherwise. The total deficiency in the supply of the resource to the application would be z=T z=1 Let c~ be the unit cost of shortage or disservice for the application with priority i; c. the unit cost of holding unused resource in inventory per unit per window and c. the cost of setting up beginning inventory of 1% CPU. These unit costs may vary with the time of day but we shall not introduce these complexities here. We note however that c~ 2. c~ if priority i is higher than j. This means that we assume that the value of service to each class of users, as evaluated by the installation management, is consistent with the priority given to that class of users. The resulting multidimensional objective would then be to Minimize Z. = cS. + c. 1. ♦ c + 1_. for each of the i = 1, 2, ...,k applications. i=k subject to 2 * it S. 100 for t=1,2,...,T 1=1 and x. 2 Using this approach, the solution procedure successively seeks the achievement of the objectives in order of priority. In effect, higher priority goals may very well be achieved at the expense of the lower priority objectives, LLee and Moore, 1974], For instance, when a resource is known to be totally consumed, i.e. 100} utilization, one expects that service requirements of higher priority users are met possibly to the exclusion of those of lower priority users. T while the shortage in hour z will be 202 4. Suggests KYhpn.oirm« A different pair of associated with each 1 . and 1 . is S.. In our illustrative application, for instance, four examine the following i~i triplets or bundles: one may IS. , 1. , 10*19. ^Stt}; 1 (10,29.5,3b); 120,39.5,1b) and 130,49.5,0). An installation management's preference or indifference among the bundles will be used to order the operationally feasible combinations and this ranking may then be used to impute the _numerical relationships between c . , c" and c . . 11 i So far, objectives have been formulated and priorities assigned along job class lines. A more useful framework may be one where management's multiple goals are specified in terms of turnaround/response time, availability/utilization of (sub) systems, workload balancing and queueing problems in the system. For example, a top-priority, one-sided goal that is not open-ended may be that the turnaround time for a given class of batch jobs be no more than a numerically specified interval of time. Most companies now view distributed systems to be more efficient in providing better service to the users than are centralized systems LSvobovoda, 197b j because of the former's virtually superior availability, reliability, flexibility and security. One of the most troublesome problems now introduced by the addition of telecommunication (switching and transmission) interfaces to the existing hardware/software interface in centralized systems is how to distribute shared resources such as databases both physically (hardware topology) and logically (software protocol) in a way that minimizes congestion and deadlock in the system. resources in these computer networks. Furthermore, other forms of goal programming which permit one to make progress toward all objectives simultaneously, LBradley, Arnoloo and Magnanti, 1977 j, may also be used. L1] Bradley, S.P., Arnoldo, C.h. and hagnanti, T.L., Applied hajjasmatical Programming Heading, Mass.: 1977. L2] Bronner, LeeHoy, Capacity Planning implementation , IBM Technical Bulletin UG22-9015-00, January 1979. Lb J Buffa, K.L. and Miller, J.G., Product ion-Inventory Systems: Planning .and. Control, homewood, Illinois: Richard D. Irwin, 1979. 14] Ferrari, D. Computer SjafcSBa Performance Evaluation, Bngiewooo Cliffs, New Jersey: Prentice-Hall Inc., 197b. L5J Urochow, J.M. , "A Utility Theoretic Approach to Evaluation of a Time-Sharing System," in Freiberger, w. (ed.) Statistical Computer Performance evaluation, New lork: Academic Press, 1972 pp. 25-50. Lb J Lee, S.h. and Moore, L.J., "A Practical Approach to Production scheduling," Production jand Inventory Management . 1st Cuarter, 1974, pp. 79-92. L7J Svobodova, Liba, "Performance Problems in Distributed Systems," Proceedings nf. CI PS Session '2b_, Bdmonton, Alberta, Canada; May 23-25 197b. The goal programming approach suggested here, being a multidimensional form of linear programming, will be well suited to the allocation of scarce 203 Statistical Methods 205 Adaptive Load Control in Batch-Interactive Computer Systems Samuel T. Chanson and Prem S. Sinha Department of Computer Science University of British Columbia Vancouver, B.C. V6T 1W5 Canada This paper presents a systematic approach to estimate the satura- tion point of a large computer installation using operational analysis. An expression for saturation point as defined by Kleinrock is derived in terms of measurable and operational quantities. Using stochastic programming and time ser~ _ analysis the optimal number of batch jobs that should be activated within next interval is then computed, so that the system is neither underutilized nor over-saturated. Keywords: Multiprogramming; response time; throughput rate; saturation point; load control; queuing theory; operational analysis; optimziation. 1. Introduction Most large scale computer systems employ some form of load control to maintain a high throughput rate and/or to provide an acceptable level of service to the users. For paging systems this is often accom- plished by manipulating the degree of multi- programming or equivalently the size of the resident sets of the active processes to keep the system from becoming saturated. Previous work directed towards this end is primarily represented by the development of the working set policy [1,2,3,4] where working set is defined as minimum number of pages needed to run a program efficiently. More recently, efforts to optimize the system work capacity lie mainly in keeping some measures related to program behavior (usually paging behavior) within some pre- determined bounds [5,6,7,8]. The 50% criterion [7] for example, aims at main- taining the utilization of the paging device to around 0.5. The L=S criterion [6] pro- poses to keep the system life time to approximately that of the page swap time. The knee criterion [5,8] suggests that the mean resident set size of each process should be maintained at the value associated with the primary knee of its life time function, where life time is defined to be expected time between two successive page faults [5]. Though the most robust of the three, the knee criterion, is also the most costly to implement and involves the largest amount of overhead. Though these criteria are not based on mathematical models and cannot be proved to be optimal, they aim at increasing the throughput rate by loading the system up to the point when the measured indicator suggests further increase in system load may cause ' thrashing ' . The methods cannot be appli. i to non-paged systems. Furthermore, for interactive systems and combined batch-: interactive systems, one is interested not only to maximize the system throughput rate but also to guarantee good response times to the interactive jobs (possibly at the expense of the batch jobs). Landwehr [9] studied a combined batch-interactive system and 207 proposed a scheme to activate batch jobs based on the terminal load. The emphasis of the study, however, was on model formu- lation and validation. There was no attempt to prevent the system from saturation or to optimize performance. As well, there is no easy or systematic way of determining the values of the break points. Hine et. al. [10] studied the problem from a slightly different viewpoint. Their goal was to control the main memory allocation for each class of jobs to provide different response times to each while maximizing the CPU utilization. They employed a mathematical model but optimization was achieved by an exhaustive search technique. A heuristic was also given which provides good but not optimal results. "mean response time saturation load No. of active terminals Figure 1. Mean Response Time vs the Number of Active Terminals In this paper we study the performance of a combined batch-interactive computer system using the operational analysis technique proposed by Denning and Buzen [14]. The control algorithm determines from time to time the number of batch jobs (if any) to be activated from the batch queue. The control criterion aims at keeping the system from saturation (to be defined in the next section) while minimizing the mean number of jobs waiting to be activated. The effect is to maximize the throughput of the system while maintaining good response time for interactive jobs. 2. Estimation of System Saturation Definitions of system saturation have been proposed [12,13,14]. Invariably the system is considered saturated at the point the response time vs system load curve starts to rise rapidly. Kleinrock [12], for example, using the number of active termi- nals as the load, defined the system's sat- uration point to correspond to the inter- section of the mean normalized response time curve asymptote and the horizontal line corresponding to the minimum response time (i.e., when there is only one active terminal). (See Figure 1). If a system is not allowed to get saturated according to this definition, the mean response time of the active jobs will not exceed an accep- table level. However, the implicit assump- tion is that the program population considered is both homogeneous and station- ary. Our approach is to compute the system saturation load at small intervals (such as a few seconds) during which the stationary assumption is justified. The homogeneous assumption is discussed below. The computer system is often modelled as a central-server model [11,13]. Consider such a model with M service centres and a degree of multiprogramming equal to N. Each service centre consists of a device and its associated queues (Figure 2) . ^1 -*CP^L ? 2 ^3 ^M -d- :s 12 q 3 M O-i Sm_ Figure 2. Central-server Model with M Service Centres The service centre S^ is the CPU service centre (the central server) . On completion of the CPU service a job either leaves the system or joins another service centre. A job leaving service centre S^, i=2,3,...,M must join the central server. 2.1 Notation The quantities defined here and computed in section 2.2 are mean values within an observation period and as such are functions of time which is omitted for clarity. 208 T : observation period X^ : observed number of completions at centre Sj during T B A : the total amount of time during which the service centre S^ is busy during T C-£ : observed number of requests for centre S^ during T q^ : request frequency, the fraction of jobs proceeding next to service centre S^ on completing a service request at the central server =c i /x 1 , m 2.2 Operational Quantities We now compute the operational quanti- ties in order to obtain an expression for saturation point. Mean service rate of server S^ = u^ = X^/B^ Utilization of server S.^ = p ± = Bj/T System throughput rate f = (X-, • q-i)/T X^ B! x = — • -J • qi Bl = v p q x 11 (i) Utilisation f ± = h ± /T x l B i Xi A i L A l »1 Xl B ± X ± Under the job flow balance assumption X ± = C ± m •*■ Pi = h " !r * « ± . & 1 < 2 ) M 1 M Vi .*- Z f ± - f ! + I r x • »r • q ± i=l x i=2 L M i 1 When there is only one job in the system M I h - ' ■ i=l M y, i=2 "i ^ ^(D - 1 1 ^ ' q ± + i] -1 (3) R(N) = N/f , by (1) N Nqi y l q lPl ~ P iPi q l ill 1 M ^i by (3) R(l) =~r il ~r — [ Z — • q, + l] ? q l 1=2 *± * (4) (5) The equation of the asymptote (as N appro- aches infinity) is more difficult to derive. Let us first consider the simple case of a non-virtual memory system. The asymptote occurs at the point when the utilization of a service centre (i* say) reaches unity (i.e., it becomes the system's first bottle- neck) . From Buzen's analysis [11], i* is that service centre which has the highest utili- zation in an interval (i.e., i* may vary from interval to interval as the work load characteristics change) . If it is the CPU the equation of the asymptote is simply ROO = Mi (6) Otherwise, using equation (4) and noting that u . as well as the ratio (q-^/q^) remains unchanged as N increases, the equation of the asymptote is RCO = Nq i* p i* q l i*^l (7) For a paging system, the eventual bottleneck as N approaches infinity must be the paging device but it need not be the first device to saturate. Case (i), the paging device is not the first to saturate. In this case, as the system is satura- ted before the paging device is fully utilized, the asymptote should be computed based on the first device to reach satura- tion and equation (7) is still valid (see Figure 3). We use Little's law to compute the mean response time 209 Mean response time paging device becomes the bottleneck , a non-paging device is saturated saturation load degree of multiprogramming, N Figure 3. Mean response time vs N, a non- paging device is the first to be saturated Case (ii), the paging device is the first to saturate. The ratio q^^/q^ continues to increase as N increases and approaches infinity as N approaches infinity. A realistic approach consistent with the one used in Case (i) is to use the value of q^^/q^ corresponding to the point the paging device first becomes fully utilized. However, this ratio is not easy to estimate. The observed value of q.^/q-^ can be used if the system is close to saturation (i.e., N* = N, see below) when the parameters are measured. Otherwise the saturation load will be under-estimated . This is not a problem when the work load is light. As can be seen subsequently, if the system work load then gets heavy, the control policy will adjust to it and the observed ratio will again approach the desired value. The saturation load N* satisfies equations (5) and (6) M V N* = 1 + 7 -q. if the CPU is the i«2 bottleneck or it satisfies equations (5) and (7), N* = *i* M y. mq u l M i* 1=2 «o V. q ± + i] (8) (9) All of the above equations can also be de- rived using queuing theory. Most proposed schemes assume a fixed saturation load. The Michigan Terminal System [15] for example computes the values of five load factors at fixed intervals and if one or more exceeds the corresponding predetermined static saturation value, the system is assumed to be saturated. For the 50% criterion, the saturation point corres- ponds to when the utilization of the paging device exceeds 0.5+c, where c is some small positive constant. The L=6 criterion to a certain extent assumes the system to be saturated when the system life time is below the page swap time, which is fixed for a given paging device. In a previous report [15], we have shown that the saturation load is really a function of the characteristics of the current work load and cannot be very well represented by some constant measures. For the present model, these work load charac- teristics are q^ and y i , i=l,2,...,M. Any model which does not take this into consi- deration will sometimes over-estimate and sometimes under-estimate the system satura- tion load. The fact that the over-estimation on the average is equal to the under-estima- tion provides no comfort when the goal is to optimize performance at all times. 3. Load Control The first criterion for load control is to keep the system from saturation. From Figure 1, it is seen that the mean response time increases rapidly beyond this point. Furthermore Denning [16] has shown that 'thrashing' (and thus reduced throughput rate) occurs when the paging device is saturated. For multiprogrammed paging computer systems, the simplest way to accom- plish this is to keep the number of active jobs below N* given in equation (8) or (9). Since the system throughput rate is a non- decreasing function of N before the system saturates [8] , activating N* jobs whenever possible will also maximize the throughput. There are three cases to consider: (L) the system is saturated (L.e. N>N*) , (Li) the system is under-utilized (L.e., N«N*) , (Lii) the system is close to but not yet saturated. Case (iii) is the interesting case since if the system is under-utilized, it is unnecessary to apply any control measure but to activate each job as it arrives until the condition for case (Lii) is reached. If the system load is then properly controlled, the system should attain saturation (case (L)) infrequently and only for brief periods. The control, when the system is saturated, could simply consist of not activating any more batch job until the system comes out of saturation. If the system is in the satura- tion state frequently and for extended 210 durations then it is highly probable that the hardware is inadequate to handle the normal work load and should be upgraded. Thus we shall consider only case (iii) in this paper. Ve note that many systems (e.g. the Michigan Terminal System [15]) do not apply any control until saturation is detected. This, in our opinion, does not constitute proper load control. 4. Optimization Ve define the following variables all of which are functions of time T which is omitted for clarity: I : number of batch jobs waiting to be activated during T, J : number of terminal requests during T, E : expected number of terminal arrivals in c the next interval. E, : expected number of batch arrivals in the next interval, D : expected number of job departures (both batch and terminal) in the next interval, N, : optimal number of batch jobs that should be activated, N : optimal system capacity (measured in terms of the number of jobs) to be reserved for expected incoming terminal jobs, S : remaining system capacity (defined as the number of additional jobs that can be accommodated without saturating the system) . S can be approximated by S = N* - N + D The problem is to determine how many of the S jobs should be filled by waiting batch jobs (Lf any) . Our objective is to maximize the mean system throughput rate without saturating the system. This is equivalent to minimizing the expected number of jobs that have to wait at each interval because admitting them would saturate the system. Ve shall minimize a weighted sum of the waiting batch and terminal jobs which is a more general problem. Let the weights be C^ and C2 for batch and terminal jobs respectively. The opti- mization problem is therefore (10) (11) min{(E t -N t )C 1 + (E b -N b )C 2 > subject to N + N < S N t C2 (i.e., terminal jobs are favoured) , it is easy to see that the solu- tion to the above optimization problem is N t = E t if E t < S N ■ S if E fc >_ S N b = S - E t N, = D (12) In the above computation, it is assumed that E t and D are available at the beginning of the interval. If E t N* for a short period at the beginning of the interval. This problem can be alle- viated by spreading out the activation of the N^ batch jobs throughout the interval instead of all at once at the beginning. It remains to show how E^ and D can be computed using smoothed statistical estimates. Let P t be the expected prediction of the parameter for the period [t,t+l]. Let x t be the observed value of the parameter at time t. P t can be expressed as p t = (1-3) (x t + ex^ + 8 2 x t _ 2 + ...) = (1-6) I B ± X t _ i i=0 where the exponential weight factor 3 is a constant between zero and one. Similarly, 00 P - (Hlh^ 1=0 p t = u-e)x t + BP t _ r (13) Now let the error made at time (t-1) in predicting x t be e , then « t = X t - P t _ x (14) substituting in equation (13) , p t = x t - tet = P t-1 + (l-g)e (15) 211 Now if e fc is relatively small, we do not recompute the value of 6. If E t is large, we find a new value of 8 which will minimize the sum of the squares of errors given by I {x -a-*) l e j x ,r j=0 X J ± (16) i=t In practice, the summation in (16) does not have to involve many (k,say) terms before 3* approaches zero, g does not have to be very accurate and standard techniques exist for its efficient computation. To summarize, the control procedure consists of the following steps: 1. During an interval T, observe D, J, N, qi» V ± , i=l,2, . ..,M. 2. Compute N* using equation (8) or (9). 3. Estimate the expected number of terminal arrivals and total departures in the next interval using equation (15). 4. Compute the number of batch jobs N^ to be activated in the next interval using equation (12) . 5. Terminal jobs are immediately activated upon arrival. 5. Conclusion A model to estimate the saturation of a computer system has been presented which is capable of adjusting to varying work load characteristics. Based on this, the number of batch jobs that should be activated to minimize a weighted sum of the number of jobs that will have to wait upon arrival without saturating the system in a combined batch- interactive environment is computed using optimization theory. The approach thus provides good mean response time to the terminal jobs and maximizes the system throughput rate under that condition. The second level of load control - that of the selection of the type of jobs to be activated has not been discussed in this paper. Work has started in this direction. However, because of the difficulty of predicting accurately the resource demands of a job before it is executed and because of the adaptiveness of the proposed scheme which is capable of correcting itself it may be that equally good results can be achieved without it. The use of the number of jobs to charac- terize the system load is of course not pre- cise as jobs do not necessarily have the same resource demand characteristics. However, it has been shown to produce useful results [17,18], It is made even more acceptable for this application because of the built-in adaptiveness of the proposed policy. If the average resource demand of the activated jobs in the next time interval is lighter than that of the previous inter- val (on which the remaining system capacity was estimated) , then the updated remaining system capacity will increase and more jobs will be activated in the next interval. On the other hand, if the average resource demand is heavier, then the remaining system capacity will decrease and fewer (or no) jobs will be activated in the next interval. References [1] Denning, P.J., "The working set model for program behaviour", Comm. of ACM 15, 5 (Hay 1968), 323-333. [2] Denning, P.J., "Virtual memory", Computing Surveys 2,3 (Sept. 1970), 153-189. [3] Denning, P.J., "Third generation computer systems", Computing Surveys 3, 4 (December 1971), 175-216. [4] Rodriguez-Rosel, J. and Dupuy, J. P., "The design, implementation, and eva- luation of a working set dispatcher", Comm. of ACM 16,4 (April 1973), 247-253. [5] Denning, P.J., Kahn, K.C., Leroudier, J., Potier, D. and Suri, R. , "Optimal multi- programming", Acta Informatica, 7 (1976) 197-216. [6] Denning, P.J. and Kahn, K. , "An L=S criterion for optimal multiprogramming", Proc. Int'l. Symp. on Computer Per- formance Modeling, Measurement and Evaluation , (March 1976), 219-229. [7] Leroudier, J., Potier, D. , "Principles of optimality for multiprogramming", Proc. Int'l Symp. on Computer Perfor- mance Modeling, Measurement and Evalua- tion (March 1976), 211-218. [8] Graham, G.S . and Denning, P.J., "On the relative controllability of memory policies", Proc. Int'l. Symp. on Computer Performance, Modeling, Measure- ment and Evaluation, (August 1977) , 411-428. 212 [9] Landwehr, C.E., "An endogenous priority model for load control in combined batch-interactive computer systems", Proc. Int'l. Symp, on Computer Perfor- mance, Modeling, Measurement and Eva- luation , (March 1976), 282-287. [10] Hine, J.H. , Mitrani, I. and Tsur S., "The control of response times in multi-class systems by memory alloca- tion", Comm. of ACM 22,7 (July 1979), 415-424. [ll] Buzen, J. P., "Analysis of system bottle- necks using a queuing network model", Proc. ACM-SIGOPS Workshop on System Performance Evaluation (April 1971) , 82-103. [12] Kleinrock, L., "Certain analytic results for time-shared processors", Infor- mation Processing (Proc. IFIP Congress 68), (1968), 838-845. [13] Ferrari, D. , Computer Systems Perfor- mance Evaluation , Prentice Hall, 1978. [14] Denning, P.J. and Buzen, J. P., "The operational analysis of queuing network models", Computing Surveys 10,3 (September 1978), 225-261. [15] Chanson, S.T., "Saturation estimation in interactive computer systems", Technical Report 79-7, Dept. of Computer Science, University of British Columbia, (June 1979) . [16] Denning, P.J., "Thrashing: its cause and prevention", Proc. FTPS , 33, (FJCC, 1968), 915-922. [17] Scherr, A.L., "An analysis of time shared computer systems", Ph . D . Thes is , Dept. of Electrical Engineering, M.I.T., Cambridge, Mass. (June 1965). [18] Chanson, S.T. and Ferrari, D. , "A deterministic analytic model of a multiprogrammed interactive system", NCC, AFIPS Conference Proc , 43(1975), 645-652. 213 Sensitivity Analysis and Forecasting for Large Scale IBM Computer Systems: A Methodological Approach and Case Stuu V Carl Steidtmann Staff Consultant Mountain Bell Denver, Colorado This paper outlines the development and use of a new method of forecasting computer capacity for large scale IBM computer systems at Mountain Bell. The model that was developed to accomplish this task uses three different statistical techniques. A representation of the computer system is developed using two stage least squares multiple regression. An autoregressive integrated moving average process is then used to individually forecast the level of use for each of the components of the system. These values are then placed into a sys- tem of simultaneous equations which are then solved to give the desired results. The data that was used in this project was collected over the course of a year by hardware monitors off of an IBM 3033 that was the global processor in a MVS/JES3 triplex in Mountain Bell's Colorado/Wyoming processing center. Key words: Box-Jenkins; forecasting; multiple regression; sensitivity analysis; simultaneous equations. 1. Introduction In data processing two of the most dif- ficult forecasting problems that management must face are: 1. When will our present computer system run out of capacity? 2. If we make a change in our present computer system, what will be the impact of the change on the different components of the system? By developing the proper answers to these questions management in data proces- sing can provide the best possible service to the organization at the least cost. The model presented in this paper seeks to provide data processing management with the answers to the above questions. In answer- ing these questions the model gives its user an understanding of when new hardware needs to be ordered and what the impact changes in the present computer system will have on the other components of the system. By forecasting computer utilization on a component by component basis through the use of simultaneous equations, sensitivity analysis can be conducted to ascertain the impact of a change in one system component on all of the other components of the system. 2. The Theoretical Model In developing the theoretical dimen- sions of this model two design criteria were constantly kept in mind. The first criteria was that the output of the model had to pro- vide useable forecasting information. The second criteria was that the model had to have the capability of answering 'what If questions. While it is a relatively simple process to develop a forecasting model for computer capacity, creating a model that 215 gives its user the capability of performing sensitivity analysis is a bit more difficult. In trying to meet these two design criteria it was felt that the best results could be obtained through the combination of three separate forecasting techniques. The three techniques that are used in this model are multiple regression, Box-Jenkins time series analysis and simultaneous equations. Multiple regression alone can be used as an effective forecasting tool if the data conforms to a rather strict set of assump- tions. Since the data in question did not conform to several of these assumptions mul- tiple regression was used only to quantify the interrelationships that exist between the different components of the system. A second statistical method, time series analysis, was selected to make point fore- casts about the level of system component utilization. While time series analysis would by itself provide an adequate means of forecasting component utilization, it fails to take into account the dynamic nature of the process that is under study. To incorporate this perception of a dynamic process into the model, the equations that were developed in the multiple regression portion of the study were merged with the point forecasts that came from the time series analysis. This merger of the output of these two statistical techniques produced a set of equations that were solved simulta- neous to produce the final forecast. 2.1 Model Development A methodological flow chart that out- lines each step involved in the development of the forecasting model is given in Figure 1. The first step that must be taken in the model design process is an evaluation of the data that is available for forecasting pur- poses. This data can come from System Management Facility (SMF) data, hardware monitors, or Resource Management Facility (RMF) data. If other sources of data are available these can also be used. Once this data evaluation process is completed the selection of the system components that are to be used in the model can proceed. There are no restrictions as to which system com- ponents can or should be used in the model. Selection of system components depends entirely on the interests of the model builder and the availability of the data. Some of the components that can be used include: 1) Central Processing Unit Busy 2) Information Management System (IMS) Transactions 3) 4) 5) 6) 7) 8) 9) 10) ID a) Rate b) Volume c) Response Time Problem Program State Level Supervisor State Level Swapping Rate Paging Rate Channel Busy Control Unit Busy Device Busy TSO Usage a) Number of Users b) Response Time c) Volume of Transactions Resource Usage by Application Once all of the data has been collected and the appropriate variables have been identified, the next step is to determine the correlations and the interrelationships that exist between the selected system com- ponents. This task is accomplished through the use of the statistical tool of two stage least squares multiple regression. FIGURE I FORECASTING METHODOLOGY Data Box-Jenkins Identification Stage I Regressions Nelder-Mead Function Min. Stage II Regression; Marquardt N-L Least Squares Box-Jenkins Forecast 7 Simultaneous Equations Model Forecasts Sensitivity Analysis 216 2.2 Multiple Linear Regression If there are n different system compo- nents that have been included in the analysis then each individual system component will yield an equation of the following form: Using matrix notation this set of equations can be reduced to: Y txl = B txt X txl U txl (4) Y n = B +B l X l +B 2V-- +B n-l X n-l +IJ n where: (1 ) The beta values in this last equation repre- sent the technical coefficients of the system under consideration while the X values repre- sent the different system components that were used. .X U. is the dependent system component is the intercept value are the regression coefficients are the observations of the independent system components represent the error terms Using matrix notation for each system compo- nents we get: Y = XB + U (2) Where Y Y Y yH . x= ...x tl " B l U l ...x t2 B 2 U 2 ••• X t3 B 3 U 3 ... 3 = LJ= . -^tn B t \ _ — — 1 ' X 21 * X 31 i .x 23 ,x 33 1,X 2n' X 3n Where: t = The number of system components under consideration n = The number of data observations being used in the analysis Extending the application of multiple regres- sion to all t components of the computer system we get the fundamental equation of the model. This can be expressed as: Y l = B 01 +B ll X l +B 2lV • B t-l,l X t-l +U l Y 2 = B 02 +B 12 X l +B 22 X 2-" B t-l,2 X t-l +U 2 Y 3 = B 03 +B 13 X l +B 23 X 2--' B t-l,3 X t-l +U 3 The beta values must be interpreted within the context of the X and Y components of the equation in which the beta values are found. In their totality the beta values can be viewed as indicators of overall system performance and balance. Generally speaking, positive beta values should be viewed as indicators that the system is in balance. Similarly, negative beta values should be viewed as possible indicators of system bottlenecks. For example, if the Y value represented Central Processing Unit (CPU) busy, a negative beta coefficient on an X value that represents channel busy for some Direct Access Storage Device (DASD) channel tells us that as channel busy Increases, CPU busy declines. This situation would seem to indicate that the processor in question was I/O bound on the channels that had negative beta coefficients. Further analysis would be required to confirm this conclusion. Under different circumstances if the Y value represented channel busy a negative beta coefficient on an X value that repre- sented channel busy for another channel indi- cates that as channel X busy increases channel Y busy decreases. This situation would tend to indicate that the channels in question were in contention with each other. A similar anlaysis and conclusion could be drawn for individual devices and control units that exhibited the same X,Y and beta coefficient values. Positive beta coefficients under the right circumstances could also be viewed as an indication of potential system problems. If the Y value represents IMS response time and the X value in question represented channel busy a positive beta coefficient would tell us that as the channel busy (3) increased so would the IMS response time. The channel in question could represent a bottleneck for the IMS application. Again, further study would be necessary before a definitive conclusion could be drawn. B 0t +B lt X . +B 2t X 2 B t-l,t X t-l +U t 217 2.3 Assumptions 2.4 Box-Jenkins Time Series Analysis In using the statistical tool of multi- ple regression, the analyst must make a number of assumptions about the nature of the data and the relationships between the variables that are involved 1n the analysis. The first assumption implicit in one's use of multiple linear regression is the assumption that the relationship that exists between^the Y and X values is a linear one. A low R value is a statistical way of saying that the independent variables do not explain the variation in the values of the dependent variable. [1] The second assumption that is implicit in the use of multiple regression is that the error terms (U ) exhibit homoscedasticity, that is to say that the errors have a constant variance over the range of empirical observations. A third assumption that is made in using multiple regression is that the error terms (U ) are not autocorrelated. The existence of autocorrelation in error terms can be tested via use of the Durbin-Watson statistic. The absence of autocorrelation means that the level of one set of observa- tions of independent values does not influence any other set of observations. In using data that measures device utilization of the different components of a computer system this assumption is often violated since what transpired on a computer system today can influence what will happen tommorrow. [2] The fourth assumption that is made in using multiple regression is that the error terms are normally distributed. In dealing with a large number of data observations as is the case in doing computer capacity fore- casting this assumption does not pose a problem. [3] The fifth assumption is that collinearity does not exist between the independent vari- ables of the multiple regression equation. It would be virtually impossible to select independent variables that exhibited no collinearity. If several variables are found to be collinear then they should be dropped from the analysis since they offer no additional explanation of the variation of the dependent variable. [4] The final assumption of a multiple regression model is that the postulated equation or model is in fact the true model. This assumption of model significance can be evaluated through the use of a F-test. Once the structural characteristics of the computer system have been ascertained, the next step in the model building process is to forecast a trend in the use of all of the components of the system that make up the model. Collectively these trended values will represent the X vector in the equation: Y = B * X + U (5) The statistical technique that was selected to perform this task was Box-Jenkins time series analysis. In developing the forecasted values for the X^ vector, using the Box-Jenkins approach involves a three step process. The first step is to identify the functional form of the Box-Jenkins model to be used. Once the form of the model has been identified the next step is to estimate the value of the parameters of the functional form. Once both the functional form of the model and the parameters of the model have been estimated the final step in the process is to make a forecast. 2.5 Model Identification The purpose of model identification is to determine the appropriate class of model that the data represents within a general class of models. In making this determination the value of three different parameters that represent the nature of the data must be identified. The first parameter or order that must be identified is the degree of differencing that is required to make a time series stationary. A stationary time series is one that has a fixed mean value while a non- stationary time series such as the utilization of a computer system have an increasing or decreasing mean value. A stationary process can be squeezed from a nonstationary one by differencing the nonstationary process 'd' times. Thus we have: vx t = x t - Xl>1 (6) The number of times that the time series must be differenced to achieve stationarity becomes the value of 'd' . Once the time series has been differenced as many times as necessary to produce a 218 stationary process, the autoregressive para- meter, p, and the moving average parameter, q, can be estimated to give us a model that describes the time series model of order (p,d,q). The process of determining the order of the autoregressive parameter requires a combination of applied statistics and tempered judgment. The process begins by calculating the values of the autocorrelation coeffi- cient. These values represent the ratios of the covariances of the time series at dif- ferent time lags. Thus the autocorrelation coefficient for lag j is: p j V. v o (7) Where: »j ■ E[Z t - r)(Z t+ . - r)] r - l./t * z Z t Z. = The data observation of the time series at time t. The order of the autoregressive process is generally assumed to be 0,1, or 2. If the time series in question has an autore- gressive order whose value is zero then only P-, will be a nonzero value. All of the remaining P. values will equal zero. For an autoregressive time series whose order is 1, the value of the autocorrelation coefficients will tail off in an exponential manner as j increases. For an autoregressive process whose order is 2 the values of P. will exhibit either a mixture of exponentially increasing and decreasing values or they will exhibit a damped sine wave pattern. Partial autocorrelation coefficients can next be derived using the autocorrelation coefficients that were calculated in the last step. For a time series whose autoregressive order is K we have: p j =0 ki p j-i + --- kCk-D p j-k + i + Vj-k (8) 5=1,2 k Where all of the above values represent the partial autocorrelation coefficients. Putting these equations into a matrix struc- ture yields the Yule-Walker equations that can be expressed as: 1 PI P2 Pi 1 Pi P P «— 1 K-2 Pi 1 K-l "\r "pi" K-^ \i P2 - P3 1 • Ak P < These equations are then solved for all relevant levels of K, the order of the autoregressive process. [5] Once all of the values of have been determined the order of the moving average process can be determined. Like the order of the autoregressive process the moving average order is generally assumed to be o,l, or 2. The order of the moving average process is assumed to be zero if all of the values except 0,, are zero. The order of the moving average process is assumed to be 1 if the values decline in an exponential manner. The order of the moving average process is assumed to be 2 if the values represent either a mixture of increasing and decreasing exponential movement or if they exhibit a damped sine wave pattern. [6] It should be stressed in concluding this section that the process of model identification is an iterative process that in many cases must be done repeatedly in order to get the desired results. This is due largely to the fact that time series model identification remains an art, and not science, that is based on rules of thumb rather than unbreakable laws of science. 2.6 Model Identification Once the values of p,d, and q have been identified for the time series in question, the next step in the process is to identify the values of the parameters of the process that was identified in the previous step. The parameters to be estimated depend on the characteristics of the model that has been identified. A simple moving average model of order q, MA(q), takes the form: Z t = A t t-1 q t-q (9) Where: A t = Z t " Z t-1 The parameters of the model to estimated are: 219 9 1 ,0 2 ,...9 q Mt 3bl (13) A simple autoregressive model of order p, AR(p), takes the form: Z t =0 l Z t-l + 02 Z t-2-Vt- P +A t < 10 > The parameters to be estimated from this model are: Mz-'-A A more complex model that mixes both the autoregressive and the moving average model of order p,q or ARMA(p,q) takes the form: h=h Z t-l +0 2 Z t-2- • - + Vt-p +A t- X - • Vt-q (ID The parameters to be estimated from this model are: P-i » vp > • • " *n 1 '?'" ' * n Finally, an autoregressive integrated moving average model of order p,d,q or ARIMA(p,d,q) takes the form: (12) Z t =(1+0 1 )Z t-l + - • Vd Z t-p-d- 9 l A t-l ' -Vt-q +A t The parameters to be estimated from this model are: r 2 •Vd'W .9 Where: a t represents the residual values of the estimated function. b. represents each of the parameters to be estimated. From the X.. values a I x J matrix, referred to hereafter as X, is formed where: x. = zx 1t x 1 . (14) The next step is to form a Y vector where the elements of Y are: Y. « EX n A t (15) The X vector and JK matrix produce a set of modified linearized equations much in the same fashion as found in multiple regression and in a like manner the values of the B_ vector are solved for where: Y = XB (16) The values of the B vector, the Y_ vector, and the X matrix are scaled before the system of equations is solved for B. The scaling factor that is used in this step is: C. = /X The statistical methodology that was used to estimate these values was a Marquardt Algorithm for nonlinear least squares. The purpose of the algorithm is to fit a function to a given set of data points and in so doing estimate the values of the parameters in the function. The algorithm begins by asking for preli- minary estimates of the parameters. These initial values were estimated using the Nelder-Mead Function Minimization technique which uses a simplex technique to find a mini- mum value for a known function given the constraints as defined by the user data. [7] In cases where there is only one parameter to estimate the Nelder-Mead technique proves sufficient by itself. Given the initial values to start with, the following derivatives are evaluated for each step in the Marquardt process. Thus B * i * Y i * X ii li ■ V c i c j - B./C. (17) (18) = Y./C. = 1 + Q The new parameters for the B_ vector are next determined and the sum of the squares is cal- culated for the new set of B_ values. The new B values, are derived as: II Bq+1 (19) 220 Where: EL = The original estimates of B B_, = The new estimates of B_ * JL = The values derived from the pre- vious step. If the sum of the squares of the new estimate is less than the old, the parameter correction, B*, are tested for convergence. If all are smaller than the user set conver- gence value then convergence is assumed and a covariance matrix of the B, estimates 1s produced and the process is complete. If convergence has not occurred then the value of n 1s reduced by a user set factor of M and the entire process is repeated using the original EL values. If the sum of the squares of the new estimates is greater than that of the old estimates then n is incre- mented by a factor of M and the entire pro- cess 1s again repeated using the original values of Bg. [8] 2.7 Forecasting Once the time series model in question has been identified and the parameters of the model have been estimated the last step In this phase of the model building process is to generate a forecast for what will become the values of the X vector variables of our general model . The methodology that was selected to make the forecasts for each of the values 1n the _X vector of the general model was the difference equation approach as sug- gested by Box and Jenkins. The difference equation approach to time series forecasting begins by taking a condi- tional expectation of the difference equation form of the model that was identified for some time origin t. For example, if the model that was Identified in the first step of the time series process was an autoregres- sive integrated moving average model of order 0,1,1), the difference equation for this type of model would be: W^VWVt-Nt W^MVVt-i+VVt-i C20) Taking the conditional expectation of this difference equation would yield: z t + r (1+ 0. )z t-¥t-i +A f 9A t-i < 21 > (22) (23) Z t+i =(1+0 l )Z t+i-l" 2 Z t+i-2 for all i>2 For all forecasts beyond the time period of t+2 the A t and 9A. elements of the difference equation fall out because the future expecta- tion of all of the error components A. equal zero. Thus: E(A t ) ;?4) The A. value has a direct influence on the forecast for the first two time periods of the forecast since the actual value of the A t values can be used. For forecasts beyond time period three in the above equation the actual A t values cannot be used since they have not taken place and as such are not known. This is not to say that the influence of the A. values is not embedded in the fore- casts at lead times greater than two as all of the future forecasts are indirectly dependent on the forecasts in time periods one and two which in turn are dependent on the value of A t> [9] The Z. values that are used as seed values to start the time series are derived from a linear time series least squares. The mean value of the time series could be used if the time series was stationary. Since this is not the case for any of the components in a computer system, these seed values must be derived before the final forecast can be made. Once the forecast for each of the inde- pendent _X vector variables has been made, the forecast values are plotted out into the future and the general model itself can now be solved. 2.8 Simultaneous Equations One of the shortcomings of both multiple regression and time series analysis is the requirement that a distinction be made between explanatory and dependent variables. Once this distinction is made these statistical techniques imply that there is a certain degree of causality whose direction goes from the explanatory to the dependent variables. In a computer system maintaining this distinc- tion of explanatory and dependent variables based on assumptions of causality is a difficult if not spurious task. This is the 221 primary reason for turning to the technique of simultaneous equations where the values for all of the variables in the model can be determined at the same time. In dealing with the statistical technique of simultaneous equations the steps in the forecasting pro- cess that must be made are those of model identification, estimation, and the solution. 2.9 Model Identification The crux of the identification problem resolves around the statistician's ability to estimate the functional equations of the system under consdieration from the data that is available. In layman's terms the identi- fication problem asks the question of whether a solution exists to the system of equations that is both meaningful and unique. Whether or not a system of equations can be identified depends on its ability to meet the standard rank and order condi- tions that have been established for simul- taneous equation systems. The order condi- tion, which is a necessary but not sufficient condition for system identification states that in a model of M equations each equation in that model will be identified if it excludes at least M-l of the variables appearing of the variables appearing in the model. The rank condition which is both a necessary and sufficient condition for model identification states that in a system of M equations with M variables that are to be determined internally to the model an equa- tion within the system of equations is con- sidered to be identified if and only if at least one nonzero determinant of the order (M-l), (M-l) can be constructed from the co- efficients of all of the variables in the model that were excluded from that particular equation in question but included in the other equations in the system. [10] Once it has been determined that the model in question can be identified, the next step in the process is to estimate the equations in the system. 2.10 Model Estimation The method of model estimation that was used for this step in the model building pro- cess was two stage least squares multiple regression. Since multiple regression was discussed in length earlier in this paper, comments about its use will be limited. A two stage least squares approach is taken so as to minimize any correlation that may exist between dependent variables that appear in each regression equation and any error term of the other independent variables. The elimination of this correlation which would lead to bias in the final system of equations is accomplished through the use of two stage least squares regression. The first stage regression produces estimates of all of the dependent variables based on the original data. The second stage regression produces estimates of all of the dependent variables based on the estimates of the first stage. [11] 2.11 Model Solution The system of equations is solved by a reduced form algorithm that expresses each of the unknown variables of the system as a func- tion of the predetermined values of the model. This is done through a process of substitution that reduces each equation down to the point where there is but one unknown variable in each equation. Once the process of reduction through substitution has taken place the solution of each of the remaining equations for the value of the unknown variables becomes a trivial task. 3.0 The Data A listing of the data that was used in the case study is available on request from the author. In total there were 323 obser- vations where each observation represents the average percentage utilization of a parti- cular set of system components over the course of a day for the entire year of 1979. The data were captured by a hardware monitor and stored by ten minute intervals on a data base. For the purposes of this study, each of the 144 daily observations were aggregated and then averaged to produce the actual data observations. The different data elements that were used in this study are: Jul ian Date Central Processing Unit Busy Problem Program State Busy Idle-Stop Channel 1 Busy - DASD Channel Channel 2 Busy - DASD Channel Channel 3 Busy - MSS Channel Channel 4 Busy - DASD Channel Channel 5 Busy - Tape Channel Channel 7 Busy - DASD Channel Channel 8 Busy - DASD Channel Channel 9 Busy - MSS Channel Channel A Busy - DASD Channel Channel B Busy - Tape Channel 222 The major shortcoming of the data exists in the periodic gaps that can be found. These gaps in the data were due to either a failure in the hardware monitor, a failure in the system that was under study, or an error on the part of the hardware monitor operators. In total these gaps represent less than 10% of the total data that could have been cap- tured and in no way should bias the overall results of the model. 4.0 The Fitted Model In the multiple regression segment of the model building process a two stage least squares approach was taken to insure that the equations that were produced for the simul- taneous equation portion of the model were unbiased. The second set of regressions were run using the estimates generated from the first set of regressions. In the second stage regressions all coefficients that were not significantly different from zero at a confidence interval of 90% using a two-tailed t-test were dropped from consideration. The set of beta coefficients that were produced by this two stage process can be found in Table I. In addition to the beta coefficients there are a number of other vital statistics that were derived from each regression to ascertain whether or not any of the original regression assumptions were violated. A listing of these statistics for both the first and second stage regressions can be found in Table II along with their corre- sponding dependent variable. The statistics of interest in determining the usability of the regression equations are the values of R , F, and the Durbin-Watson statistic. An examination and comparison of these statistics between the Stage I and Stage II regressions will reveal that the critical statistics in question are pretty much the same. The F values from the Stage II regressions were across the board higher than the F values from Stage I although this is probably due to the dropping of the insigni- ficant variables in performing the Stage II regressions. The F values in Stage I varied from a low of 97 to a high of 541 and in all cases represent highly significant values at a 95% confidence interval. The F values from the Stage II regressions varied from a low of 130 to a high of 1643 and were also in all cases highly significant at a 95% confidence interval . 2 The R values for both Stage I and Stage II regressions were similar. In both cases the R values varied from .77 to .95 which would tend to indicate that a high degree of the variation in the dependent variables can be explained by the independent variables. The statistic that casts a dark pale over the results in general is the Durbin Watson statistic. As discussed earlier, the Durbin- Watson statistic is a measure of autocorrla- tion that might exist in the data under study. As anticipated the Durbin-Watson statistic in both Stage I and II regressions were in the range of. 9 to 1.7 which indicates that auto- correlation exists. The existence of auto- correlation will not bias the estimates of the beta coefficients which is our primary concern. At worst, the existence of auto- correlation will bias the variance of our estimates of the beta's which in turn would mean that the values computed for the F and R statistics are not as good as we had pre- viously believed them to be. Plots of the residuals from both Stage I and II regressions are avail bale from the author. A visual examination of these plots indicated that several of the standard linear regression assumptions about the nature and distribution of the error terms have been at the very least violated in spirit if not in fact. The assumption in question here relates to the lack of autocorrelation among the re- siduals, the existence of a constant variance and the selection of the X values by a fixed and repeated sample. Again, as in the case of the spurious Durbin-Watson statistics it was felt that these violations of the standard assumptions would only bias the variance of the regression coefficients and not the coefficients themselves. 4.1 Time Series Model Identification The process of identifying the structure of a time series is an art and not a science. In trying to make a determination of the values of the parameters p, d, and q there must be a trade off made between the differ- ent criteria that can be used for the purposes of identification. Given this disclaimer, the following is an explanation of the structure of the time series that were developed as inputs into the general model . The output listings for the autocorrelation coefficients and the partial autocorrelation coefficients are available from the author. For the simplicity, the identification notation used by Box-Jenkins to characterize the different types of models will also be incorporated here. Thus a simple first order autoregreesive model becomes AR(1) while a comparable moving average model would be MA(1). An integrated autoregressive model would be ARI(l.l) while a comparable moving average model would be IMA(1,1). Combining 223 Dependent 1 Variable Table 1 Beta Coefficients 5 6 7 8 1 2 3 4 5 6 7 8 9 10 11 12 13 44.06 -2.92 77.00 2.33 .83 -3.48 -.53 .89 .03 -3.83 .07 -.45 2.8 ,61 .20 .21 .08 .05 .06 .40 .69 .02 .04 .04 .04 -.36 -.82 .36 .42 .31 .82 -.27 -.39 -.09 -.14 .29 .23 ,03 .11 .27 .21 .62 .08 .10 .67 .41 .37 .24 .05 .28 .34 .34 .65 ,24 ,17 .25 .17 .05 .11 .17 .44 .09 .12 .08 .94 .91 -.37 -.39 -.17 .38 .42 .23 .13 10 .29 .09 ,24 .17 .56 .16 .17 .06 .10 .29 .14 11 -.39 .17 -.54 .79 -.21 .49 .18 -.02 .43 .09 12 13 ,30 .16 .09 ,24 ,14 ,14 ,40 .25 .05 -.46 -.07 .08 .82 .04 -.15 .07 Table II Stage I Statistics Regression Number Dependent Variable F R 2 Durbin-Watson 1 CPU Busy 109 .79 1.149 2 Problem Program State 111 .81 1.471 3 Idle-Stop 97 .77 .933 4 Channel 1 Busy 333 .92 1.264 5 Channel 2 Busy 137 .84 .949 6 Channel 3 Busy 255 .91 1.008 7 Channel 4 Busy 160 .85 1.019 8 Channel 5 Busy 540 .95 1.408 9 Channel 7 Busy 408 .94 1.345 10 Channel 8 Busy 164 .86 1.758 11 Channel 9 Busy 385 .94 .963 12 Channel A Busy 222 .90 1.090 13 Channel B Busy 541 .95 1.310 1 2 3 4 5 6 7 8 9 10 11 12 13 CPU Busy Problem Program State Idle-Stop Channel 1 Busy Channel Channel Channel Channel Channel Busy Busy Busy Busy Busy Channel 8 Busy Channel 9 Busy Channel A Busy Channel B Busy Stage II Statistics 130 .79 269 .81 215 .77 499 .93 207 .84 383 .91 295 .85 1643 .95 616 .94 180 .86 516 .94 298 .90 857 .95 1.167 1.442 .926 1.283 .920 .988 1.003 1.391 1.341 1.764 .956 1.084 1.291 224 both the autoregressive and moving average forms would yield ARMA(1,1) and in its inte- grated form this would be ARIMA(1 ,1 ,1). An examination of the autocorrelation and partial autocorrelation coefficients from the CPU busy time series revealed this data to take a structure such that p=2, d=l , and q=o. As such we can call this an ARI(2,1) model or an autoregressive integrated model of order 2,1. None of the partial autocor- relation coefficients of this time series at any degree of differencing had either an exponential or a damped sine wave pattern thus eliminating the moving average compo- nent of time series from consideration. In examining the autocoreelation coefficients after a first differencing had been taken, a damped sine wave pattern could be discerned. This pattern in the autocorrelation coeffi- cients tells us that the data is autoregres- sive of order 2. Taken with the need to difference the data once to achieve this pattern we can conclude that we have a time series of the structure ARI(_2,1). As could be expected, the other components of the system that were directly related to the central processing unit also took on an AR I C2 , 1 ) model structure. These components were the problem program state and idle-stop. For channel 1 , a DASD channel , the auto- correlation coefficients of the original series, before taking a first difference, were not significantly different from zero after the first lag. These coefficients, how- ever, exhibited a damped sine wave pattern after both a first and second difference was taken which suggests a model that could have at least an ARI(2,1) structure. An examination of the partial autocor- relation coefficients after both a first and a second differencing revealed that they both declined in a smooth exponential manner. As such we have a moving average component in this time series with the total model taking the form of either ARIMA(2,2,1) or ARIMA(2,1, 1). For the sake of parsimony the second model structure was used. For channel 2, also a DASD channel, an examination of the autocorrelation coeffi- cients revealed that regardless of the degree of differencing they did not follow any type of set pattern. As such it was assumed that this time series did not have an autoregres- sive component to it. An examination of the partial autocorrelation coefficients revealed that after a second differncing the coeffi- cients declined in an exponential manner. As such this time series was assumed to take the structure of a first order moving average model with two degrees of differencing or simply IMA(2,1). All of the other DASD and mass storage channels produced similar auto- correlation and partial autocorrelation coefficients and as such were all classified as IMA(2,1) type models. For channel 5, a tape channel, the par- tial autocorrelation coefficients exhibited no discernable pattern thus eliminating the moving average component of the model from consideration. The autocorrelation coeffi- cients of the original time series exhibited a damped sine wave pattern which made this time series a second order autoregressive time series with no degrees of differencing or AR(2). Channel B, the other tape channel, also took on a AR(2) model structure. A summary of the different model struc- tures can be found in Table III. It is interesting to note the commonality of model structure taken on by the different system components. All of those components that had anything to do with the central processing unit took on a common AR(2,1) structure while all of the DASD and MSS channels took on an IMA(2,1) structure with the exception of channel 1. This could be due to the fact that channel 1 has the system residence packs on it which would conform to very different performance characteristics when compared to the average DASD channel. Both tape channels took an AR(2) model structure. 4.2 Time Series Estimation Having identified the structured form of the time series to be forecasted, the next step in this process is to develop the dif- ference equations for each of the four dif- ferent types of time series that were identi- fied in the previous step. Once the appropri- ate difference equation has been determined for each of the time series in question, the parameters of each difference equation can be estimated. The four difference equations that were used for forecasting the input values for the X vector of our general model are derived from the functional structural forms that have already been identified. These four equations are: For the AR(2) model: Z = 1 Z t . 1 +0 2 Z 2 . t +A t For the ARI(2,1) model: Z t =( 1 ^ ) Z t _ 1 - 0] Z t _ 2+ ( 1 + 2 ) Z t . 2 -0 2 Z t _ 3+ A t (25) (26) 225 TABLE III BOX-JENKINS TIME SERIES SUMMARY TIME SERIES NATURE OF THE SERIES MODEL STRUCTURE 1. CPU Busy 2. Problem Program State 3. Idle-Stop 4. Channel 1 Channels 2-4 Channels 7 - A Channel 5 Channel B Second Order Autoregressive Integrated with 1 degree of differencing ARI(2,1) ARIMA(2,1,1) Second Order Autoregressive Integrated First Order Moving Average with 1 degree of differencing First Order Integrated Moving IMA(2,1) Average with 2 degrees of differencing Second Order Autoregressive with degrees of differencing AR(2) TABLE IV Equation Model Form 1 - CPU Busy ARI(2,1) 2 - Problem Program State ARI(2,1) 3 - Idle-Stop ARI(2,1) 4 - Channel 1 ARIMA(2,1,1) 5 - Channel 2 IMA(2,1) 6 - Channel 3 IMA(2,1) 7 - Channel 4 IMA(2,1) 8 - Channel 5 AR(2) 9 - Channel 7 IMA(2,1) 10 - Channel 8 IMA(2,1) 11 - Channel 9 IMA(2,1) 12 - Channel A IMA(2,1) 13 - Channel B AR(2) Parameter Estimates 41926 -.39292 NA 45068 -.44883 NA 45091 -.45946 NA 50073 -.29656 -.29915 NA NA 1.05 NA NA 1.05 NA NA 1.05 99999 7.45E-9 NA NA NA 1.05 NA NA 1.05 NA NA 1.05 NA NA 1.05 99999 6.49E-7 NA NA = Not Applicable 226 For the IMA(2,1) model Z t =2Z t-l" Z t-2 +A t" 9 l A t-l For the ARIMA(2,1 ,1 ) model (27) (28) Z t =tl + 1 )Z t _ r 1 Z t _ 2+ (H0 2 )Z t _ 2 -0 2 Z t _3 + A t -O 1 A t . 1 Given these difference equation forms the next step in the model building process is to estimate the and 9 parameters of the models 1n question. This part of the model building process was accomplished through a two step approach that used a Nelder-Mead function minimization process to develop initial estimates and then a Marquardt non- linear least squares algorithm to develop the final estimates. For the IMA(2,1) models that had only one parameter, the Nelder-Mead process produced estimates that were more than adequate for our purposes. A summary of the parameter estimates can be found in Table IV. 4.3 Time Series Forecast Having both identified the proper model form for each of the thirteen system vectors the last step in the time series analysis portion of the model building process is to develop a point forecast for each system vector. The point in time that was selected as the focus of this forecast was one year from the point in time when the empirical data came to an end or roughly speaking January 1981 . In making this type of time series fore- cast the most important step is selecting an appropriate starting point or "seed" value from which to begin the analysis. If the time series 1n question are stationary and have a fixed mean then the mean value of the time series is the appropriate starting value. Unfortunately for our purpose none of the time series that we are manipulating are stationary and as such the appropriate starting values must be derived. For our purposes these starting values were derived using a simple two variable least squares linear regression. The output from these regressions are available from the author. In each of the regression plots the Y axis represent the dependent value of one of the thirteen system components that are under consideration while in every case the X axis represents time. The dotted line on both sides of the regression line represents a 95% confidence interval. The extremely low value of the R statistics for all but one of the regressions is an indication as to why the method was used only to establish a starting point for our forecast and not to make the full forecast itself. Using this method the starting values that were used for the time series analysis for each of the system components is as follows: Equation Description 1 2 3 4 5 6 7 8 9 10 11 12 13 CPU Busy Problem Program State Idle-Stop Channel Channel Channel Channel Channel Channel Channel Channel Channel Channel Busy Busy Busy Busy Busy Busy 8 Busy 9 Busy A Busy B Busy DASD DASD MSS DASD Tape DASD DASD MSS DASD Tape Starting Value 48.33 23.80 27.72 14.33 15.36 9.99 12.99 30.21 13.59 9.11 10.64 12.57 36.39 The final step of the forecasting process for the time series analysis portion of the model is to place the starting values for each of the system components into their appropriate equations and then to plot out a forecast for the values in question for an entire year. The forecasted values that were used derived from this process and used in the final simultaneous equation solution pro- cess are as follows: Starting Equation Description Value 1 CPU Busy 74.23 2 Problem Program State 36.14 3 Idle-Stop 33.63 4 Channel 1 Busy ■ - DASD 21.73 5 Channel 2 Busy ■ - DASD 26.22 6 Channel 3 Busy ■ ■ MSS 13.61 7 Channel 4 Busy - - DASD 16.61 8 Channel 5 Busy • - Tape 30.10 9 Channel 7 Busy • - DASD 16.48 10 Channel 8 Busy • - DASD 13.82 11 Channel 9 Busy • - MSS 17.88 12 Channel A Busy • - DASD 16.19 13 Channel B Busy • - Tape 36.25 4.4 Simultaneous Equations The final step of this entire process was to bring together both the Box-Jenkins estimates and the linear regression estimates into a system of simultaneous equations. The equation that was used to accomplish this step in matrix form was: Y = B X 227 The level of utilization for the differ- ent system components that was derived from the solution of this equation were as follows: Component "^Utilization 1) CPU Busy 76.15 2) Problem Program State 145.69 3) Idle-Stop 81.91 4) Channel 1 21.74 5) Channel 2 59.29 6) Channel 3 26.08 7) Channel 4 33.26 8) Channel 5 36.63 9) Channel 7 31.06 10) Channel 8 50.02 11) Channel 9 26.64 12) Channel A 47.53 13) Channel B 14.41 While the Channel and CPU busy forecasted estimates have the appearance of reasonable- ness both Problem Program State and Idle-Stop do not. This anomaly could be due to any number of factors and more will be said about it in the concluding remarks. 5.0 Concluding Observations In a very literal sense the output of this model represents point estimates of component resource utilization for Mountain Bell's Colorado/Wyoming processing center's 3033 JES3 global processor for January 1981 based on the assumption that no major hard- ware or software changes are made before that data. Should such changes be made, however, the real value of the model will come into play as such changes can be reflected in changes in the value of the X. vector. These new X vector values can then be used to once again solve the system of simultaneous equa- tions to ascertain the impact of the change in one of the system components on the entire system. As such the value of the model is three fold. First, the model does have the capability of forecasting future computer resource utilization. Secondly, the model also has the capability of pointing out current as well as potential system bottle- necks. Finally, the model has the ability to answer 'what if questions about proposed system changes. In using this model for capacity planning at Mountain Bell there are a number of direc- tions that will be taken in the future. The first such step must be to refine the current model . The case study that has been presented in this paper illustrates the potential pit- falls that are involved in trying to develop a dynamic system model. The results of the simultaneous equation system points to two areas in particular, problem program state and idle-stop, where additional work needs to be done to refine the technical coeffi- cients that were developed in the multiple regression portion of the model building process. The final results of these two system components should not be viewed as proof of the inappropriateness or weakness of the methodology but rather as opportunities where further work should produce more useable results. The inappropriateness of these values indicates a need to reexamine the variables that were used to forecast these values in the model . Another task that needs to be addressed with respect to the future use of this metho- dology at Mountain Bell is the task of model verification. Before this model can be used with confidence for the purposes that it was designed, it must be tested using historical data to ascertain its useability and its accuracy. Once the model variables have been stabilized, and the model results verified, the last task that must be accomplished before this methodology can be broadly applied is that the software that was used to develop these results must be rationalized, consolidated and simplified so as to make it possible for the average technician, who is unfamiliar with the statistical techniques involved, to produce a forecast. 5.1 Acknowledgements It would have been impossible for me to have completed this paper without the generous contribution of time and effort by all of the members of the Hardware Forecasting and Per- formance Evaluation Group at Mountain Bell. I would like to give a very special thanks to Ms. Esther Martin for her secretarial skills and unlimited patience. The errors that remain in this document despite these indivi- dual 's efforts remain my sole responsibility. References [1] Bhattacharya, Gouri and Johnson, Richard, Statistical Concepts and Methods , John Wiley & Sons, Inc. New York, c. 1976, p. 396. [2] Johnston, J., Econometric Methods , McGraw- Hill Book Company, c. 1972, pp. 246-249 [3] Box, George, and Jenkins, Gwelyn, Time Series Analysis , Holden-Day, c. 1970, pp. 88-89. [4] Box and Jenkins, pp. 173-177 [5] Box and Jenkins, pp. 64-65. 228 £6] Box and Jenkins, pp. 169-176 17] Nelder, J. A., and Mead, R.A., "A Simplex Method for Function Minimization", The Compten Journal, 7, 1965, pp. 308-313. [8] Marquardt, D.W., "An Algorithm for Least Squares Estimation of Non-linear Parameters", Journal of the Society for Industrial Applied Mathematics, June 1963, pp. 431-441. [9] Box and Jenkins, pp. 129-132. [10] Gajarati , Damodar, Basic Econometrics , Basic Books, New York, c. 1977, pp. 360-363. [11] Johnston, pp. 346-350. 229 Measuring System Performance 231 A Performance Evaluation Study of UNIX Luis Felipe Cabrera Computer Science Division Department of Electrical Engineering and Computer Sciences and the Electronics Research Laboratory University of California, Berkeley CA 94720. Different performance aspects of computer systems on which the time- sharing operating system UNIX runs are presented. A comparison of the performance of three installations is made and the method discussed. The effects of distinct upgrading changes made in the systems, namely, the addition of a cache memory, of a disk drive and of main memory are also reported. Key words: changes. Comparison; performance of computer systems; UNIX; upgrading 1, Introduction Since its first release by Bell Laborato- ries in 1974, the timesharing operating system UNIX [ l] has run on a wide variety of main- frames. Primary among them is the PDP-11 family manufactured by Digital Equipment Corporation. Moreover, UNIX also runs on Digital's recently released VAX 11/780, even though its hardware presents drastic differ- ences with the PDP-11 family, and has also been ported to the INTERDATA 7/32 and to the UNIVAC 1100 series machines. It has then become of interest to measure the effect that the underlying hardware has on the performance of the various computer sys- tems on which UNIX runs. In what follows a study of three computer systems is presented. Each was deliberately chosen to be substan- tially different from the other two. UNIX is a trademark of Bell Laboratories 2 This research was supported in part by the National Science Foundation under grants MCS78-24618 and MCS78-07291, by a research grant of the Pontificia Universidad Catolica de Chile, and by the Italian National Research Counsil under grant CNR 78.02548.07 When we compared systems, we tried to assess the effect on performance of leaps in technology and design. On the other hand, when we analyzed an upgrading change, we observed the effect this new hardware com- ponent had on the performance of the given system. Moreover, the upgrading changes made to each of the systems provide good examples of what may be achieved when such a change is implemented in an installation. All the systems studied operate in the Electrical Engineering and Computer Sciences Department of the University of California at Berkeley. Measurements were taken over a span of five months (April through August 1979) beginning by the third week of the Spring Quarter of 1979. The rest of the paper is divided as follows. Section 2 presents the basic design decisions about the experiment as well as the measurement techniques and the reduction of the data. In section 3 we present and analyze diagrams which compare the performan- ce of the systems. Section 4 shows the re- sults of the upgrading changes and section 5 contains our conclusions. 233 2. Measuring UNIX Performance 2.1 Preliminaries When comparing the performance of differ- ent installations, many problems are resolved if the same operating system runs in all of them. This is because the identical user in- terface makes it posible to define a high- level language benchmark and use it unmodified in each of the systems. When this is the case, functional equiv- alence of the benchmark is immediately ob- tained [4]. Whether this equivalence corre- sponds or not to a resource consumption equiv- alence will depend on the actual implementation of the operating system. Determining whether or not it yields a responce time equivalence is the main purpose of our comparison study. Choosing response time as the main ob- served performance index of our study is jus- tified by our belief that in any timesharing system, from the user's point of view, what counts most is the responsiveness of the sys- tem to user supplied tasks. Nevertheless, using standard UNIX instrumentation, we have also monitored system time (the cpu time spent in the operating system) and user time (the cpu time spent executing the command) for each of our tasks. This has proven very helpful in analyzing possible causes for effects noticed while studying the performance of each system. Characterizing the work load is a central problem in any benchmarking experiment where a comparison between systems is made. The question about the conditions under which the experiment should be performed has been given a number of different answers. Perhaps the best way to answer it is to design an ex- periment where a benchmark is executed in each system on a stand alone basis and where com- plete control of the work load is achieved by loading the system with some kind of internal or external driver. The main advantages of this method are total reproducibility and absolute control of all the activities in the system. The main difficulties it presents are related with the availability of tools to drive the system and with the design of the work load that those tools will implement . The (artificial) work load under which the system is to be studied must be such that results obtained from its usage should yield information about the system's performance under its natural work load. Lack of such tools at the time when we had to make this, decision led us to the less sophisticated approach of monitoring the sys- tems periodically under their natural work load with the aid of a shell script 3 . Then, in order to analyze our results we had to find satisfactory characterizations of load with respect to which the performance indices corresponding to a given task would be meas- ured. We decided to use single variable char- acterizations of load for this purpose. We have characterized work load by taking two different yet related viewpoints. One is an outside viewpoint (how much work is being done on the system) and the other is an inside viewpoint (how much work is being demanded from the system) . For comparisons as well as for analyzing upgrading changes, the first viewpoint yields two rather coarse but very natural character- izations of load: the number of users (logged in while our benchmark begins to run) and the number of active users (those which are executing some task when our benchmark begins to run) . For comparing installations we found that the most interesting character- ization based on the second viewpoint was that based on the number of real processes. This is the number of processes which are likely to run while one of the tasks in our benchmark will be running. Its precise definition is the following: a process in the process table is a real process if it is not a login shell or a. sleeping shell 1 *. It is clearly less coarse than the above two char- acterizations . The three systems we monitored had the C shell [ 2] running in addition to the ordi- nary shell. To make the experiment portable in our environment, we wrote a script for the C shell which was run on the background (with the same priority as any user process) in each of the systems. Its text is repro- duced in [ 8] together with the text of all programs and files used by it. Documentation 3 In UNIX the shell is a command language interpreter through which the user inputs its tasks to the system [ l] . Moreover it is also a programming language [2] thus shell programs {shell scripts) can be written with it. ''A login shell is a process waiting for input from a terminal in which no user is logged in. In UNIX, whenever a user inputs a command, the shell forks spawning a child shell which executes the command. While the child shell executes the command, and unless the command has been issued followed by an &, the parent shell sleeps till the completion of the command. We call sleeping shells this last type of process. 234 regarding the usage of the script and about changes needed to convert the script into one for the standard UNIX shell can be found in [8]. 2.2 The Script Driver Our strategy for monitoring each system's responsiveness was to run periodically a set of predefined benchmarcks. This was achieved by using a shell script which contained these tasks together with commands which gathered statistics about the work load and the time it took the tasks to complete. The script was run every other week during the normal operations of the systems. The script, after cycling through its instructions once, would go to sleep for twenty minutes and then wake up to initiate the cycle again. This data gathering method can be cate- gorized as a time-sampling tool [3] and in fact is eery similar to Karush's [7] terminal probe method. By using it we measure the work load of the system as well as the depend- ency of our performance indices on the under- lying equipment. Thus we are actually eva- luating the installation. Although running our script affects the load of the system, and thus its responsive- ness, it was felt that this was irrelevant for a comparison study because all systems were going to be presented with the same script. In fact the main purpose of the comparison experiment is precisely to observe how each system reacts to this stimulus. Our commitment to use standard UNIX fea- tures, for portability reasons as well as for assuring the functional equivalence of the benchmark, made us decide upon the usage of the time command [5] as the measurement tool for our tasks. The time command returns, upon completion of the command with which it is called, three measurements: response time (the elapsed time during the command), system time and user time. These last two are accu- rate to one tenth of a second while the first is accurate to one second. The time command truncates, does not round off. This low resolution of time together with our desire that no individual measurement be off by more than 10% led us to consider tasks which would never take less than five seconds to complete. On the other hand we could not come up with a set of tasks that would over- load the system every time they were run, if we were to run them periodically for an ex- tended period of time. Our script was desig- ned as a compromise between these requir- ements. Four tasks were timed at each run: a C compilation, the execution of a CPU-bound job, the retrieval of the manual page of the on-line copy of the UNIX Programmer's manual, and a mix, which included the above three plus some I/O bound tasks and two system commands; the mix was called "Script", even though it did not correspond to the actual script we were running. 2.3 The Tasks Chosen Our desire was to measure a set of tasks that would be representative of the user's tasks and that would stress distinct aspects of a configuration. The lack of a Pascal compiler or interpreter for the VAX at the time made us use the language C [6J. Two of our tasks were then the compilation of a short CPU-bound program and its execution. Our CPU-bound program executes its inner sequence of instructions 100,000 times. In all performs 1,200,000 integer arithmetic operations plus those needed for the for loops. It was decided against using floating point arithmetic because one of the measured systems, the 11/40, does it in software. This precluded any meaningful hardware-ori- ented comparison. As for our third task, rather than having a strictly I/O bound task we felt it would be more interesting to choose a task which would involve more features than just the speed of the disks and the efficiency of the I/O subsystem. Thus, the command man man, which retrieves the entry for the manual page out of the on-line copy of the UNIX Program- er's manual, was chosen, because, as the on- line copy is kept in compact form on disk to save space, this copy is retrieved using the formatting program nroff, a utility program widely used in text processing. To avoid problems when running the script, the output of man man was sent to /dev/null instead of sending it to a real terminal. This has the effect of discarding the already formatted text of the retrieved page. The last task timed is what we called Script. For technical reasons we used the date command to determine total elapsed time. The only drawback is that it does not yield system time nor user time for this task. Script, besides including the above three tasks, also includes six short l/0-bound tasks, a couple of date commands and two commands which gather extensive data about the system. Script is thus a fairly bal- anced task and can be considered as an in- dicator of throughput. 235 2.4 The Systems Measured Our main motivation when choosing which systems to monitor was diversity. Our choices were the following: (1) A PDP 11/40 with 200K bytes of main memory, one DIVA disk controller with three DIVA disk drives which have 50M byte disk packs. This system has 23 ports and no floating point arith- metic unit. (2) A PDP 11/70 with 1.3M bytes of main memory, a 2K byte cache memory, one DIVA disk controller with four DIVA disk drives which have 50M byte disk packs and an RS04 fixed head disk used as swapping device. This system has 81 ports. (3) A VAX 11/780 with 512K bytes of main memory, an 8K byte cache memory, one Digital disk controller with one RP06 disk drive with 177M byte disk packs. This system had 16 ports. Throughout the rest of this paper we shall refer to these systems as the 11/40, the 11/70 and the VAX respectively. The configurations described above are those which each system had during most of our data gathering period. The work loads of the three systems are as diverse from each other as their under- lying equipment. The 11/40 is mostly used for administrative matters. The 11/70 is mostly used by undergraduate students in coursework-related activities. The VAX is primarily used by advanced students in re- search related tasks. Given this diversity of the natural work loads, the problem of work load characteriza- tion for the comparison of the responsiveness of the different systems becomes very impor- tant as well as more difficult. Our char- acterizations described in 2.1 proved to be reasonably adequate. Of course, the analysis of our results was inevitably influenced by considerations about the work load of the individual systems. defined as the square root of the ratio between the variance and the cardinality of the sample. For response time we found the standard error to be consistently large and thus ploted 90th-percentile curves. All the diagrams mentioned above can be found in [8]. To prevent outliers from appearing in our curves we chose the minimun size of sam- ples by the iterative method of plotting the curves with samples of increasing size until outliers would not appear. Whenever we did not have enough points in a sample we clus- tered samples corresponding to consecutive values of the variable. From this larger sample the 90th-percentile was chosen. The x-coordinate used for such a point was the average of the x-coordinates corresponding to the clustered samples. We used samples of size sixty. We believe this approach to go a long way in solving the conflicting problems of insufficient number of data-points, presence of outliers in the samples, large variance of some samples and finite amount of resources. Table 1 displays the amount of data points gathered. Only for the 11/70 we felt had an insufficient amount and this was due to the large number of ports this installa- tion has. Taking our experience as a model, one should try to obtain fifty measurements per value of the variable characterizing load to avoid clustering samples. Table 1. Number of Sample Points per System. SYSTEM Number of Sample Points 11/40 11/70 VAX 536 1007 1368 3. Comparison of the Installations 2.5 Reducing the Data The data points provided periodically by our script were gathered in appropiate files and latter processed using time stamps. We were thus able to plot mean user-time, mean system-time and mean response-time versus Users, Active Users, Processes and Real Pro- cesses for each task in each system together with the standard error of the corresponding sample. The standard error of a sample is Rating different installations with our method presents several difficulties. They all emanate from the diversity of the work loads and the lack of satisfactory charac- terizations for them. Those characterizations based on an ex- ternal view of the system fail to be accurate because they are dependent on too many fac- tors. Clearly, a student doing research submits jobs which are of a different caliber 236 from those submitted by a user editing a file. Thus we believe that our comparison curves based on this view penalize the VAX because of the quality of its users (see 2.4). Like- wise, our characterizations of load based on an internal viewpoint fails to distinguish between types of jobs. For comparing response time, we found that our characterization of load in terms of real processes provided us the best overall view as to how the systems performed relative to each other. Moreover, it also seems to be the best single indicator of the true level of activity of each system. Finally, the disparity of configurations is also an important degrading factor for the VAX. This system was supporting all of its I/O activity, including swapping, in one arm. In contrast the 11/70, which swaps on drum, has several other spindles to handle the rest of the I/O. We have chosen to present here only two of the four tasks monitored because they are representative of the behavior observed. The other results may be found in [8]. 3.1 CPU-bound Job As mentioned in section 2.3 this task was the execution of the code generated by a very simple C program which had two nested for loops. Given its small size (the C code has 27 lines, including comments), the pro- bability of being swapped out from core while on the system is minimal. Thus, assuming same scheduling in all three systems, its processing time only depends on the lengths of the CPU and memory cycles, the effect of the load in terms of I/O interrupts and the size of the ready queue. In figure 1 we see that the VAX ranks uniformly first while the 11/40 ranks third. This same ranking was observed under our ex- ternal characterizations of load, but there the slopes of the curves of the 11/70 and the VAX were almost identical. This is the task where difference between the three systems is most visible. We could observe the amount of user time the CPU-bound job took in each system. At the five user level the 11/40 took on the average 21.0 seconds, the 11/70 8.5 seconds and the VAX 5.66 seconds. At the fourteen user level the 11/70 took 8.61 seconds while the VAX took 5.81 seconds. We feel that user time for this task is a very good indicator of the speed of computation of the machines. The VAX does show to be a substantially faster 100 in M PERCENTILE CU«»ES i 1W40 — 11/70 i vn> 10 20 30 HO SO 60 NUMBER 8F REAL PROCESSES Figure 1. Response Time of CPU-bound job versus Real Processes. machine than the other two. Given that no 1/0 is done, we expected to observe the fairly linear behavior we obtained. This was true for all of our char- acterizations of load. Response time for the 11/40 consistently showed to have a larger rate of growth than for the other two systems. 3.2 The Script Figure 4 contrasts with figures 2 and 3 in that it shows the VAX no longer being slower than the 11/70 throughout the whole range of load. Script includes several tasks which make extensive use of the I/O subsystem, and so the inferior VAX 1/0 subsystem must be a heavy degrading factor. Figure 4 also suggests that with a better configuration the VAX might outperform the 11/70 at all loads. One would clearly need to add more main memory to increase the threshold of load that saturates the system and also disk drives to increase parallelism in I/O service. The results observed for this task are similar to those obtained for the C compila- tion and the command man man. The only difference with the pattern observed here is that in the latter task, when ranked using our external characterizations of load, the VAX ranks third, while when using Real Pro- cesses it ranks second. A possible reason 237 U3? (A •J 350H renccnTiLC cuniej — II /MO — 11/70 »=• vui X 115- U2 NUMBER OF USERS 40 50 SO NUMBER OF REAL PROCESSES Figure 2. Response time of Script versus Number of Users. Figure 4. Response time of Script versus Number of Real Processes, Figure 3. 8 12 18 24 30 36 NUMBER OF ACTIVE USERS Response time of Script versus Number of Active Users. for this was given at the beginning of sec- tion 3. 4. Effects of Upgrading Changes All diagrams presented in this section are 90th-percentile curves, and the size of samples has been chosen, as before, so as to exclude outliers. We have analyzed these changes using the characterization of load which is best understood: the number of (logged in) users. 4.1 Adding a Cache Memory to the 11/40 Perhaps the most spectacular use of a histogram of all measurements taken on a given system for a given task is that of witness to a basic configuration change. The phenomenon of multiple peaks in performance indices for a given task such as user and system time is almost certainly associated with a basic configuration change. In effect, the addition of the cache memory to the 11/40 was responsible for shifting to the left the peaks in each task's histogram. Most remarkable is the case of the CPU- bound job, where the peak in user time went down from 23.0 seconds to 19.6 seconds, the mean user time decreased from 23.13 seconds to 19.71 seconds, and the standard deviation of the user time sample decreased from 0.25 238 to 0.20. The system time also improved dra- matically for this task: its mean decreased from 0.68 seconds to 0.33 seconds, and its standard deviation decreased from 0.88 to 0.36; figure 5 depicts the shifting of peaks in the case of user time for the CPU-bound job. inf- er) 1 1 1 1 — I 1 g 120- 1 •_ 1 »— 1 a £«>- - UJ V) a • 72- - Ik k o A Em- an 1 1 1 X 1 1 1 3 \ i i *2<4- K. i\. 1 i i i i i 19.1 20.0 20.9 21.6 22.7 23.0 24.: USER TIME CSECU Figure 5. Histogram of User Time for the CPU-bound Task in the 11/40. Figure 6 shows the remarkable effect of the change on the CPU-bound task, while in figure 7 we may appreciate the very positive effect of the change on the Script task. The variation observed in figure 7 is representa- tive of what happened with each of the other two tasks. The curves were obtained using 239 measurements without cache memory and 303 measurements with the cache memory. A minimun sample size of 60 was used for this plot. Because of figure 7 it can certainly be stated that both responsiveness and through- put are enhanced when a cache memory is added. It is only a matter of cost-benefit analysis to determine its size and type. 4.2 Adding Main Memory to the 11/70 It is unfortunate that we only had 175 measurements before this change was made and that they were obtained at the beginning of our data gathering period. In our university environment, the systems are not "pushed" by the users as hard as they are latter in the quarter, when the users have become more sophisticated. Thus, for this change, our curves actually show the effect of two sources of variation: addition of 1M byte of main memory and increase of the load. The 830 measurements we had after the new memory was installed would have allowed us to apply statistical techniques, such as the analysis of variance, that would have provided us with the relative weighs of the two sources of variation. However, this kind of analysis was precluded by the minimal amount of samples we could gather before the addition of memory. Thus figure 8 must be interpreted care- fully. In it the 0.3M byte curve may be too nS3 u UJ to l_l 49 8«H £ HI o 0. m UJ ec 37 H 33 - 29 - 25- T M PERCENTILE CtMVCS — — M1TH0UT CRCME RCHSnr — WITH CAOC HEHMT 1 r 4 5 NUMBER OF USEBS Figure 6. Response Time of CPU-bound job versus Users, 239 24 2- bJ in 229- 20*" 189- IU7 129- 109- 90 71 tQ PCMMTILE — MITH0UT wmmmm WITH CACHE « s NUMBER OF USERS Figure 7. Response Time of Script versus Number of Users in the 11/40 optimistic, given that the users were not performing tasks as resource consumming as those in the 1.3M byte curve. Nevertheless, the response time is much lower in the 1.3M byte configuration. It is also seen that the saturation point of the system occurs at a lower level of load with a smaller amount of main memory. The addition of main memory certainly increases the amount of multiprogramming the system handles before saturation, and this has a very positive effect on the response time of trivial tasks. Unfortunately, this effect is not measured by our script and so we can not report on it directly, although it influences the response time of the Script task. nSSO- u in " U9H 388- 334" 280- 229- 172- 11( 6U BO rCftCENTUt CURVES — — 0.3* ITTCS at MIM MCNOTT — - 1.3* STTE5 Of (Mm NEHOftT 1 1 1 1 1 1 10 13 16 19 22 25 2B 31 34 37 40 NUMBER OF USERS Figure 8. Response Time of Script versus Number of Users in the 11/70 240 4.3 Adding a Second Disk Drive to the VAX the responsiveness on the system, as seen in figure 10, but for one type of task it proved The addition of a CDC 9762 disk drive with to be disadvantageous, a CDC 9400 controller and 80M byte disk packs to the VAX configuration was a mixed blessing. CPU-bound jobs took longer to finish with It proved to yield an overall improvement in the two disk configuration (see figure 9). r.3° u UJ l_l 27 - jgM' UJ •£ 21 O a. in UJ (C 16 - 15 - 12 - 1 90 PERCENTILE CURVES — — ONE DISK WUt a— TW 01SK MI«ES i~ — i T 1 2 3 T ~i r S 6 i 1 r 7 8 9 "I 1 1 1 10 11 12 • 13 IV NUMBER OF USERS Figure 9. Response Time of CPU-bound Job versus Number of Users in the VAX r-,26* 5> '6^ 18 sectors, two disk revolutions are missed; for B >^ 16 sectors, only one disk revolution is missed per I/O call. These results are interesting but not characteristic of most I/O routines on the Cray-1 and CDC 7600. The most fre- quently encountered sets of coefficients were M = and M = 1 (one disk revolution missed per I/O call) and S = 18 for the Cray-1 or S = 20 for the CDC 7600 (maximum number of sectors per disk revolution) . 248 3- P 2- 6 ^..,,--"" l- ....•'•' y «of^-^^ 0- . — ■ i 05 1/BUFFER SIZE FIGURE 1. Disk revolutions per sector for BUFFER IN /OUT as a func- tion of the reciprocal of the buffer size. An example of results obtained on the CDC 7600 is shown in Fig. 2. Routines per- forming random-access I/O transmission were used for this test. The solid line repre- sents data for sequential access and corresponds to one disk revolution missed per I/O call and a transfer of 20 sectors per disk revolution. l/BOFFER SIZE FIGURE 2. Revolutions per sector for transfer by WDISK/RDISK (7600) as a function of the reciprocal of the buffer size. Using a random number generator to determine the disk addresses at which to start transmissions , the test was performed a second time using the same routines. Data for these results are also fitted by a straight (dotted) line, with the same zero intercept, corresponding to 20 sectors transferred per disk revolution. Due to more frequent seek operations, 1.4 disk revolutions are missed per I/O call. To interpret results obtained for over- head CPU times, it is reasonable to assume that the overhead CPU time per sector T (B,N) consists of two contributions, the On CPU time T required to actually trans- trans fer the data, and 1/B times the CPU time T ., , needed to initiate and complete the call system call for I/O: T._(B,N) - T + 1/B*T .. OH trans call (8) Experimental results indicate that the assumptions leading to Eq. 8 are valid in most cases. Plots of T n (B,N) versus 1/B are straight lines with a zero intercept of T and a slope of T ... trans call Figure 3 shows data obtained for the Cray-1 using random-access 1/0 routine RDISK. The lower curve represents data obtained using a scalar compute-and-test loop, with T ,, = 660 + 10 ys and T = 6 + 2 ys. call — trans — The tests were also run using a vectorizable compute-and-test loop. Because of memory lockouts during the vector calculations, the CPU overhead times are slightly higher, as observed in the upper curve. Data obtained from CDC 7600 tests is shown in Fig. 4. Both sequential and random tests were performed as previously described. The zero intercept is the same for both tests; the overhead CPU times per sector are slightly larger for the random test due to more frequent seeking. 249 1/BUFFER SIZE FIGURE 3. Overhead CPU times per sector for RDISK (CFTLIB) as a func- tion of the reciprocal of the buffer size. l/BUFFER SIZE FIGURE 4. Overhead CPU times per sector for WDISK/RDISK (7600) as a function of the reciprocal of the buffer size. 4. Notes on Results Measurements made on both the Cray-1 and the CDC 7600 employed a wide variety of operating systems, libraries, and I/O routines. Both sequential and random writing/reading, buffered and nonbuffered I/O, Fortran versus assembly language (CAL) , and overlapped/non-overlapped tests were run and analyzed . Tests performed on the Cray-1 used pro- gram buffer sizes of 1, 2, 3, 9, 18, 36, 90, and 180 sectors to ensure that any peculiar- ities occurring at track and cylinder boundaries would be observed. In almost all cases, overhead CPU times can be represented by Eq. 8. The random access routines IZDKIN/IZDKOUT, RDABS/WRABS, and RDISK/WDISK are all very efficient with very small overhead CPU times. In some cases, the implementation of a very short compute-and-test loop (< 30 ys) ensured that no disk revolutions were missed, even at buffer size B = 1. Buffered operations were measured using BUFFER IN/BUFFER OUT and BINARY READ/WRITE statements. Some of the tests were run with system buffer sizes of 1/4, 1/2, 1, and 2 times that of the program buffer size to determine the effect on rates . It was ap- parent that in all cases where this approach was taken, the resultant rates were a direct function of the system buffer size. Program buffer sizes for tests run on the CDC 7600 were set at 1, 2, 3, 10, 20, 40, 60, 100, and 200 sectors. The results of these tests were generally analogous to those seen on the Cray-1. Overhead CPU times were somewhat larger, with the smallest exhibited by RDISK/WDISK routines with T nn = 1050 + 50 ys and T = 46 + 3 ys. call — trans — Again, on random access routines, transfer rates are consistent with a transfer of 20 sectors per disk revolution and one disk revolution missed per I/O call. Tests that generated random-disk addresses by a random- number generator showed an increase to 1.4 in the number of disk revolutions missed per I/O system call , due to more frequent and longer seek operations. For BUFFER IN/BUFFER OUT tests, as with the Cray-1, the rates were solely dependent on the size of the system buffer used in the 250 transfer. For read requests, two revolu- tions were missed per processing of a system buffer, except for the minimum system buffer size of B = 2 sectors, for which four disk revolutions were missed. The situation for write operations was more pronounced: four disk revolutions are missed per system buffer processing for B > 4. For B = 2, it was found that nine disk revolutions are missed per call. This is caused by excessively high overhead times associated with this buffer size. The raw data from these tests can be obtained from the Computer Science and Services Division's Research and Applications Group at the Los Alamos Scientific Laboratory. 5 . Summary Characteristics of all routines tested are summarized in Table 1 . The following general observations can be made. With the. exception of some BUFFER IN/ BUFFER OUT routines, the maximum number of sectors is transferred per disk revolution: 18 for the Cray-1, 20 for the CDC 7600. For the Cray-1, maximum transfer rates of 503 kword/s for both overlapped and non- overlapped reads and writes were achieved on the COS operating system. The overhead CPU time for an I/O system call T ni = 480 ys call is smallest among the operating systems tested; the CPU time required for the trans- fer of one sector T = 42 us is slightly trans higher than that observed on the CTSS oper- ating system. The CTSS operating system has library routines that achieve maximum transfer rates for both overlapped and non-overlapped writes. The reads are more sensitive to proper timing of I/O requests than the writes. Non-overlapped reads always miss one disk revolution per call. For over- lapped I/O, using the most efficient routines available, maximum transfer rates can be achieved only by testing I/O completion at < 100-us intervals. Observed overhead CPU times are T 1n = 650 ys for the 1/0 system call call and T = 7 ys for a one-sector trans transfer. On a single channel, these are short enough not to degrade I/O performance; for multichannel tests and small buffer sizes, some rate degradation will occur. At least one disk revolution is missed per I/O call for all routines on the ALAMOS operating system. This is attributable to the high overhead CPU time of about 4000 ys for the I/O call. This is considerably higher than the 926 ys required for one sector to pass under the read/write head. CDC 7600/LTSS operating system routines also miss at least one disk revolution per I/O call. The minimum overhead CPU times for initiating an I/O call are T in = 1050 ys for RDISK/WDISK routines, call just slightly longer than the 833 ys required for one sector to pass under the read/write head . The minimum measured transfer CPU time per sector is T =46 ys. trans It is apparent that maximum I/O rates can be achieved only if one chooses I/O routines carefully and performs frequent enough testing for I/O completion on small buffer sizes. 251 Table 1 Performance Characteristics of Cray-1 and CDC 7600 I/O Routines Machine Operating System Routine Library Sectors per disk revolution Disk revolutions missed per I/O call T call (Us) trans (ps/sector) CRAY-1 CTSS IZDKIN IZDKOUT BASELIB 18 18 or l a) 630 630 6 10 RDISK WDISK CFTLIB 18 18 or l a) 660 660 6 6 RDABS WRABS FORTLIB 18 18 or l b > 650 650 7 9 BUFFER IN BUFFER OUT CFTLIB 18 18 1 or 2 3 1580 14000 50 150 BUFFER IN BUFFER OUT FORTLIB 18 18 1 or 1 — — READ WRITE CFTLIB or 1 730 730 10 10 READ WRITE FORTLIB 18 18 > 1 2 ... _._ ALAMOS System calls (read) System calls (write) 18 18 l c > l c > 4050 c) 4050 c) 155 C) 155 c) BUFFER IN BUFFER OUT 9 9 2 1 — ... COS System calls (read) System calls (write) 18 18 480 480 42 42 CDC 7600 LTSS IZDKIN IZDKOUT BASELIBF 20 20 1 1 1600 1700 50 50 RDISK WDISK FTNLIB 20 20 1 1 1050 1050 46 46 BUFFER IN FTNLIB 20 2 -1750 d > ~100 d) BUFFER OUT 20 4 * d > * d > READ WRITE FTNLIB 20 20 1 1 * d > * d > * d ) * d ) BUFFER IN BUFFER OUT ORDERLIB 20 20 5 2 — a) Depending on length of compute-and-test loop for overlapped I/O. Always 1 for non-overlapped I/O. b) for B<24, 1 for B>36. c) for B>9 d) Equation 3.5 does not apply. 252 Appendix A Example of Code for Timing of the Compute-and-Test Loop and File Transfer C SET NUMBER OF CHANNELS DO 1000 NCHAN=1,MAXCH 1COMP=0 DO 110 N=1,NCHAN NEXT(N)=0 DONE(N)=.TRUE. 110 CONTINUE CALL IDLE C C TIME COMPUTE-AND-TEST LOOP FOR EACH NUMBER OF CHANNELS T0=TIMEF(DUM) T1=TIMEF(DUM) C 115 CALL COMPUTE (NCOMP) ICOMP=ICOMP+l C DO 160 N=l, NCHAN I0C=N+4 1ND=DSP(4,N) IF((IFTBL(3,IND).GE.0).OR.DONE(N))GO TO 160 IF(NEXT(N).LT.MAX) GO TO 150 DONE (N)=. TRUE. GO TO 160 150 CALL RDISK (IOC, BUFF, NWDW, NEXT (N)*NWDW) NEXT(N)=NEXT(N)+1 160 CONTINUE C ALDONE=.TRUE. DO 170 N=1,NCHAN 170 ALDONE=ALDONE. AND. DONE (N) IF (.NOT.ALDONE) GO TO 115 C T2=TIMEF(DUM) TLOOP (NCHAN ) =1000 . * (T2-2*T1+T0 ) C C C SET BUFFER SIZE, ARRAY ISECT CONTAINS BUFFERSIZES DO 900 NBF=1,9 NWDW=ISECT(NBF)*512 MAX=NWDS/NWDW ICOMP=0 DO 310 N=l, NCHAN NEXT(N)=0 DONE (N)=. FALSE. 310 CONTINUE CALL IDLE C C C C GET TIMINGS FOR OVERLAPPED I/O C T0=TIMEF(DUM) C GO TO 355 340 CALL COMPUTE (NCOMP) ICOMP=ICOMP+l 253 C TEST EACH CHANNEL WHETHER I/O BUSY, IF NOT REINITIALIZE 355 DO 360 N=1,NCHAN I0C=N+4 IND=DSP(4,N) C IF CHANNEL N BUSY OR DONE GO TO 360 IF((IFTBL(3,IND).LT.0).OR.DONE(N)) GO TO 360 IF(NEXT(N).LT.MAX) GO TO 357 DONE(N)=.TRUE. GO TO 360 357 CALL RDISK( IOC, BUFF, NWDW, NEXT (N)*NWDW) NEXT(N)=NEXT(N)+1 360 CONTINUE C C TEST FOR FINAL I/O COMPLETION ALDONE=.TRUE. DO 370 N=1,NCHAN 370 ALDONE=ALDONE.AND.DONE(N) IF (.NOT.ALDONE) GO TO 340 C T1=TIMEF(DUM) TIO(NCHAN,NBF)=1000.*(T1-T0) C 900 CONTINUE 1000 CONTINUE 254 Forecasting Computer Processing Requirements: A Case Study Ronald D. Tomberlin Environmental Protection Agency Research Triangle Park, NC 27711 This case study describes a recent experience the author had in updating a computer workload forecast. It presents an example of a workload forecast developed without regard for future update requirements and the subsequent problems encountered when trying to perform an update on the original study. The update methodology used is described and a series of recommendations is provided to help the reader avoid problems of the type experienced by the author. Key words: Benchmark; capacity management; capacity planning; workload forecasting; workload update. 1. Introduction Workload forecasting is inherent in all competitive procurements of computer equipment and services. There must be some measure to indicate the amount of processing required to support the future requirements of an organization. As a result, most organizations have been involved in at least one attempt to forecast their computer work- load requirements. But, situations arise when neither the time or funding is avail- able to support a detailed study. When this is the case, organizations tend to enhance, expand, or update their most recent forecast . Often these updated forecasts do not employ the detailed analytical methods nor- mally used in forecasts of this type. Intu- itive reasoning replaces technical excel- lence. If results appear "reasonable" they are generally accepted. More often than not these results prove to be adequate. But, developing a process to determine these results can sometimes prove to be quite a challenge. Lessons are learned in these situations and, if shared with others, can make this process of updating a workload forecast easier. Such is the case with the presentation of this study. This case study discusses a situation in which an organization contracted for a workload forecast and later decided that this forecast should be updated with more recent historical data. A gross estimate of processing growth was all that was re- quired. Unfortunately, the original study was not designed to facilitate an update. As good as the original study was, it proved to be extremely difficult to update. What follows is an outline of some of the major problems that had to be overcome in order to perform an update of this original study. This case study does not demonstrate how an initial workload fore- cast should be performed. The technical literature [1][4] provides many articles that offer guidance in this regard. This study deals only with one organization's attempt to update its outdated workload forecast . There are two primary reasons for dis- cussing this case study. First, the study is presented to acquaint the reader with the 255 types of problems an organization had to consider in updating a workload forecast. In this respect, only enough of the detail work is presented to explain how the prob- lems were solved. Second, and most impor- tant, this case study presents a series of recommendations to aid those organizations who might be preparing an initial workload forecast or who might want to update a study at some future time. It is hoped these recommendations prove beneficial in aiding the development of workload forecasts with increased utility for other organizations. 2. Background Information The organization requiring the update, in 1977 as today, was facing tremendous growth in the need for ADP support to meet the many demands of the user community. In such a dynamic environment it was diffi- cult, at best, to meet the ADP requirements of the user community. At the time of the original study the organization was support- ing six major offices at the national level and ten regions, each with varied require- ments. Two major computer centers, one with Sperry Univac equipment, the other with IBM equipment, were providing the major- ity of the support to the user community. The original workload forecast was an extremely detailed study performed in 1977. Historical data for the period October 1976 - April 1977 were used. In addition to historical data, management surveys were taken at various levels of the organi- zation. These surveys were used to quantify management's position on future organiza- tional growth. Based on this historical data and management survey information, the workload forecast for the organization was generated. Three different scenarios were developed to cover minimum, normal, and maximum periods of growth for the organiza- tion. Each scenario included processing projections for each computer center and each regional and national office for the 8 7 Upper Limit 10.00 IBM Growth Overall Growth 4.60 UNIVAC Growth 4.08 Lower Limit 3.00 1.00 "IT 00 ASSUMPTION: The IBM Processing Power is equal to the UNIVAC Processing Power in 1977. 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 YEAR Figure 1. Original Study Processing Requirements 256 target years of 1979, 1981, 1985 and 1990. 3. The Update Due to the large volume of data pro- vided in the original study, data were extracted and consolidated in a graphical format to depict the agency's projected growth in processing for both computer centers. For ease of comparison, all data were "normalized" to a base year of 1977 (See figure 1). Figure 1 depicts the future processing requirements of the organization as derived from the original study. The various lines are not representations of continuous func- tions. The points at each of the target years are the "normalized" processing require- ments of the organization. Figure 1 is presented only to show the original chart to be updated. Other problems associated with figure 1 will be discussed later. The basic idea of the update was to take 1979 historical processing data, develop a new graph, and compare it to the original information to determine the accuracy of the projection methodology of the original study. At first glance the problem seemed simple enough. Just accumulate the histor- ical data, process it with the software package which generated the original results and prepare a new chart. But as it turned out, the actual update was not quite that simple. There were a few "minor prob- lems" that precluded a simple solution to this problem. 4. Problems Encountered The first problem associated with the update was to locate anyone who might pro- vide insight as to how the graphs from the original study were prepared. In a period of only two and one half years, the exper- tise associated with the study had virtual- ly disappeared. A second discovery was the software package used in developing the original forecast was not available. No provision had been made to retain a copy of the support software for future use. The original study used management sur- vey information as an integral part of the original forecast. But due to the short period of time allocated for the update, new management surveys were not feasible. As a result, the original manage- ment information had to be used in the update . The majority of workload forecasts, once completed, become the official basis for any future references to workload growth. This study was no exception. It was the basis for depicting processing growth for the organization. Once a study becomes established within an organization, it be- comes almost impossible to replace it. It can be updated, enhanced or expanded, but never discredited or discarded. During the period between the original study and the update there were major sys- tem changes at both computer centers. Also, several times during this period the account- ing algorithms were changed to reflect the fact that both computer centers were non- profit organizations. Each time a change was made, the previous accounting data, relative to workload forecasting, was in- validated. Benchmarks or test programs could have been utilized to supply the linkage across configurations, but this was not done. The majority of problems encountered were directly associated with gaining ac- cess to data needed to perform the update. Data were either fragmented, inaccessible or non-existent. Also, data were recorded in different units in 1977 and 1979 at one of the centers. There were changes in contractor support which made location of some required data impossible. The final problem was deciding how to combine the work at each of the centers to develop one measure of agency workload. In the past (see figure 1) these processing requirements had simply been added together but this was considered an unrealistic assumption for the update. In the update the Univac and IBM systems were treated separately. Other problems were encountered in the course of this update. But these were the major problems that had to be resolved before the original processing forecast could be updated. 5. Update Methodology Given the list of problems that had to be resolved, one might assume an update of the original workload forecasting study was, if not impossible, at least unrealistic. Nevertheless, the original study had to be updated, and the update had to be performed without benefit of the original software package and new management survey informa- tion. The gross nature of the update and an insufficient amount of time were the two primary reasons new management surveys were not taken. 257 Since the Univac system accounting data for 1977 and 1979 were reported in SUP hours, they were used for that portion of the update. SUP hours included central processor usage, input/output usage, and executive services. In 1977 the IBM system accounting data were reported in Resource Hours. Resource Hours included central pro- cessor, input/output, and memory usage. How- ever, in 1979, after a change in contrac- tor support, the data were reported in Computer Utilization Units (CUU's). A fac- tor relating CUU's in 1979 to Resource Hours in 1977 had to be developed to provide the transition between the update and the original study. This will be duscussed later . Because of the differences in account- ing data, it was decided that an update would be performed for each of the computer centers. With the many assumptions made, no attempt was made to combine the informa- tion from the two centers to develop an overall organizational forecast. The following seven step procedure was developed for the update. The procedure for each computer center was essentially the same until the final step. For the purpose of explaining the procedure, all references will be to the Univac system. The first step was to determine the total resources projected for each of the ten regions and the six national offices for each of the three scenarios mentioned earlier. These totals were required for each of the target years of the original projection (1979, 1981, 1985, 1990). Each computer center was treated separately. All of this information was extracted from the original study. Step two was to generate a "normalized" total processing value for each target year. Once tables for total processing for each of the three scenarios were created, the totals for each office and region for each target year were totaled to get the total for each of the target years. These values were then divided by the 1977 total to get a set of "normalized" values. These values are represented as scenarios 1, 2, and 3 in figures 2 and 3 which will appear later. The third step required each of the values for each office and region for 1981, 1985, and 1990 in the table just created to be divided by its corresponding value for 1979. This created a table of multipli- ers representing the original management survey information for each scenario for the target years of 1981, 1985, and 1990. Step five of the procedure used this table of multipliers. The fourth step was concerned with the actual data for 1979. The SUP data in the accounting report for calendar year 1979 were averaged for the same offices and regions mentioned earlier. Since overhead was just a lump sum figure in the reports and accounted for approximately forty to forty-five percent of the total it had to be included. It was proportionally spread across the figures for each of the offices and regions. Step five involved taking the monthly average figure from the actual 1979 data for each office and region and multiplying it by the corresponding values from step three to develop the new projections based on the actual 1979 data. This procedure allowed the original management projections to be applied to the 1979 data. In step six the new projected values for each scenario for each target year were totaled. These totals were also di- vided by the actual total for the base year of 1977 to "normalize" the data for comparison to. the original study. The final step in the procedure was simply to plot these "normalized" values for comparison purposes. Figure 2 is a graph of the "normalized" values for the Univac System. All three scenarios from the original forecast are included. Only the scenario 1 values (normal) are plotted for the update. There is a large difference in the actual and predicted values for 1979. Be- cause of this difference a graph of the new values plus an accumulated error of ten percent per year was developed. Due to the numerous assumptions made in the up- dating procedure it was felt this second graph could be considered a "reasonable" upper bound for the new workload forecast. These new forecasted values are represented by dashed lines in figure 2. The same procedure was followed for the IBM system. All references to the 1977 processing data were stated in Resource hours and references to the actual 1979 data were stated in Computer Utilization Units. In attempting to form a bridge between the two units, consideration was given to going back to the original data and developing an artificial algorithm. Time 258 F A C T 5 R S NOTE: Vtorkload projections based on 1979 data represented by dashed lines. / Scenario 1 + 10% (1979) /8. ,76 / / 1.59 1.00 Scenario 2 12 " Scenario 1 (1979) 5.44 Scenario 1 4.04 Scenario 3 2.80 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 YEAR Figure 2. 1977 Univac Workload Projections constraints made this option unworkable. As a last resort, CPU hours for the years 1977 and 1979 were used to develop a conver- sion factor that could be used to compare the original and updated data. Figure 3 is a graph of the "normalized" values for the IBM System after the conversion factor had been applied. The labeling on the graph is the same as for figure 2. The data for figures 2 and 3 were pro- vided in the same fourteen year format as the original study. However, workload fore- cast information projected beyond three years has historically been shown to provide very little utility. At this point the obvious question must be asked. Just how good are the new projections of processing workload? This is a very difficult question to answer. If the original management surveys were good then it is probably acceptable. A re- examination of the procedure in 1981 is the only sure way to get a reasonable measure of its accuracy. At best, it can be no better than the original information. Time, as always, will be the ultimate judge. 6. Recommendations Obviously, if certain things had been done, the update of the computer workload forecast could have been accomplished with less effort. The following list of recom- mendations has been developed as a result of this update. This list is by no means complete. It is only meant to serve as a starting point for those organizations con- sidering a workload forecast. Workload characterization is essential. Select those measures relevant to your ADP operation. For those uncertain about what measures to select, a review of the recent technical literature is recommended [2] [3] [5]. It is difficult to know where you are going or how fast you are getting there without knowing where you have been. 259 9 7 F A C T 5 O R S NOTE: Workload projections based on 197 9 data represented by dashed lines. Scenario 1 + 10% (1979) 1.00 7.74 Scenario 2 6.03 Scenario 1 (1979) Scenario 1 5.14 3.56 Scenario 3 77 78 79 80 81 82 83 84 85 86 YEAR 87 88 89 90 91 92 93 94 95 Figure 3. 1977 IBM Workload Projections Second, do not consider the workload forecast as a single, one-time operation. Make it just one part of an on-going capaci- ty management/capacity planning process for the organization. Third, if the workload forecast is to be performed by a contractor, insist that all input data files and support software be made a part of contract deliverables. You will need these later. Fourth, require detailed documentation on the methodology used in the forecast. Eventually someone will wonder how it was done. Fifth, benchmarks, though expensive to develop and maintain, should be used if hardware and software systems change frequent- ly. These changes invalidate accounting data relative to workload forecasting but a benchmark can provide the bridge between the old and new data. Sixth, if benchmarks are not available, then a set of sample programs can be used. As with benchmarks, the linkage is essential if the data are to be used. And finally, for those looking for a starting point for reviewing the technical literature, the IBM System's Journal, volume nineteen, number one, 1980, entitled Instal- lation Management/ Capacity Planning is re- commended. It is an excellent source of information as well as a good source for additional references. 7 • Acknowl edgements I gratefully acknowledge the assistance of Dr. Thomas F. Gatts of the University of Alabama, and Dr. David F. McAllister of North Carolina State University who felt this study warranted documentation and en- couraged me to do so. Thanks also to Sharon Walker, Cheryl Mooring, and Leslie Batten without whose help this case study would not have been possible. 260 REFERENCES [1] Agrawala, A. K., Mohr, J. M. , and Bryand, R. M. , "An Approach to the Workload Characterization Prob- lem," Computer , Vol. 9, No. 6, June, 1976, pp. 18-32. [2] Bronner, L. , "Overview of the Capacity Planning Process for Production Data Processing," IBM Systems Journal , Vol. 19, No. 1, 1980, pp. 4-27. [3] Cooper, J. C. , "A Capacity Plan- ning Methodology," IBM Systems Journal , Vol. 19, No. 1, 1980, pp. 28-45. [4] "Guideline for Writing Computer Procurement Support Documen- tation," U.S. Department of Agri- culture Handbook Supplement: DIPS Manual Chapter 3 , March 1977, pp. 7-25. or McNeese, James E., "Computer Work- load Forecasting," Proceedings of the 15th CPEUG , San Diego, CA, Oct. 1979, pp. 113-120. [5] Tripathi, S. K. , Gordon, K.D. , and Agrawala, A. K. , "An Optimal Sample Size Allocation Scheme for Benchmark Design," Proceedings of the 15th CPEUG , San Diego, CA, Oct 1979, pp. 105-111. 261 Human Interfaces 263 Data Processing User Service Management P. S. Eisenhut IBM Data Systems Divsion, East Fishkill Route 52, Hopewell Junction, N.Y. 12533 After 30 years since the advent of the electronic computer, we are fast entering the era of the information society. The trend is towards increasing use and dependence of business upon data processing. Data processing professionals must provide ap- plications which perform adequately with respect to the business needs of users. Data processing professionals who concern them- selves with the functional and technological aspects of data processing must now concern themselves with the performance or effectiveness of data processing in the business environment. To do this requires a coordinated management process. This management process starts during the development of new data processing applications with user oriented service objectives. The process includes measurements, problem diagnosis, reporting of service and follow-up action, all on a regular basis. It also includes a formal capacity planning process which relates to com- mitted levels of service as well as computing load. Making the transition to an information society requires a new attitude among data processing professionals and a commitment from data processing management. Good data processing managers will view User Service Management as being critical to their own success. Key words: Computer performance evaluation; data processing per- formance; data processing service; information system design; in- formation system usability; performance management. 1 . Introduction Computing technology has developed at a \ery rapid rate during the 30 years since its beginnings. During this time, more computing capability has been pro- vided at dramatically decreasing costs. Yet, it seems much less attention has been focused on the Data Processing - User Interface depicted in Fig. 1. For any given enterprise, a Data Processing Organ- ization (D/P) will provide service to various user organizations within that enterprise. Ideally, this service should be in response to user needs. The inter- face breaks down because users find it difficult to express needs in terms meaning- ful to D/P; and because D/P finds it dif- ficult to determine those courses of action which best satisfy user needs within limited D/P resources. The following are examples of this breakdown in the D/P - User Interface. Example #1 This example is provided by Richard Nolan (Nolan, July-August 1977). The D/P department of a medium size company auto- 265 mates the marketing organization's order entry system. The vice president of market- ing is alarmed when he sees his monthly bill from D/P rise to 25% of his budget and to over 3 times the cost of order entry prior to computerized order entry. He demands an explanation from D/P. The representative of D/P explains that D/P bills so much for CPU, so much for elapsed time, so much for kilo- bytes per minute, adjusted to reflect virtual storage, and so much for EXCPS. He asks the vice president of marketing which of these he would like to control. This explanation is the last straw for the vice president of marketing. He phones the president of the company with the following complaint: "I'm strapped. Data Processing is charging me for services that are essential, and I can't do anything about the cost. When I try to get down to how to control the costs, all I get is technical gibberish...." Example #2 A large manufacturing site of an inter- national corporation decides to improve pro- ductivity by installing a computerized dis- patching system for equipment maintenance personnel. It is designed to have on-line terminal input and output and operate in real time. Another location already has such a system, and D/P estimates one systems ana- lyst and one programmer can complete the implementation in one year. Three years later, three programmers are still re- writing programs to make the system usable to the maintenance organization. Response times at the terminals are over 15 seconds, and terminal access to the data is only possible 75% of the time. Under pressure from manufacturing management, the main- tenance organization gives up on D/P. Within three months the maintenance organ- ization implements their own manual pro- cess using work order tickets and tele- phones. Other examples of the breakdown in the D/P - User Interface include the following: Reluctance by D/P to provide users with higher speed lines for fear that addi- tional load will result on the central computer. Removing a transaction from a perform- ance report because its response time was unacceptable. Sixteen weeks of unacceptable transac- tion response times because a program- mer changed the procedure for purging the data base just prior to leaving the company, but didn't tell anyone! User reporting unacceptable performance to top management based on their own measurements, and D/P attacking the measurement process. In 1975 the IBM users group, SHARE Inc., sponsored a study to extrapolate the trends in the data processing industry through 1985. The authors of the SHARE Inc. study make the importance of the D/P - User Interface quite clear when they state: "Our main conclusions are that over the next decade, the major tasks of the data processing industry will be to improve: ° The quality of data processing services - as perceived by the end users of these services;..." (Dolotta, et al , p. 9, 1975) According to the SHARE study, the trend is for the operation of business to rely more and more upon data processing. It is imperative that data_ processing organizations take steps to better' meet the needs of users. This paper introduces the framework of a man- agement process to accomplish this. It then discusses some of the key problem areas en- countered in trying to implement this manage- ment process in a typical D/P organization. 2. The Management Process 2.1 New Application Development The management process starts by man- aging the design and development of new user applications. A typical project management system to control the progress of a new appli- cation calls for project milestones and phases as indicated in Fig. 2. Key phases are problem definition, design specifica- tion, development (programming), and trial production followed by a final acceptance. In a D/P environment, this process often falls short of the mark because it manages function at the exclusion of performance . Users require both performance and func- tion. Functional requirements are those de- scribing what processes the program, equip- ment, and hopefully, people do. The perform- ance requirements are quantifiable expressions of how well the functions are performed. Figure 2 shows how performance management relates to the project phases. Management must require that design specifications include objectives for per- 266 Needs Service FIGURE 1. Data Processing-User Interface Phase Performance Milestone Completed Management 1 Problem Understand User Definition Performance Requirements 2 Specification Agreed Design Objectives and Design for Performance 3 Development and Developed Measurement Test Process 4 Accepted Trial Performance Meets Design Implementation Objectives Objectives Become Commitments FIGURE 2. Application Project Management 267 formance. Figure 3 illustrates a possible categorization of user performance require- ments in terms meaningful to users. If these are the categories of requirements, then they also must be the categories of performance objectives to be specified as part of the design. Design specifications should also include how performance will later be measured. The application project should not be allowed to proceed past milestone 2 until this requirement is met. Like- wise, the application should not be form- ally accepted as completing milestone 4 until the design objectives have all been satisfied as measured during the trial production period. At milestone 4, these design objectives may now become "commit- ments" by D/P. Objectives must be jointly developed and agreed by all parties affected by the outcome. Various departments or organiza- tions may be involved. Programmers, data base administrators and hardware configur- ators can control response time at a user's interactive terminal. Data center opera- tions and operating system support person- nel can control availability and reliability of the applications to the user. Data cen- ter operations and job schedulers can con- trol turnaround time for job completion or the degree to which scheduled reports are on time to the user. Users, as well as programmers, control workloads. All parties control cost. All affect the profits of the enterprise. Therefore, all are involved in the management process. 2.2 Overall Process Figure 4 illustrates the overall user service management process starting with new application development as just discuss- ed. By the time the application achieves milestone 4, there should be a committed set of performance objectives. Given this set of committed performance objectives, there must be regular (e.g., weekly) performance measurements of how well D/P and the users are performing relative to these objectives. These performance mea- surements must then be analyzed and evalu- ated to determine the reasons for deviations from the agreed objectives. Then, as per- formance versus objectives is reported to the D/P and user management, reasons for the deviations and recommended actions may also be reported. Management must then make final decisions on the actions or projects to undertake. These action projects result in the attainment of old objectives and in the establishment of new ones, thus closing the cycle. The D/P Plan, usually done once or twice per year, is shown in the center of Figure 4. Measurements determine existing load versus D/P resource capacity (manpower, CPU, DASD, etc.) and performance level. Given the existing load, users and application program- mers participate in the projection of load. The D/P Plan is at least a two year projection of load and necessary D/P resources required to handle the load within the limits of the agreed performance objectives. The perform- ance objectives associated with the Plan are those currently committed as well as design objectives that will become commitments. The result of the Plan is a set of actions, which may be managed as milestone type pro- jects that support these objectives and the projected load. The management process described in Figure 4 is dependent upon its many subpro- cesses for success. D/P management must understand the large amount of coordination required, and they must be willing to assign staff responsibilities for performing these subprocesses. 2.3 Benefits There are several key benefits to the User Service Management Process. Improved performance means that users will be more productive. For example, improved response time at a terminal means less user wait time and potentially fewer terminals and users for the same job or task. Also, for example, a manager that receives a manufac- turing line trouble report prior to the start of a shift can assign people to resolve the trouble before those people get involved in other things. Because it benefits them, users are motivated to participate in the process. Increased user satisfaction results not only from improved performance, but also from improved trust and communications. If a re- porting system allows both D/P and users to see performance in like terms, there will be fewer escalations to higher levels of manage- ment to resolve disputes. "Management by Objectives" becomes pos- sible if measurements and reporting are done against performance objectives. The objec- tives give purpose to actions like data base reorganization, adding higher speed communi- cation lines or a faster printer, reschedul- ing batch jobs, or program modification. 268 Actions • Scheduled Availability of Service • Unscheduled Down Time Mean Time of Service Unavailability % of Schedule Not Available Maximum Time of Outage Number of Failures Or Service Interrupts • Response Time (Interactive) Average Percentiles Tolerances • Turnaround Time (Unscheduled Batch) • Report Timeliness (Scheduled Batch) % of Reports Available to User on Time • Workload Capability Average Peak • Cost Charged FIGURE 3. Performance Requirements User Inputs New Application Development D/P Inputs Objectives Agreements Formal Plan Measure Diagnose Report Periodic Assessments FIGURE 4. User Service Management Process 269 Rather than fight fires, management can make changes based at least partly on the effects those changes will have in user productivity and satisfaction. It also becomes possible to talk to users about the tradeoffs be- tween more function and more performance. Justification of plans and hardware pro- posals becomes easier. It is quite often necessary for D/P management to justify ac- tions to higher levels of management. Such actions might include procuring major pieces of hardware like computers, mass storage de- vices, or printers. If higher levels of management understand that objectives are important for the business, and if D/P man- agement can show that these actions contri- bute to attaining the objectives, then D/P management will find these actions easier to sell. 3. Problem Areas The successful implementation of the management process requires overcoming sev- eral problem areas. Perhaps of even greater importance is that before general acceptance is possible, higher levels of management, or at least the manager of D/P, must understand and support the process and the methods to be used to overcome the problem areas. The following discussion concerns these problem areas and suggestions for dealing with them. 3.1 Defining Objectives Generally the objectives should be grouped according to a categorization simi- lar to Figure 3. Specific definitions, how- ever, will vary from one enterprise to an- other. The following general suggestions may help to avoid problems with acceptance or workability: (1) Only define objectives that can and will be measured. (2) Be sure that the results of measure- ments against objectives can and will be reported to users. (3) Attempt to define service objectives which can be directly linked to the business objectives of the user organ- ization. For example, if a D/P ser- vice for process control is out for more than X hours, it may impact the user's ability to produce product. Defining the maximum outage allowed would be a critical objective. (4) Attempt to define objectives as close to the D/P - User Interface as possi- ble. For example, define a response time at the terminal as seen by the user rather than at the host computer. (5) Allow the key users to choose objec- tives most meaningful to them. (6) Pick a time frame long enough to smooth out the effect of short term and random events, and short enough to allow at- tention and action before too much damage is done. (7) Remember, "To have an objective and not meet it is not nearly as bad as never to have had one." 3.2 Specifying Values of Objectives What values should be set for the objec- tives? For example, should average terminal response be 2 or 15 seconds? Should it be the same for data input as for graphics? The fol- lowing suggestions for setting values may be helpful : (1) Values for design objectives should be reasonable as determined by similar applications or technologies. Values for commitments should be achieved over a sustained period as determined by measurements. However, commitments should not be lowered because of de- graded performance. (2) Values should be related to business needs. For example, the timing of a report required to assign priority to work moving through a manufacturing line may be more critical than an employee education record. Therefore, the manufacturing report may be re- quired daily by 7:30AM 95% of the time, whereas the education report may not be required at any particular time of day. (3) Terminal response times should not exceed 15 seconds if the user is to remain at the terminal. (Martin, p. 326, 1973.) Faster user "think" times demand faster terminal re- sponses. (Doherty, p. 154, 1979.) This is why graphics requirements are in subseconds. Consistency of response is also important. The spread of response times is speci- fied by a percentile value (90 per- centile) as well as the average. 270 (4) Tolerances should be established around the objective values to recognize ran- dom events. For example, if the aver- age actual response is 5 seconds, and if results are reported daily, then some days may average 6 seconds and others 4. The tolerance (possibly 1 second) should recognize this fluctua- tion. Action would be taken only when the objective value plus the tolerance is exceeded. (5) Objectives should offer some challenge. "You don't make progress looking back- wards." 3.3 Form of Agreement How formal should this agreement of ob- jectives be? Should it be a "contract" with a specific user organization? Initially, users may balk at the term "contract" which may imply a formal and legally binding obli- gation. On the other hand, the complete ab- sence of a written document would imply a lack of agreement and would also preclude communicating a common understanding of what the objectives really were. The ideal solution seems to be to make available to all parties to the agreement, a common "Do- cument of Understanding" (DOU). This docu- ment should contain the performance objec- tives agreed upon as well as a statement of the philosophy and reasons for the agreement. It may avoid future arguments if the docu- ment also contains references to the source of measurements and manner of reporting actual performance, and the conditions for taking corrective actions. With what lev should D/P have a lose participation Too low a level wi all subject to cha Administration cou level can also mea between user organ possible to manage tion without affec el of user organization DOU? Too high a level will by the actual end users. 11 mean too many contracts, nge with reorganization. Id be costly. Too low a n too much interdependence izations. It may not be service to one organiza- ting service to another. There is much evidence to indicate a trend in the years ahead to more and more interdependence between user organizations. (Nolan, March-April 1979.) This interdepen- dence is characterized by many organizations using common data bases and applications, and the passing of information between organ- izations. An example of this is an online IMS application for tracking the flow of work-in-process through manufacturing. Jobs are released to manufacturing departments by various production control departments who make entries into the IMS data base. Claims of progress through successive fabrication steps are made by various manufacturing de- partments. Information is passed to an ac- counting department for product, costing, and to inventory control departments for in- ventory adjustments. In this data base ap- plication, performance is controlled by changing the application or by changing IMS. For example, a change to the data base struc- ture or a new version of IMS are possible actions. However, these changes affect all users of the application or all users of IMS. One way of dealing with this interde- pendence is to have separate DOUs for each D/P service offering, for example, one for IMS, one for TSO, one for unscheduled batch, etc. A DOU could have separate objective values for each application if values were not common across the service offering. The DOU would then be a common agreement for all users of the same service or application. This application orientation is, of course, fully consistent with the idea of setting performance objectives for new applications as discussed earlier. "All players on the team must play according to the same game plan if the team is to win." 3.4 Measurement Tools It has been said, "If you can't measure it, you can't manage it." A corollary is that it is meaningless to have an objective, if there is no way of knowing where one is relative to that objective. On the other hand, it is also meaningless to measure where you are relative to an objective that is irrelevant. It is often said that mea- surement tools in current use provide too much data and not enough information. What is really being said is that the measure- ments are compared to objectives whose rel- evancy is non-existent or at lease unclear. For example, to a user who has not been re- ceiving reports on time, the channel utiliza- tions from 2:00 to 3:00AM on the host com- puter would not be relevant. The user only understands that his report schedule has been met only 50% of the time this past week. What is needed are tools and proce- dures to bridge the gap between the user's view of performance against an objective and the evaluation of the causes of that performance. 271 First, there is a need for tools which measure actual performance against user agreed performance objectives. Tools of this type are almost non-existent. There- fore, measuring the percent of time that reports were available to a user on schedule must be done by a manual process. Response time at the terminal is another example. Use of hardware monitors have been unsuc- cessfully attempted. Alternatives have been to create a typical terminal trans- action and invoke it automatically. Another alternative might be to measure response only at the host computer and estimate the telepro- cessing time. Yet the most relevant alter- native may be to have the terminal user re- cord responses in a log according to a pre- arranged sampling plan. This certainly in- volves the user in the process. Secondly, there is a need for tools to relate higher level performance measures to lower level measures. In a preceding exam- ple, the reason that the user reports were available only 50% of the time may have been due to channel utilization, to DASD contention, to high central processor load, or to people in the printout room not tear- ing reports off and putting them into the user's pickup bin. Reports are needed to summarize this type of information over a period of time and make it available to a problem analyzer on a timely basis. Un- fortunately, such tools are not standard, and home-grown programs and manual data synthesis are the rule. Thirdly ized approac tive to the to by users, must invest use the avai new ones, fo nosis. This ever actual agreed objec propriate ac process. , there is a need for an organ- h to. analyzing problems rela- performance objectives agreed This means D/P management in qualified people who can lable tools, or can program r the purpose of problem diag- diagnosis is necessary when- performance deviates from user- tive. Recommendations for ap- tion should come from this 3.5 Reporting Assume that "What the user doesn't know can hurt you." Unless performance is reported to them, users may blow minor problems out of proportion and escalate them to higher levels of management as a means of guaranteeing acceptable levels of perform- ance. Therefore, performance should be reported to users. D/P services users, not the other way around. Users have the right to know. Developing DOU's or contracts with users is a formal process specifying user- oriented performance objectives. It makes sense that the process of reporting actual service back to the users should also be a formal process. D/P management must assure that the report includes performance ex- pressed in the same terms as those agreed to in the DOU, and is available regularly according to the agreed frequency. D/P management must also assure that problems and corrective actions are reported rela- tive to the user service. One of the problems with reporting is that there are no standard program packages to generate a comprehensive report. Each location or enterprise may have different formats and contents depending upon the DOU or contract. Furthermore, the variety of input required and the scarcity of mea- surement tools means that the collected data input must come from a large number of first line departments. To a large ex- tent, the collection of data and the sum- marization of data into a meaningful re- port will be a manual process. The first line departments involved in the reporting process may hesitate to devote resources to work that is not directly in line with their specialized missions. It may be necessary for management to redefine de- partment missions to include responsibi- lities for data collection and reporting. Generally, D/P management may find that a great deal of coordination is required to assure effective and timely reporting. Figure 5 shows a logical organization to accomplish coordinated reporting. Some- one will be assigned the clerical responsi- bility for pulling together the data col- lected by the various first line departments doing measurements. This person will sum- marize the data into predefined formats for regularly scheduled presentations and formal reports to users and D/P mangement. An ana- lyzer will also be assigned responsibility for problem diagnosis and action followup. This analyzer will make use of the detailed and traditional performance measurement and modeling techniques to explain poor user service shown in the management presenta- tions. He will recommend courses of action to improve user service. The formal reports 272 to user and D/P management should include the status of actions being taken to cor- rect poor user service. Management coordi- nation is accomplished by establishing a reporting process, by formally assigning responsibilities for the reporting tasks, and by following up on action items. 4. Mentalities Perhaps the biggest obstacle to suc- cessful implementation of the user oriented process of management are the mentalities of the organizations involved. 4.1 The User Mentality Users are of many different person- alities and professions. They may be ac- countants or clerks, or they may be engi- neers or management staff. Users are specialists in their own profession. The user works with output units, such as in- voices processed, inventory part status, production schedules, etc. Users tend to be alike in their view of D/P. According to the study by SHARE INC., (Dolotta, et al, p. 43, 1976.): "As a rule they have no interest in how the system works as long as their needs are met; that is, as long as the data pro- cessing system provides the services they desire at what they consider to be reason- able cost. A failure to provide these services is a failure of the entire- data processing system in these users' eyes." 4.2 D/P Mentality The D/P mentality is best described by Louis Fried (Fried, p. 30, 1979). "The programmer's programmer is one who can design a program that will operate faster using the least amount of memory, to produce a desired result. The programmer's interest is directed toward ultimate effici- ency in use of the computer. As might be expected from one drawn to a highly individ- ual activity oriented toward abstract and machine-related problem solving, the pro- grammer is generally introverted. He finds his best expression in dealing with situa- tions that do not include human relation- ships." The D/P mentality has great difficulty understanding user needs and has great difficulty adopting the user service manage- ment concept. Traditionally, professionals in D/P be- gin as programmers and so this mentality is extended to programmer-analyst and to systems analysts dealing with hardware architecture and systems software. Lower levels of D/P management who are brought up in this envi- ronment may also exhibit the same mental- ity. Lower levels of D/P management may be afraid to "dictate" objectives to their staffs lest they lose popularity. Staff may fear management evaluation or lack of support in their efforts to achieve objec- tives. The following typical comments were made by lower levels of D/P management, when asked to agree on user-oriented objectives, and are indicative of the D/P mentality: ° "The users aren't complaining, service levels must be OK. Don't need all this." "I won't sign unless they sign." "I can't agree on objective unless I am certain I will achieve it." O II I don't have control over service." "I should not be responsible, they should." Can't commit to anything, things are always changing." ° "User will use the objective for ammuni- tion and turn me in everytime I miss it." "My manager told me not to sign anything unless he sees it first." "I can't sign anything unless my staff first commits to it." 4.3 Renaissance Man The D/P mentality is necessary and vital, but there needs to be more of another type of mentality in D/P. This new menta- lity is one that can bridge the user menta- lity to the D/P mentality. It is a total systems view. It is a business view. En- ter "Renaissance Man." This new mentality parallels very closely, what Louis Fried calls the System Analyst's Analyst. (Fried, pp. 31-32, 1979.) 273 — Coordination User D/P Mgmt T Coordination ~l Service/Action Report I Report Preparation Problem Diagnosis First Line Departments FIGURE 5. Organization for Performance Reporting Manufacturing Production Prod./lnv. Control Product Engineering Process Engineering Tool Engineering Industrial Engineering D/P Data Center Operations Operations Support Application Development Software Support Hardware Support Plans, Measurements and Reporting FIGURE 6. Manufacturing-D/P Analogy 274 "The 'Systems Analyst's Analyst' is one who can design and implement a system that is appropritate to the problem, make the op- timal use of the organization's resources, be integrally related to the other systems in the organization, and do the job intended. Ideally, he'll function as an extension of the organization's management...." " He is keenly aware that all sys- tems are run by people. His concern is cen- tered around aligning the goals and activi- ties of the people with the goals and ac- tivities of the system. Characteristically, the analyst is an extrovert. He is oriented toward interpersonal relationships." It is time to view D/P as a business, or at least as a business within a business. Figure 6 shows that there is a direct anal- ogy between organization of a manufacturing enterprise and the organization of D/P. "Customers" become "Users;" "Product fabri- cation" becomes "Data Center Operations," etc. The same professional skills that make the manufacturing enterprise successful can therefore make D/P successful. Where does Renaissance Man come from? It may be necessary to change hiring prac- tices. Traditionally, D/P has hired pro- grammers with no prior experience or with computer science degrees right out of school. These neophytes were then required to acquire company-sponsored eduction and work up through the ranks of programming responsibilities to become systems analysts and into lower levels of D/P management. This is an unlikely source for Renaissance Man. Perhaps a better source is D/P pro- fessionals from application areas or with business backgrounds. For example, MBA's, production control analysts and industrial engineers are possible candidates. Renaissance Man's first step must be to reorient the top level of D/P management to the new business view of user service. According to Geoffrey Goodman (Goodman, p. 69, 1979.): "The first steps in establishing a computer performance system are critical to success. Lack of attention to organ- izational and management needs are as likely to lead to failure as lack of at- tention to technical details." The top manager of D/P is best able to see the business view. He doesn't get to the top by being myopic. In fact, the good D/P manager will view user service as critical to his own success. Indeed the top D/P manager may himself become Renaissance Man! Once the top D/P manager has the business view of user service, the lower level of D/P management and the rest of the organization will follow. The change will occur through a combination of direction from the top, education, reas- signment of job responsibilities, and re- staffing. Following this, the technical part of managing user service will then be rela- tively easy. The first step to reorient D/P management may prove to be Renaissance Man's biggest contribution. 5. Conclusion If Data Processing is to continue to advance, a new process of performance manage- ment is required. The systems analyst, the industrial engineer, or even the information administrator, will prove to be the Renaisance Man who brings about the birth of usable data processing. The technology of flying machines came 30 years before commerical passenger travel by air. Likewise, after 30 years the technology of computing machines is about to enter its own era of usability. Some will call this period the Information Revolution. The need for Renaissance Man is here now. Each of you is challenged to assume this role! 6. References (1) W. J. Doherty and R. P. Kelishy, "Managing VM/CMS Systems for User Ef- fectiveness," IBM Systems Journal, Vol. 18, No. 1, 1979. (2) A. Gayle, "Perspectives on Business Management - Data Processing a Manage- able Resource," IBM Systems Research Institute Publication, TR73.003, June 4, 1979. (3) G. H. Goodman, "Methodology for Es- tablishing a Computer Performance Management System: A Case Study," CPEUG, 15th Meeting, N.B.S. Special Publication 500-52, pp. 69-76, 1979. (4) S. Lowry and D. Schafer, "Implementing a Performance Management System," Pro- ceedings Guide 48, Philadelphia, pp. 875-898, May 23, 1979. 275 (5) R. L. Nolan, "Managing the Crises in Data Processing, " Harvard Business Review, March-April 1979. (6) R. L. Nolan, "Controlling the Costs of Data Service, " Harvard Business Review, July-August 1977. (7) R. L. Nolan, "Computer Managers to Data Resource Managers," Proceedings Guide 48, Philadelphia, pp. 1797- 1805, May 24, 1979. (8) R. A. Sills, "Data Center Performance Reporting," Proceedings Guide 48, Philadelphia, pp. 469-474, May 23, 1979. (9) D. P. Norton and K. G. Rau, "A Guide to EDP Performance Mangement," Q.E.D. Information Sciences Inc,. Wellesley, Massachusetts. (10) C. B. Wilson, "Technology Assessment: ADP Installation Performance Measure- ment and Reporting," N.B.S., September, 1979. (11) L. Young, "IE-The Ideal Interface with MIS," Industrial Engineering, November, 1979. (12) Data Processing in 1980-1985 , A Study of Potential Limitations to Progress, T. A. Dolotta, et al, John Wiley & Sons, by SHARE Inc., 1976. (13) Practical Data Processing Management L. Fried, Reston Publishing Company, 1976. (14) Design of Man-Computer Dialogues , J. Martin, Prentice-Hall, Inc., 1973. 276 Installation Management 277 Performance Evaluation of Computer Operations Procedures Patrick A. Drayton Southwestern Bell Telephone Company St. Louis, Missouri 63101 A discussion of the measurements required for an evaluation of the operating procedures of a large computer system data center. Included are the guideline values which should be achievable with good operating techniques and solutions to frequently encountered problem situations. Key words: Performance, Operations, Measurements. 1. Introduction 1. Today's data processing hardware is experiencing an exponential advance in technology. Coupled with this is an increasing trend towards more sophisticated operating systems and user software (data base systems, intelligent networks). In marked contrast, there has been negligible attention placed on the operating procedures used in data centers. This is unfortunate since DP operations personnel must cope with the advancements in technology. Some examples should help illustrate the point. As CPU hardware becomes more powerful, more jobs are processed simultaneously which places increasing pressure on the operations personnel in the areas of scheduling and resource management. In addition, all inefficiencies in tape mounting and WTOR replies are magnified by the larger multi-programming level. As the peripherals become faster and more complex, the operator's actions become more important. For example, in many installations the introduction of non-impact printers displaced numerous impact printers because of the five-fold increase in print speed. To operations this means that a five minute job setup time has the impact of 15-25 minutes of total print throughput based on previous hardware. Operating systems, have progressed from single thread real storage systems to virtual machines. This complex environment has added problems to operations such as swapping, paging more powerful commands, intricate messages, and job entry subsystems. But there is still only one person sitting at the master console who must digest all the messages on the screen and make the appropriate response, or take the correct preventive action. One additional area of technical advancement is user software systems. The increased use of on-line data base systems and time sharing systems has resulted in increasingly difficult operational problems. Instead of simply two or three batch jobs active at one time, operations is now responsible for 279 hundreds of terminal users. Each of these could have need of operator interaction to resolve response time delays, data base problems and hardware failures. In summary, there is an increasing potential for user-related performance problems. It is clear that many areas of the DP environment have greatly advanced in the last ten years. However, the ability of operations to cope with these new technologies has not always kept pace. Recently a number of software products have entered the market whose aim is to automate portions of the operator environment. These include tape management systems, disk space control systems, automatic scheduling systems and operations interface subsystems. SMF (System Management Facilities) - accounting type data on all jobs including resource consumption and termination status. RMF (Resource Measurement Facility) - IBM's system monitor which reports on device usage, paging, CPU usage and SRM (System Resource Manager) activity. Timings - stopwatch timings of events. Observations - monitor data may indicate a problem, but without observing the situation, the cause may go undetected. This discussion attempts to explore the areas of performance as they relate to the actions of the operators. This includes the following: - Sources of measurement data Guideline values with which to evaluate operations performance Recommended solutions to problems. - Interviews - these include discussions with users, operators and data center managers. It is an attempt to get their perspective on what is right/wrong in the data center. 3. Measures, Guidelines and Corrective Action This section is divided into five areas: It is hoped this will both increase awareness of the importance of operations in the data center and provide a means of evaluating their impact on overall system performance and throughput. The specific tools and methods are aimed at large IBM MVS data centers. However, the basic principles should apply to all operating/hardware environments. Tape Operations - Disk Operations - Printer Operations Scheduling Quality Control In each area, the operation to be measured, the guideline value and potential corrective actions will be discussed. 2. Sources of Measurement Information 3.1. Tape Operations There are six major sources of information concerning operations performance. A brief description of each follows: Hardcopy log - this is a data set that contains a copy of all messages that appeared on any console plus all operator replies to program/system requests. All entries are time-stamped. For any data center with an appreciable amount of batch work, the operation of the tape pool is of vital importance. Four measures of performance are discussed: scratch tape mount time, input tape mount time, allocation recoveries and tape load failures. 280 3.1.1. Scratch Tape Mount Time This is a measure of the time between a request for a non-specific scratch tape and the satisfaction of that mount request by an operator. Both the request and the volume recognition message can be found on the hardcopy log. If the average time required to mount a scratch tape is greater than 60 seconds , there are problems in this area. Some possible solutions are: 1) Insure that a supply of scratch tapes are within easy reach of the tape pool operator. That is , the tape console and the scratch tapes should be as close to each other as possible. 2) Review labeling procedures. Do not prelabel scratch tapes and require that a particular volume is needed be satisfy the request. If a physical label is attached after each tape file is created, the use of a label printer driven by the mount messages greatly reduces mount time. 3) Insure that scratch tapes are easily distinguished from used tapes. An operator should have no doubt as to the status of a particular volume. 4) Use of a tape management system relieves the need for rings in scratch tapes, since it will protect unexpired data sets and it relieves the need for external labels except for tapes shipped to other locations . 5) Review the tape drive configuration. The ideal is a square or rectangle with all drives facing the operator. For tape pools which are shared between systems it is best to designate strings of drives as dedicated to a particular system. There will also need to be strings which are designated as sharable between systems to facilitate variances in workload. The dedicated drives should be localized in an area near the operator's console to reduce mount times. 3.1.2. Input Tape Mount Time This is the time between a request for a specific volume and its satisfaction. Unfortunately IBM systems without a local modification do not recognize the satisfaction of input tape mounts. There are two available sources for the data required to make this measurement. They are: A manual timing between the issuance of the message and the tape mounting - Average percent mount pending for all drives from RMF tape device usage report. If the manual timing results in an average greater than three minutes or RMF reports an average mount pending percentage greater than ten , a problem exists. Some potential solutions are: 1) If input tapes remain in the storage area until a mount request appears on the console, pre-staging tapes based on a job schedule will greatly reduce mount time. One method of accomplishing this is to have someone review all job run requests and through the use of a tape management system, note which input tapes each job requires. This will allow for the pulling of these tapes prior to the job executing. Some automated scheduling packages can generate a list of tape volumes that will be needed up to 24 hours in advance . 2) After an output tape has been created, it should not be refiled immediately unless it is known that no subsequent step/job will be using that volume as input. It is best to place newly created tapes on a holding cart for from one to two hours prior to refiling. 281 3) All tapes should have external serial numbers on the spine of the cartridge/case to facilitate quick identification. 4) For input tapes prestaged in the operator area, quick retrieval is enhanced if they are grouped in some logical manner. This could be in serial number bands or by jobname/application group. 5) The tape drive configuration should be conducive to quick access by the operator. (See #5 under scratch tape Mounts . ) 3.1.3. Allocation Recoveries An allocation recovery occurs when a job step requires more physical peripherals (in this case tape drives) than are physically on-line to the system. This job step and all subsequent jobs allocating tape drives will wait until the allocation request is satisfied. Therefore, a small number of these can greatly impact throughput. These can be measured by an analysis of the hardcopy log. NOTE: SMF has a record type for allocation recovery but does not time them, nor are all responses counted. If more than 10% of the tape mounts result in an allocation recovery message, a problem exists. Some possible solutions are: 1) Each job which requires tape drives should be so identified to the operator along with the highest number of drives required concurrently. This can be accomplished either by using a position in the jobname (i.e., seventh position indicates the number of drives), by use of the JES /* SETUP card, or by use of a special job class. 2) A master console operator should be constantly reviewing the number of drives available before releasing a tape job for execution. The standard IBM commands are available for this purpose. 3) If the tape pool is shared between two or more CPUs, excellent and constant communication needs to exist between the operators. 4) Review tape requirements of all jobs for possible reduction. Small tape files may be candidates for disk. Use of flip/flop on tape files should only be done when the file is greater than three reels. Allocate files to the same drive whenever they are not required to be open at the same time (i.e., SORTIN and SORTOUT files). Use FREE&CLOSE on any tape files which are closed an appreciable amount of time before EOJ (i.e., SORTIN when SORTOUT is to disk). 3.1.4. Tape Load Failures A tape load failure occurs whenever a tape fails to become ready the first time it is mounted on a drive. The operator is then required to remount the tape. An abundance of these are aggravating to the operators and degrade job throughput. If more than 5% of the tape mounts are resulting in load failures, a problem exists. Some potential solutions are: 1) Institute a practice of tape leader inspection after each load failure rather than just retrying the load operation. If the leader is damaged, it should be clipped before the tape is reloaded. Failure to do this will only cause the problem to get worse. 2) For tapes in "easy-loader" cartridges, be aware of situations where the tape is removed from the cartridge to accommodate non-cartridge loading drives. If these exist, they are probably the cause of leader damage. Removing the tape from the cartridge also causes the leader to not be positioned near the cartridge opening, which will degrade load performance. 3) For tapes not in cartridge, the leader should be retained with a 282 sponge rubber grommet to prevent damage . 4) The temperature and humidity in the computer room and in the tape storage area should be within manufacturers specifications. Temperature in the range of 65-75 degrees and relative humidity of 35-45% is usually acceptable. Deviations from these could cause excessive static and inhibit loading. 5) The hardcopy log can be used to identify problem tape drives. The vendor should insure that those drives are at the proper maintenance level and that they are operating within specifications for airflow and suction. 6) All tape drives should be cleaned at least once every eight hours. 3.2. Disk Operation Control of the disk pool is extremely important. Almost all jobs will either use permanent disk data sets or require temporary work space. Three measures of performance are discussed: allocation recoveries, space allocation and ENQ delays. 3.2.1. Allocation Recovery In a data center with mountable disks, allocation recoveries can impact throughput. An allocation recovery occurs whenever a mountable disk request cannot be serviced on the available drives. If more than 10% of the disk mount requests (as measured on the hardcopy log) result in an allocation recovery, there is a problem. Exclude from this count allocations for temporary space. Some potential solutions are: 1) Review scheduling procedures. The master console operator should be made aware of all mountable disk requirements either via JES /* SETUP cards or a list of all jobs' disk requirements. 2) Attempt to schedule together, jobs which require the same mountable disk. This reduces both the number of mounts and the potential for allocation problems . 3) If scheduling does not relieve the problem, evaluate the cost effectiveness of adding hardware and making the problem disks permanently resident. 3.2.2. Space Allocation Insufficient temporary disk work space will significantly reduce throughput as a job waits for another job to release space. This problem is difficult to measure, but the console operators are very aware when it is occurring. For this reason, interview and the operators note the level of complaints about space allocation problems. If these are high, some potential solutions are: 1) Review space requirements of all large users of scratch space. After identifying the jobs, certify that their space requests are justified. For SORT WORK data sets which are created in one step and passed to another, console/sysout information should be available from which actual space required can be calculated. For data sets the use of SMF EXCP counts on file creation plus block size information can be used to calculate space needed. For files entirely used within a step, SMF EXCP counts can give activity but the programmer will need to be questioned on the number of times the data is accessed. 2) Always allocate in cylinders or rounded blocks to increase performance and decrease fragmentation . 3) For files which vary greatly in their volume, use the primary allocation for the expected volume with a secondary allocation to handle unusual volumes. Potential problems exist with this if the secondary is not available in the 283 middle of a long running job. Use of RLSE in the JCL will allow freeing of unneeded space after the file is closed. 4) Dedicate disk packs for use as scratch packs and insure that no permanent data sets are allocated to them. 5) Regularly execute maintenance which scratches unneeded data sets and data sets remaining after a system crash. 6) Avoid scheduling two known large work space users at the same time. 3.2.3. ENQ Delays ENQ delays occur when two jobs require the same resource and one of them cannot share the resource. The most common occurrence are two jobs accessing the same data set and one has DISP=OLD. However, ENQ delays can also occur for catalogs and VTOCS. Often an ENQ will result in a swap out of the second job requesting the resource. If RMF indicates you are experiencing more than .01 ENQ swaps per second , a problem exists. Use detail RMF enqueue reports to identify the causes of the ENQ swapping. Based on these findings, the following solutions are available: 1) The ERV (enqueue residence value) in IEAOPTxx member of PARMLIB should be set at 500 to allow time to resolve enqueues before swapping occurs. 2) DSN - if many ENQ are occurring because two jobs are requesting the same data set, attempt to either reschedule the jobs or have their execution tied together via an IEBGENER step whereby a successful execution of one job will cause automatic submission to JES of the subsequent job. Also, review the use of DISP=0LD to insure that updating of the data is being done. If this is a read-only operation DISP=SHR should be used. 3) VTOC - ENQs occurring on the VTOCs of storage packs, indicates a need for more scratch packs. Contention for allocation of work space is lessened when there are more candidates for allocation. 4) Catalog - ENQ problems involving catalogs usually only occur on large TSO systems. They are indicative of the need for a split of the problem catalog into smaller user catalogs. The most logical breakdown is best and easiest to maintain (i.e., by department, application, location). 3.3. Printer Operation The performance of the printer subsystem is vitally important in that it produces the product the end-user actually receives. Quality, clarity and promptness are important in this area. The main measure is user satisfaction. If during the user interviews, complaints about printed quality are raised, the printer operation needs to be investigated. If delivery of printouts is sited as a problem, RMF can be used to measure printer usage. If the average percent not ready is less than 10% and the busy percentage is not in a range of 30-70% , printer operating procedures should be questioned. Some potential solutions are: 3.3.1. Quality Problems 1) For impact printers, the ribbon should be changed regularly. It is better to have a set interval rather than waiting until it is noticed that the printing is getting light. 2) For nonimpact printers, the toner should be refilled on a regular interval. 3) Cleaning of the printer either by the vendor or by the operator should be done regularly. Once a shift for impact, every hour for nonimpact. 4) Preventive maintenance is extremely important and deviations from the vendor's recommended schedule should not be allowed. It may be 284 inconvenient, but in the long run it will provide higher quality output and more sustained periods of up time. 3.3.2. Output Delivery Problems 1) Insure that the configuration is conducive to good work flow. Are the paper load/retrieve areas on the printer easy to access? For example, the 3800 paper is both loaded and removed from the front. Having two 3800s face each other will allow one operator to quickly service both. 2) Adequate storage area should be available around the printers to pre-stage both stock and special forms paper so that the operator does not have to retrieve subsequent boxes from the stock room. 3) Separator sheets between jobs should be employed. They allow job differentiation and eliminate the need to remove the paper after each job completes . 4) Some type of destination address should be attached before the printout leaves the printer area. This can either be computer-generated or be added by the operator by using a distribution list. 5) If the printers are averaging greater than 70% busy, look into acquiring additional hardware. Higher usage than this is difficult over extended periods due to maintenance, downtime and job setup. 3.4. Scheduling As has been previously mentioned, the use of an accurate job schedule is necessary for all data centers. If during the interviews with the user community, service complaints arise, scheduling should be investigated. SMF accounting data can be used to identify the flow of work through the CPU. This data, along with a list of commitments to the user can be used to determine the specific service problem areas. Once identified, the following action should be taken: 1) Perform an analysis of the abends experienced by the problem application(s) using SMF data. Depending on the abend code, different actions should be taken. USER and OCX - usually a programming problem, but could be caused by invalid input data/parameters. Review data verification procedures and redesign programs to delete invalid data and print appropriate message rather than abend. - 222 - operator cancels should be eliminated. Investigate the reason for any cancellations, inadequate resources, data sets unavailable or input media not ready and correct the situation. - X37 - usually a space problem. Assure that adequate work space is available at all times. This often occurs during runs of unusually heavy volumes. These problems can often be avoided by use of secondary space allocations. X13 - label problem. This usually indicates a problem in matching JCL information with the physical data set attributes. Use of the catalog and a tape control system, and eliminating the coding of volume and DCB parameters on input files should correct this. 001 - usually caused by read/write errors on hardware. Review PM procedures and use LOGREC data to pinpoint problem devices . 285 JCL errors - review methods of updating procedures. Use of TSO with initials or comments by each modification is best. Use of reader procs where the jobs are submitted via TSO eliminates card decks and many override problems. 2) Review critical path of application. Insure that operators are aware of which jobs are in a direct line with the required output and which are sideline jobs which can be processed as time becomes available. A special SRM performance group can be used to guarantee an application's service levels. control is two-fold; first user complaints will be heard when they are dissatisfied with their reports, second SMF can be used to report on job reruns. If more than 4% of the jobs are rerun, a problem exists. Some potential solutions are: 1) Install a position of quality control which monitors all traffic, both in and out of the computer room. This would include the following: Review all user report requests for accuracy and completeness. - Insure all JCL overrides are error free. 3) Review start-end time of each job in the application's cycle using SMF data. For each job, the following questions should be asked: Notify master console as input becomes available. Review all SYSOUTS for good EOJ. Is its current start time the earliest it can begin? Is the data actually ready sooner? Is this dependent job being started as soon as possible? For example, if job B requires a tape out of step 2 of job A, it should not wait for all the steps of job A to complete before beginning execution. Insert an IEBGENER after step 2 in job A, which automatically submits job B. 4) Review relative priority of applications, i.e., who gets delayed when a "hot" job is put on the system. These priorities should be documented and known by everyone. 3.5. Quality Control Control of a job's input and output is extremely important for a successful operation. The measure of quality File all SYSOUTS for later reference. Extract run control totals as needed. Forward abends/JCL errors to person responsible. Review all output for compliance with user requests and for quality. - Log all activity in and out. - Log all problems, including time, person referred to, solution and time lost. 2) Install a change control procedure. This should insure that both system program and procedure updates are reviewed for completeness and are applied per instructions. 4. Conclusion The procedure outlined above is not a one-time experience which is only 286 entered into when the wolves are at the door. For a data center to operate effectively, a system of continuous performance monitoring must be followed. This insures that few surprises arise and that adjustments are made in a calm, controlled atmosphere rather than a "red alert" panic situation. It is also helpful if the measures are done on a work shift basis with the results published to establish competition between the shifts. 5. Guidelines Listed below are the guidelines discussed in each area: Measurement Acceptable Range 1. Scratch tape mount time 2. Input tape mount time 3. Tape allocation recovery 4. Tape load failure 5. Disk allocation recovery 6. Disk space allocation 7. ENQ delays 8. Printers 9. Scheduling 10. Quality control 0-60 seconds 0-3 minutes 0-10% mount pending (RMF) 0-10% of tape mounts 0-5% of tape mounts 0-10% of disk mounts operator feedback 0-.01 ENQ swaps/seconds (RMF) user feedback 0-10% NOT READY (RMF) 30-70% BUSY (RMF) user feedback user feedback user feedback 0-4% jobs rerun 287 Evaluating Total Computer Performance for Top Management Richard L. Fidler Office of Procurement and ADP Management Office of the Secretary Department of Commerce Washington, DC 20230 Computer performance evaluations, installation reviews and EDP audits have all focused on an individual data processing system. It is rare that any evaluation is made of all computer systems and related functions on an organization-wide basis. Consequently, there is an abundance of technical and management information for use at the installation level but a dearth of management information for use at the organization's top level. At the Department of Commerce we are initiating a new program of installation reviews, to provide useful information not only to installation management but more importantly to the Assistant Secretary for Administration, the top Commmerce official responsible for the management of computer resources. A three-phase approach will be used. First, all data processing installations will complete an annual questionnaire about their operations. Second, several installations will be selected each year for on-site reviews. Other, more technical evaluations may be made as necessary. Finally, using the summary information provided by the bureaus and based on the installations' questionnaire responses, an annual report will be produced for the Assistant Secretary. Key wor d s : evaluation. ADP management, computer performance 1 Introduction We know how to evaluate one data processing installation, but how do we evaluate all installations, and on an annual basis, for top management? This is the question we wrestled with at the Department of Commerce. The Office of Procurement and ADP Management (OP&ADPM) is charged with evaluating all installations and making reports and recommendations to the Assistant Administration!' the ADP official. Secretary Department 's What We Did in the Past The Department evaluation program first heg3n in 1976. annual review schedule was set by for top an The the 289 Office of Audits because ours was an integral part of their overall review of a particular office. Typically one person from our office spent a week on-site evaluating the installation. Afterwards he wrote a report and sent it to the Assistant Secretary. On occasion the services of the Federal Computer Performance Evaluation and Simulation Center (FEDSIM) were utilized for a thorough technical evaluation while the Department of Defense Computer Institute (DODCI) was called in for ADP security reviews. With a complete turnover of OP&ADPM's top management in 1979, the new managers felt the program needed revision, for several reasons: The program was limited by resources to only two or three installations annually. The focus was on the optimization of computer utilization, thereby excluding functions such as programming and areas such as adherence to Federal Information Processing Standards (FIPS). It did not take into account the ADP security requirements mandated by the Office of Management and Budget (OMB) in its Transmittal Memorandum Number 1 to Circular A-7 1 • Therefore, management decided a new approach was needed and laid down the following conditions: 1. The evaluation should be on an annual basis. 2. It should include all data processing installations within the Department. 3. Travel funds for on-site reviews will continue to be limited. 6. The various security requirements, particularly those pertaining to reviews, must be included . 3. Putting Together the New Program We felt the easiest, cheapest and fastest way to implement such a program was to do what we did with our ADP security manual: copy an existing one from another agency. Therefore, we contacted about a dozen other agencies including the General Accounting Office (GAO) and asked if they either had such a program or knew of any other agency which did. To our disappointment and surprise we found that no agency was conducting a program similar to the one we proposed . We found most agencies had no program at all. Those that had programs did not suit our needs. For instance, one agency had a rather extensive computer performance evaluation program using hardware and software monitors. Another had a program conducted by its Office of Audits. Another had a comprehensive program but the results did not get to and were not intended for the agency's top management. Another looked at a specific topic such as ADP security but neglected what we thought were equally vital areas. Others did it on a sporadic basis, very similar to what we had done in the past. Another agency performed a review immediately prior to the installation of any new computer system. Some had a program which evaluated computer center operations but neglected such areas as top ADP management, programming and analysis and those users who did not have their own computer but instead had terminals connected to someone else's computer. Even consolidating all these programs would still not have yielded a program aimed directly at top management. All functions of the installation must be reviewed, not just computer center operations. All aspects of proper management security, budget, programming, user satisfaction must be evaluated, not just computer performance . Consequently we had to come up with a totally new program. For sources we used the various questionnaires we obtained from those agencies having some kind of review program, the audit program used by our Office of Audits, several volumes of the Faim Technical Library, magazine articles, GAO pamphlets. 290 conversations with other professionals and our own thoughts as to what information would be useful. 4. Objectives The objectives of the program are to ensure that all applicable regulations, particularly in the areas of security and standards, are being adhered to, to determine how effectively our office is doing our job and what we still need to do, to disseminate knowledge of problems and their successful solutions throughout the Department, to provide our office with very basic information about all our installations and to identify problems that cannot be resolved by installation management or the various bureaus. We want to evaluate every data processing installation in the Department. We generally are following the guidelines of OMB Circular A-7 1 , Transmittal Memorandum Number 1, as to what constitutes a data processing installation. Since their guidelines encompass nearly everything and everyone involved with data processing, our evaluation program takes into consideration the wide variations we have in Commerce. 5. Proposed Evaluation Program By consulting all our sources we came up with a program that has three distinct parts: an annual Commerce-wide questionnaire, on-site reviews of selected installations and summarized reports for use by the Assistant Secretary. The first part is a questionnaire consisting of approximately 250 yes/no questions and 25 questions requiring narrative answers. The 25 narrative questions cover such items as history of the installation, unusual conditions (proximity to volcano, area prone to flood or earthquakes, etc.) and the mission of the installation. To be completed by each data processing installation the questionnaire has sections for top ADP management, computer center management, remote programmers (not under direct control of this installation's management), other miscellaneous kinds of installations and technical and non-technical user satisfaction. Following is a sampling of questions. For ADP management: What is the stated mission of this ADP installation? Discuss the reasons for any overtime incurred over the past year. For computer center management: What programs and procedures do you have to continuously evaluate the performance of the center's hardware, software and services? Are console operator printouts reviewed daily to detect equipment and operator problems? Is the equipment being used for the purpose given in the justification statement when it was acquired? Has the fire department assigned to service this installation been notified that this is a data processing installation? Are tapes in the tape library stored vertically? Have turnaround standards been established for various classes of jobs? Are supplies of critical forms stored off-site as backup? Are procedures for using the telecommunications system documented for the users? For systems management : and programming Is the testing of new programs done by someone not involved in the design or programming of the new program? Is there more than one copy of program documentation? For terminals connected to this computer system: Is there a portable fire extinguisher 291 within 50 feet of all data processing equipment? For users: Describe your workload (cyclical or erratic, peak times, business or scientific, etc.). Do you maintain any manual system as a supplement to your automated system? Have you ever received a copy of the computer center's user's manual? The answers to all questions will first go to top management in the particular bureau. These people will summarize the information given by each installation for each of the yes/no responses and retain the completed questionnaires. These summaries will be responses to our general questions asked of each bureau's top ADP management. A general question may be "What problems have been found in implementing measures to detect fire and water hazards in the computer center and to protect the equipment from damage by these hazards?" The specific questions would ask if (1) there is plastic sheeting near the equipment, (2) fire extinguishers are present, tested regularly and easily accessible and (3) all operations personnel are instructed as to what to do in an emergency. Bureau top management consequently will have to become actively involved in this evaluation program, which they should be, of course. Bureau ADP management will also tell us how they plan to attack any problems they can solve at their level and will make recommendations on problems which can be solved only at our level. Consequently, at headquarters level we will receive only the responses to the general questions as summarized by each bureau and the narrative questions pertaining to each individual installation. Because these levels of departmental ADP management installation, bureau and headquarters are involved, each level becomes aware of the problems created by and which can be resolved at that level. This involvement by all levels of ADP management follows the philosophy that management action should be taken at the lowest possible level. The second part will be on-site reviews, similar to what we have done in the past. Because we will have a considerable amount of information including that installation's latest responses to our questionnaire before traveling to the site, we should be able to conduct the review in a time period shorter than under the previous program. It is possible that reviews of smaller installations will be conducted over the telephone rather than by an actual visit. The selection of sites will be somewhat subjective based on importance, size, impending acquisitions and other such factors. A report for each site will be sent to the top ADP management in the bureau involved. The purpose of the on-site review is to ensure that both our questions and the installation's responses are valid. The third part will be the production of reports for use by the Assistant Secretary. We do not intend to issue "report cards" on each installation. Since the Assistant Secretary is interested in the overall performance of all the Department's installations rather than in any single installation, we intend to produce summary reports. We will particularly highlight those problems which cannot be resolved at lower levels or which need additional emphasis at these higher levels. For instance, if we find many bureaus saying that the procurement of ADP resources is too slow, we will recommend actions to be taken at Departmental level to speed up the process . 6. Implementation The implementation of the program will take a year or more. We have produced a draft questionnaire; the next step is to get active user involvement in putting it into usable form, i.e., adding or deleting questions, changing the method of implementation, etc. We will then publish a version for use with the testing phase in which 8 or 10 installations of varoius types will take part. The program will be further modified based on the results 292 of the test. At the same time the non-test installations will get a copy of the questionnaire (a) to prepare for the time they too will have to complete the it and (b) to allow for their further input. After evaluating the results of the tests, the program will be modified again and the questionnaire sent to all data processing installations for them to complete in earnest. We hope to provide the Department's top ADP official with the kind of information needed at that level to effectively manage the Department's ADP resources. We also hope to enable all our installations to increase their effectiveness and to help them comply with Federal and Departmental requirements. We also expect to find out where this office can most effectively apply its limited resources. Since we do not have the benefit of others' experiences to draw on, we are relying heavily on involvement by users for help and guidance. With their cooperation we believe this program can be of benefit to them as well as to top management. 293 The Air Force Base Level Computer Performance Management Progrim John K. Graham, Jr., TSgt, USAF Office of ADPS Management Air Force Data Systems Design Center Gunter AFS, Alabama 36114 The Air Force base level Computer Performance Management (CPM) program is 10 years old this year. More than 100 operating locations around the world are within the scope of the program. This paper describes the history and current status of the program. It also speculates about the future course of Computer Performance Management within the base level environment. Key words: Air Force Data Systems Design Center; CPM Project Officer; Computer Performance Management Technical Center. 1. History In 1969 the Air Force began converting its base level computer support from the Burroughs B263 to the Burroughs B3500 com- puter. The B263 was an early second genera- tion piece of equipment with four thousand characters of main memory and no mass storage devices. The only peripherals were reader, punch, and printer. The B3500 is a third generation, multi-programming computer with disk and magnetic tape. The B3500 was supposed to provide the Air Force with trouble-free support for many years. It didn't. With the new hardware came new applications. Either the new applications or new hardware were incorrectly sized. Saturation was immediate at many locations. Compounding this problem was the lack of operations experience with multi-programma- ble hardware. Mix management seemed to be either run one job at a time or load the system as heavily as possible and let it thrash. Recognizing these problems, the Air Force Data Systems Design Center (see figure 1) began developing a CPM program. The scope of the program can be described in two ways. First, within the Air Force, there are 108 installations with about 120 Burroughs medium system processors. These computers support 14 Major Air Commands and Separate Operating Agencies (MAJCOM/SOA) . Second, within each installation, instead of looking just at the hardware, the program also focuses on such things as software, personnel, and facilities, in short, the entire installation. Anything that can degrade a system's performance, whether it be a disk channel bottleneck, poor mix management procedures, or an inefficient air conditioner, falls under the scrutiny of the program. In 1970 the Design Center purchased a D-7700 hardware monitor and trained analysts in its use. For the next year, the analysts evaluated different hardware monitors. The Dynaprobe D-7900 from COMTEN was selected and several were purchased. Our early CPE efforts were centered on the hardware monitors and on learning what the B3500 was capable of doing. Two travelling teams responded to crisis situations by conducting on-site studies. They also produced a series of Operations Research Reports that have since proved to be invaluable reference texts on B3500 capabilities. Although the support given individual installations was excellent, coverage was inadequate. Two teams could not respond to the needs of the entire Air Force. To solve this problem, the Design Center began developing software monitors. After several years of refinement, a monitor was developed that was the best 295 HQ AIR FORCE 1 1 1 1 1 MAJCOM/SOA 1 f 1 1 1 1 1 AIR FORCE DATA SYSTEMS DESIGN CENTER 1 1 BASE LEVEL DATA PROCESSING INSTALLATION 1 J FIGURE 1. The Design Center reports to the Air Force Communications Command (a major air command). It communicates with and supports HQ Air Force, the MAJCOM/SOAs and the base level data processing installations. 296 trade-off between overhead and the compre- hensiveness of the data collected. A great benefit was realized with the development of the software monitor. With just a little training, almost anyone with operations experience could perform their own studies. After development of the software moni- tor, the direction of the CPM program began to change. While still performing studies, the teams began training personnel at all the sites they visited in the use of the monitor. A major step in the program's development was the identification, assign- ment, and training of "CPM Project Officers" from each Air Force MAJCOM/SOA (HQ SAC, HQ TAC, etc.). These project officers act as the managers of the CPM program for their headquarters. With the addition of project officers, the CPM program was beginning to take shape. There was, however, one last major obstacle to an effective program. The software tools and documentation for those tools were "bootlegged". That is, they were dropped off at the locations visited and were not a part of the standard release provided to the operating locations by the Design Center's Quality Control Directorate. Not only did we have a problem getting the monitor to the field but it was impossible to keep updated once it was sent out. We just couldn't remember who had which version. Analysis and Modeling System (WAMS) is a data base compiled from five years of these use statistics. Sizing techniques are used to predict the impacts of new workloads. Through the use of sizing, it is possible to quantify new requirements and have equipment in place prior to implementation to meet those requirements. Simulation and benchmarking are used to evaluate such things as new processors, new operating systems, and mix management proce- dures. A benchmark that closely represents a typical workload is being developed now. It will be used to evaluate several software and hardware options. Models have been developed that are used in building the five-year master plan. The master plan predicts equipment requirements for the next five years. This plan is updated yearly and is used for budget requests to Congress. 3. CPM Applications. Good tools are important , but in order to be worthwhile, they have to be applied in areas where they get results. The best results of the base level CPM program have been realized from: In 1978 the Design Center released the software monitor, its reduction software, and the associated documentation as a por- tion of the standard base level software package. The standard release of the moni- tor removed the last major obstacle to our providing a tool for continuous use at the base level installations. 2. Tools. The software monitor is an Air Force developed version of the Burroughs operating system. It is an event driven monitor that captures processor, disk channel, and disk device statistics. The data reduction pro- gram allows a great deal of flexibility in computations, date-time selection, and out- put formatting. In addition to the software monitor, several other tools and techniques are available. One is a "snapshot" monitor that samples the active job mix. A series of tools provide reduction of the accounting log. The data collected by this method is used to compile a quarterly summary of use statistics, the Configuration Analysis and Projection System (CAPS). The Workload - Disk balancing and tuning. - Bottleneck analysis. - Mix management . - The hardware upgrade process. Special projects. The application that has brought us the best return for effort expended has been disk balancing and tuning. Disk configura- tions at base level installations range from two channels with two devices to seven channels with nine devices. The devices are a mixture of fixed head and moving head disk. Disk balance is simply placing files on the devices so that each path to disk (channel/device) gets equal activity. This increases the probability of simultaneous I/O activity and reduces the time that the processor is idle waiting for disk I/Os to complete. Disk tuning applies only to moving head disk. It is the placing of files on each device so that the highest activity files are contiguous. This reduces read/write head movement to a mini- mum. When the disk system is loaded with no regard to file placement, device times range from 40 to 50 ms. On properly tuned disk, device times are reduced to a range of 29 to 36 ms. This results in greatly 297 reduced I/O queue times. The combined results of effective disk tuning and balance can be impressive. User response time can be decreased by 20 to 30 percent with a corresponding increase in throughput. Each Air Force base level installation is strongly encouraged to initiate a regular program of disk tuning and balance and take periodic measurements to verify the results. When any component of a system restricts the performance of that system, a bottleneck has developed. Without detailed performance data, identification of a bottleneck is reduced almost to guesswork. A disk con- tention problem is sometimes seen by opera- tions personnel as one of not having enough memory to get the job done. Had memory increases been made in these cases, it would have allowed more jobs in the mix to compete with the already saturated disk access resources. The end result would have been either no improvement or a further degrada- tion of throughput. Accurate evaluation of a performance problem requires detailed information. As an installation develops some expertise in performance analysis, they are able to identify system bottlenecks before they degrade performance. A computer operator can generate up to 30 percent processor overhead by mismanaging the job mix. He does this by overloading the job mix and changing priorities, causing program swapouts. One major emphasis of the CPM program has been to learn the best ways of managing the job mix, then training as many operators at as many locations as possible. The computer upgrade process in the Air Force is rather involved. The basic philo- sophy is that each installation should have enough equipment to meet mission require- ments without wasting resource dollars. When requesting an equipment upgrade, the installation is required to complete a CPM study that shows: The requested equipment is needed. - The requested equipment will correct the problem. The current equipment is well managed. This upgrade process has generated a much improved ratio of dollars spent to information processed. Upgrades are now more orderly, more predictable, and more effective. "Special projects" covers a wide range of activities. They include evaluating new operating systems, new pieces of hardware, and evaluating the impact of new functional software. Benchmarking or simulation are usually used for these evaluations. 4. Todays Status. A wide range of applications are served by the base level CPM program. For these applications to be well served, an effective organizational structure is necessary. The Air Force CPM program has been formalized under direction from Headquarters Air Force (see figure 2) . The program is structured with a single focal point for all performance management activities within the Air Force. The focal point is the communi- cation link between headquarters and Comput- er Performance Technical Centers (CPTCs) . These are the managers of the CPM program; each is responsible for one hardware type. Each Major Air Command/Separate Operating Agency has assigned a CPM Project Officer. He is responsible for providing the CPM support for his command. The CPTCs provide training and assistance to the project officers as required. The directive also establishes a requirement for upward report- ing and a historical data base of perform- ance data. This need is partially filled by the CAPS and WAMS described earlier. The CPTC concept is working well. The structure is in place and the base level CPM Project Officers are all trained. Informa- tion is being shared and passed through the use of newsletters, working groups, seminars, and assistance visits. The AFDSDC has developed a "CPM Guide" for base level installations. This guide is an introduction to computer performance management for operations personnel. It is also used as a text for training seminars. The guide covers: - Developing a CPM study plan. - Disk balancing and tuning. - Disk file management. - Use of the software monitor. - Job mix management . - The upgrade process. Although designed as a stand-alone tool for operations personnel, the guide is most effective when used with training seminars. Training seminars are given at irregular intervals by either the CPM project officers or CPTC personnel. The sessions last about four days. They cover the material in the CPM Guide and there are discussions on the 298 tn ca w o M O H U W •-3 s 0) M 4J o 4-1 H •u g o H a) u s o 14-1 p* 8 I o o s g 299 operating system and the Air Force developed data communications handler. Training will continue to be a major objective in the future. 5. The Future. Major emphasis of the CPM program in the near future will be training. As the cur- rent base level hardware becomes older and more heavily used, good management practices become increasingly important. As more people are trained at the operational sites, the Design Center, as CPTC, will be less involved in the day-to-day operation of the program. The CPTC's role will be one of advisor. The long-term future is exciting. The Phase IV capital replacement program for base level hardware is becoming closer to reality. The Phase IV program is the largest computer acquisition ever undertaken by the United States government. This will give the base level performance analysts a whole hardware series to measure, evaluate, and learn. Planning for the CPM program to sup- port the new hardware has begun. Lessons learned in the past should prove valuable in developing this program. In the ten years the program has been in existence, there has been one lesson that has constantly repeated itself. To be effective, a computer performance management program demands an ongoing commitment from people at all levels within the organization. 6. Glossary. Air Force Data Systems Design Center (AFDSDC) - Analyzes, designs, develops, tests, implements, and maintains standard automated data systems. Acts as manager for hardware common to more than one MAJCOM/SOA. Configuration Analysis and Projection System (CAPS) - A quarterly summary of machine performance statistics. CPMe - Computer Performance Measurement. The use of specialized tools to measure system performance. CPE - Computer Performance Evaluation. The analysis of the data collected in comput- er performance measurement. CPM - Computer Performance Management. The application of evaluation results to the operation and management of Air Force hard- ware. CPM Project Officer - An individual designated by a MAJCOM/SOA to act as focal point for computer performance management. Computer Performance Technical Center (CPTC) - The focal point of the CPM program for a hardware type. MAJCOM/SOA (Major Air Command/Separate Operating Agency) - The major breakout in the Air Force organizational structure (HQ SAC , HQ MAC , HQ TAC , etc . ) . Phase IV - The capital replacement program for base level hardware. Standard Release - The monthly release of standard software (object code) and docu- mentation from the Design Center to instal- lations worldwide. Standard Software - Software common to more than one Air Force command. 300 Prototyping/Benchmarking 301 RTE's - Past Is Prologue Mitchell G. Spiegel International Computing Company 4330 East-West Highway Bethesda, MD 20014 This paper surveys the evolution of Remote Terminal Emulators (RTEs) . Major developments in RTE technology are separated into three "generations" of products. Each generation's unique applications and features are highlighted. Recent developments are noted and a prediction of future use for RTEs is provided. 1. Introduction Remote Terminal Emulators (RTEs) first appeared as sophisticated system drivers in manufacturing and university[l]l laboratories more than 10 years ago. Aside from the occasional report of a study done using an RTE, the device has remained a laboratory curiosity. RTEs have not been widely used because of the high cost of their use, lack of portability, requirement for sizeable resources for stress testing, and unavailability of trained personnel. RTEs have acquired a reputation for being useful primarily in the context of stress testing systems. A literature search performed as part of the research for this paper indicates that RTEs enjoy robust application to a wide variety of problems. The RTE is entering its "third generation" of design. Whereas early RTE systems had many undesirable qualities, and "second generation" systems were not generally available, the new generation of RTEs is expected to alleviate cost, portability, resource, skill, and image problems and open many new areas of potential use. The ^Figures in brackets indicate the literature references at the end of this paper. RTE likely will become a keystone for system quality assurance efforts. (In this paper RTEs are not differentiated by the manner in which test workloads are imposed on the system under test (SUT) — many approaches have been used to drive systems with remote terminal generated work including: central processor (CP) devices that reside in the host, also referred to as simplex drivers; front-end processor (FEP) drivers that reside in the communication control processor; message processor (MP) drivers that reside in concentrators or network switches; remote set simulator (RSS) drivers that reside in a microprocessor attached to the terminal-line interface or in terminal or cluster controllers; data pipe (DP) drivers that reside in remote job entry systems; and remote terminal emulator (RTE) drivers that reside in a separate external computer system and are capable of replicating the entire network as it is connected to the SUT.) 2. The "First Generation" "First-generation" RTEs were developed mainly as debugging systems that used simplex drivers, individual terminal simulators, and/or data pipes. A few RTEs were built in the laboratories of several 303 computer manufacturers to test systems that supported large numbers of mixed terminal types. Reasons cited by Honeywell [2] for developing its Honeywell Communications Environment Emulator (HCEE) were to aid in the checkout and debugging of communication software; permit experimentation with software and terminal configurations (although the terminals did not actually exist) ; and evaluate both the operational limits and stress-load responses of on-line software. In addition, Honeywell felt that conventional test techniques (such as checking each transaction path, generating huge quantities of test data and stress testing with manned terminals) were time- consuming and impractical. More importantly, conventional tests violated a prime rule of scientific experimentation, i.e., reproducibility of test conditions. IBM also mentioned the importance of the reproducibility of test conditions in its chronology of experience gained from its first virtual operating system [3] . The manufacturer's view of on-line system viability was and still is a complex relationship between cost, performance, and the value of system functions to the using organization. Fixed values were established for system ownership cost (measured in dollars expended over the life cycle) and function value (measured in the percentage of productivity improvement per user) . The viability of the system was determined by the number of users that could be supported at acceptable levels of system performance (e.g., such service objectives as response time by application and availability) . Typical on-line service objectives were specified and then compared with measures obtained by executing a series of RTE runs under controlled, reproducible conditions. Results indicated the number of users that could be supported by a given configuration. An interesting finding of IBM's virtual storage operating system development group was that, prior to the use of the measurement driver, more than two- thirds of all software errors in system releases were discovered by customers. Subsequent to the use of the measurement driver, the ratio reversed, to a point where three-fourths of all operating system software problems were detected before release by the Corporation. Design characteristics of Honeywell's first generation RTE were as follows: The HCEE simulated up to 63 lines (8-terminal) , with both asychronous and synchronous (2400 bps) terminal types. Maximum output of up to ten 100-character I/O pairs per second could be generated. The initial software was implemented on a Honeywell 1200 with 64K bytes. Resident software required about 30K bytes. By contrast, the IBM Measurement Driver handled up to 256 active asynchronous terminals on eight lines. Output was approximately 12 I/O pairs per second. The initial software was implemented on an IBM 360/40 with 128K bytes. User exposure to drivers was limited to such tools as those supplied to users of IBM's Passenger Airline Reservation System (PARS) . A unique aspect of the PARS software support accompanying the test tools was a system test methodology to assure quality at any point of the system development/operation process [4] . This methodology was an important feature because of the large number (hundreds to thousands) of terminals on-line in a typical PARS system. The RTE-like components and their relationship to the system test methodology were (1) a system test compiler (STC) used in all phases of testing, beginning with the unit test to create test records, programs, and messages; (2) a Remote Set Simulator (RSS) used to measure the behavior in a batched mode of individual or multiple terminals during application package test; and (3) a system test vehicle (STV) , similar to the RSS, activated during stress tests from remote terminals in conjunction with a prerecorded scenario tape loaded on the System Under Test (SUT) . All three tools were also employed during post-cutover testing to detect problems and to re-establish a baseline after changes had occurred. RTEs were made available in IBM's laboratory to perform stress testing for sizing various PARS configurations and to demonstrate the ability of a proposed configuration to meet its customer's service objectives. The PARS stress test demonstration of performance was extended to other software product lines in the early 1970 's by IBM for selected large on-line customers. The hardware/software configuration that successfully passed the customer ' s performance objectives served as a pilot/prototype vehicle for the general design and feasibility stages of the customer ' s system development cycle [ 5] . The system test compiler (STC) function was expanded from the PARS implementation to allow the user to specify in detail the characteristics of an application's 304 interface with the operating system and data base, and to specify the application's consumption of processor resources. The pilot/prototype approach was limited, however, by the lack of availability of RTEs in the user environment. Users had to form an evaluation team with the manufacturer's technical staff to develop scenarios for workload, applications, and data bases. Tests of various workload levels were performed at the manufacturer's benchmark facility. Limited test time forced many compromises on the experiment's design. 3. The "Second Generation" overall system architecture in cases where the RTE was integral to the operation and maintenance of the on-line system. An example of such an RTE was the International Computing Company's Network Exerciser, delivered in 1973 to a commercial customer as part of an on-line transaction system for their clients. The RTE was part of a distributed minicomputer network, serving the vital function of determining real-time problems in the network software. Client response time requirements of less than one second, in addition to throughput rates of over 100 transactions per second, were coimon. The development of the minicomputer marked the beginning of the "second generation" of Remote Terminal Emulators. Vhereas previous RTE implementations required manufacturer benchmark facilities, minicomputer RTEs were small enough to be installed at a customer's site. In 1972 and early 1973, two design-verification models of an RTE were developed by the MITRE Corporation under the sponsorship of the Air Force [6] . The first such model, located at MITRE/Bedford, MA, was a fixed-site system, used primarily for program and scenario development, and interfaced with the SUT through the switched telephone network. The second design-verification model was an on-site system, used primarily for detailed emulator test and evaluation. The on-site system was conceived as a prototype RTE to be used in future ADP system procurements. The on-site system interfaced through cables directly with the SUT's communications line adapters. The primary hardware component of both MITRE systems was a Data General Nova 800 minicomputer configured with a fixed-head disk, magnetic tape, TTY, and SUT interface unit. The system could emulate up to 16 low-speed asynchronous terminals. The on-site system had the additional capability of emulating 8 high-speed (up to 2400 bps) synchronous lines. Software included a macro pre-processor, scenario assembler, real-time executive, scenario interpreter, and data reduction program. In practice, the on-site system proved to be more difficult to relocate and return to an operating condition than had been expected, primarily because of the very complex nature of the RTE-SUT interface [7] . Custom-built RTEs were implemented for minicomputers and delivered as part of an An early use of a customized RTE for evaluating time-sharing services was made at the Bell Laboratories. During the mid- seventies, Bell Laboratories spent between $50 million and $75 million on time-sharing services. To obtain satisfactory service from its time-sharing vendors, Bell constructed an emulator for a DEC PDP-11/45 system and operated it similar to METRE'S fixed-site RTE. Each time-sharing service vendor was required to pass a live- test demonstration by repeating a series of tests controlled by the RTE over a two-week trial period. The results of the tests were documented for both performance and charging algorithm baselines [8] . Services were retested at unannounced intervals (about every six to twelve weeks) to verify that the service was within contract terms. The National Library of Medicine (NLM) used a similar approach to evaluate its MEDLINE services. The MEDLINE Simulated User System (MSUS) resided in the front-end computer (XDS940) and was limited by the number of terminals that the front-end could emulate without degrading system performance. Stress tests were performed at NLM by using multiple front-ends to drive the SUT. The NLM RTE functional and stress test capabilities were one of about a dozen RTE approaches unearthed during a survey of manufacturers and Government agencies [9,10]. In the mid-1970s, computer manufacturers upgraded their RTE capabilities at their benchmark test centers by using more efficient and flexible software. The upgrade enabled manufacturers to drive SUT's with very large networks of terminals and terminal combinations of greater variety. A rule of thumb for sizing RTE system requirements among manufacturers estimated that an RTE 305 one size smaller than the SUT would be able to "saturate the SUT" without falling behind the scenario rate established for the test. Because of the availability of RTEs at manufacturers' benchmark facilities, and the large Government investment in on-line systems and time-sharing services [11] , a workshop on RTEs was held jointly by the National Bureau of Standards (NBS) and the General Services Administration (GSA) in 1976. The early results of a procurement projection study were presented at the workshop [12]. The data presented indicated that the volume of procurements for on-line systems and the number of remote terminal devices would generate a need for a system- independent workload generator that could be used to evaluate vendor proposals. The related technical issues were subsequently made available in a later report [13] . In addition, a major government case study involving the use of four vendor RTEs was described [14] . GSA presented a plan for incorporating the use of RTEs in the Federal ADP procurement process [15] . The plan was closely adhered to by GSA, culminating in the successive production of three documents: (a) RTE Specifications, (b) a description of the use of RTEs in Federal procurements, and (c) a handbook [16] combining the two previous documents. This handbook presented in a compact format rules for temporary Federal Procurement Regulation 49 on the use of benchmarking and remote terminal emulators for performance validation in the procurement of ADP systems and services. About the time the Government began its examination of RTEs, the manufacturers publicized for customer evaluation the use of such devices in complex system environments [17,18], RTEs in manufacturer's benchmark facilities provided customers with measures of the sensitivity of a given hardware or software configuration to changes in the workload [19,20] . Customers were surveyed to determine the environment for specific industry uses of certain software packages (i.e., operating system, on-line application control system, data base management system, and network control system) . The results of the survey were used to construct synthetic representations of the customer's aggregate workload. Performance measures were taken for various increments of workload to determine system sensitivity to mixtures of workloads on given hardware and software configurations. The emergence of computer networks provided a further opportunity for second-generation RTE applications. Early networks were designed, tested, and operated with RTEs attached. Most early developments were related to digital networks. An early example was the evaluation of the packet-switched subnetwork CIGALE, the basis for the CYCIADES network which connects research centers and universities in France [21] . Studies addressed throughput rates for different line control procedures and maximum throughput and transit delays for packets transmitted under various loading conditions. Analytics, Inc., developed a driver that was employed both as a network tester and as an RTE to evaluate a SUT. The driver, installed at one of the locations of the Worldwide Military Command and Control System [22,23] , enabled the staff to reduce the time required to certify new releases of Honeywell's GCOS operating system from 200-400 man hours to about 20 man hours. The RTE was also able to drive the network with SUT output responses and to receive inputs from other network nodes. The SUT emulation capability permitted functional tests of multi-level protocols between hosts, as well as more conventional throughput and delay exercises. Public packet-switching networks generated further functional demands for RTE-like network exercisers. In France, the TRANSPAC experimental network required a tool to provide protocol conversion, frame error detection, call establishment and clearing, and a terminal packet data generator/absorber [25] . Users were given frame trace capabilities, a list of errors caused by their software; statistics on about line errors; frame suppression capabilities to simulate errors; the ability to activate frame generation; and the flexibility to establish calls, vary messages sent, and detect error conditions. The products that resulted: REX25; ESQPE; and SIMAD; have helped users to develop interfaces with TRANSPAC. REX25 was implemented on a MITRA 125, a powerful minicomputer. ESOPE and SIMAD were implemented in a Datapoint 5500 RJE station. A follow-up emulator was implemented for the Canadian public packet-switched network, DATAPAC. Other network testing devices also received attention during the same time period." Both IBM and Bell Telephone Laboratories developed digital devices for 306 testing analog networks in order to replace such complex analog test equipment as oscilloscopes. When an analog signal arrives at a modem, it is decoded into digital data, which is subsequently presented to a communication subsystem (e.g., a front-end communication processor, transmission control unit, or host computer) . Measuring is done by removing the line from service or attaching analog measuring equipment to the line, modem or associated interface. The operations staff must know when to measure and often has difficulty identifying intermittent errors. Special skills are required to read analog measuring equipment. Another approach [25] uses the extra information about the line's performance (the parameters of the analog signal as received) . The analog data are passed to a line quality monitoring (LQM) program in a separate computer, which analyzes them and creates appropriate summary information. The network location of the line quality monitor computer is the same interface point normally chosen for installation of an RTE (in front of the communications front-end processor) . The LQM approach offers the following advantages: (1) the line stays in service during the monitoring; (2) the data collected can be a permanent record of line performance; (3) the monitoring is continuous, allowing for identification of transient conditions; and (4) such an LQM process can be linked to a network operating system that performs dynamic routing assignments and network maintenance control. One installation's experience with tne LQM approach was a prime shift circuit availability increase from 82 percent to 99 percent within one month after LQM began. Over the first three months of operation, 17 of 21 reported line problems were confirmed as correctly diagnosed. The monitoring also proved useful in diagnosing a number of problems in telecommunication software. 4. The Third Generation The general problem of network managment has served as a unifying force to relate the RTE, LQM and other measurement and quality assurance functions. A unique approach evolved from the PARS System Test Vehicle [26] . Underlying the approach is the interrelationship between network operations and network development in order to maximize user satisfaction (i.e., to provide successful delivery of services and meet user objectives) . The RTE and related network and SUT measurement applications are integral parts of network "process" control. Such project activities as network design, hardware and software configuration, and capacity planning interface with the data obtained from on-line measurement and evaluation at the network interface. The "sidestream" processor, a prototype management tool, has been developed to evaluate the automation of many labor-intensive management activities associated with large on-line environments [27] . As a separate tool, the sidestream processor can be inserted into the communications system's operational process without disturbing the on-line applications. The sidestream processor has the potential of merging RTE applications with other network management functions in a single system. Third-generation RTEs fall into two major categories: (1) microcomputer based, economical versions of small, single- purpose systems that were previously minicomputer based and (2) multi-purpose systems, such as the sidestream processor, that integrate the RTE functions with related network management functions demanded by users and managers. Examples of the former category are under development by Analytics Inc., (e.g., a Honeywell Level 6 version of the ASE System) Computer Sciences Corporation (e.g., Data General (DG) MP-200 version of the MITRE RTE) and Logica Limited Instrumentation System (e.g., A Texas Instruments 990/10) . Examples of the multi-purpose systems are under development by International Computing Company (e.g., A Univac V7700 Network Exerciser delivered in January 1980 as part of the NASDAQ system) and contemporary manufacturers' large terminal systems (e.g., IBM's Teleprocess- ing Network Simulator (TPNS) used on IBM 4300 series systems) . Major features of each third generation approach follow. The Analytics ASE Level 6 system is designed to handle large stress test environments by combining multiple Level 6 systems into a large parallel system. Communication interfaces are similar to the current ASE implementation. Computer Science's approach permits an extremely portable product, weighing less than fifty pounds in total (power supply, CPU interfaces, etc.). The prototype DG MP-200 system can emulate up to 16 asynchronous terminal scripts based on the same scenario. Available protocols are only limited by the flexibility of the DG 307 communication, access manager and communication control hardware. Synchronous communication support (bi-synchronous) is under development. Logica, Limited designed their Communications Environment Generator (CEG) as a hierarchical multi-microprocessor system in response to a set of requirements issued by the Central Computer and Telecommunications Agency (CCTA) of the British Government. A Texas Instruments (TI) 990/10 minicomputer and associated DX10 software serves as the "master" control in the system. The "master" is connected to up to seven TI 990/5 microcomputers, which in turn can be connected to up to nine TMS 9900 microprocessors that function as micro-programmable four -channel controllers (FCCC) . RTE software functions are allocated across all three levels of the system. An initial configuration of the 990/5' s and four FCCCs can drive 48 full duplex lines at 4800 bps or 36 FDX lines at 9600 bps. Throughput rates of 15 I/O pairs per second can be sustained. Protocols are currently restricted to Britain's ICL product line. The International Computing Company Network Exerciser delivered to NASDAQ performs all of the common RTE functions - script generation, logging, data reduction time-stamping, synchronous and asynchronous line handling - while providing capability to sample the common carriers' lines and provide for network problem determination. The UNIVAC V7700 configuration is capable of handling up to 48 synchronous lines operating at 9600 bps and sustaining stress rates in excess of 50 I/O pairs per second. The software is written in a portable language (INFOS) that can be operated on five other minicomputers. An advanced third generation design is in development to incorporate other service quality assurance functions in a highly reliable "non-stop" hardware architecture. The Naval Personnel Research and Development Center (NPRDC) obtained a RTE design for a specific workload of up to 64 interactive remote terminals under contract N6600-79-C-0406. The hardware config- uration consists of a DEC PDP-11/34 with a 300 mb disc and a high performance microprocessor from Plessey called a MIPROC. The PDP-11/34 is used for compiling scripts and editing the output to readable form off-line. The PDP-11/34 and the MIPROC run the actual real-time evaluation. They are connected by an asynchronous control channel and a direct memory access channel to transfer script information from the PDP to the MIPROC and to transfer logging output from the MIPROC to the PDP. The MIPROC interfaces with the SUT to perform the most time-critical tasks. McDonnell Douglas Automation Company has developed a simplex emulator that can drive an IBM 4300 system. The emulator is written as a CICS application, using the timing features of CICS to perform the time-critical functions. Current plans call for an extension to full-duplex emulation capability. Announced, off-the-shelf manufacturer RTEs have recently added enhanced scripting capabilities and the ability to function in both simplex mode and as true RTEs. IBM's script generator for TPNS produces scripts from TSO, IMS and CICS trace tapes. 5. Summary Although technology has removed many of the barriers to successful use of RTEs, some problems remain. Because of the sheltered status of RTEs during the last decade, there are very few trained personnel that can successfully accomplish an RTE study. Problems with RTE use have been documented [28,29] , and the ever-present benchmark problem of lack of correct scientific method is especially prevalent in RTE environments [30] . Further impediments to RTE use are user organizations that are still structured around a production environment, rather than a service environment. The problems will be overcome slowly over the next decade. Although the 30-odd references appended to this paper represent the bulk of the published knowledge about RTEs, many ad hoc studies have been done at the manufacturer's benchmark facilities. Since TPNS became a licensed product, about 100 of the 3,000 installations that operate medium or large scale DBMSs have installed TPNS. Certainly a small percentage, but a large number in just a few short years. I believe the emphasis on service to the user, the importance of system quality assurance, problem determination, and the continuous interest in systems design issues will enhance the demand for functions an RTE can perform at its unique position between the entire SUT and the network [31] . The RTE will, I believe, become the basis for principal 308 CPM/CPE products for Computer and Communications Studies and for Network Management. References [1] Greenbaum, H., A Simulator of Multiple Interactive Users to Drive a Time-Shared Computer System, Report MAC-TR-54 (MIT, Cambridge, MA, October 1968) [2] Caplan, R. , Pearlman, J. M. , and Snyder , R. , A Communications Environment Emulator, Proceedings of the Spring Joint Computer Conference , 1969, pp. 505-512. [3] Schwemm, R. , Experience Gained in the Development and Use of TSS, Proceedings of the Spring Joint Computer Conference , 1972, pp. 559-569 [4] International Business Machines Corporation, Airlines Control Program (ACP) System Documentation, Vol. B, GS- 20-1435-0 , PARS Testing Philosophy, Rev 1, 9/13/68. [5] Kirrene, M. , and Spiegel, M. , Design for Performance, CPEUG (15th Meeting) , 1979, NBS, Washington, DC, pp. 129-140. [6] James, D.L., A Remote Terminal Emulator for Loading and Performance Measurement of On-Line Systems, M72-83, Bedford, MA: The MITRE Corporation, March 1972. [7] DeMone, E.C., Remote Terminal Emulator (Design Ver if ication Model) , Description of Hardware, MTR-2677, Vol. 8, Bedford, MA: The MITRE Corporation, February 1975. [8] Wright, L., and Burnette, W. , An Approach to the Evaluation of Time- Sharing Systems: MH-TSS, A Case Study, Performance Evaluation Review , ACM SIGMETRICS, Vol. 5, No. 1, January 1976, pp. 8-28. [9] Watkins, S., and Abrams, M. , Survey of Remote Terminal Emulators, NBS Special Pub. 500-4 , NBS, Washington, DC, April 1977, pp. 1-71. [10] Arthur, C. , Remote Terminal Emulator Development and Application Criteria, 1977 National Computer Conference , AFIPS Conference Proceedings, Montvale, NJ, 1977, pp. 733-739. [11] Davis, R. , Welcoming Address, Summary of the NBS/GSA Federal Workshop on Remote Terminal Emulators , June 21, 1976, pp. 5-7. [12] Kiviat, P., Procurement Projection Results, Summary of the NBS/GSA Federal Workshop on Remote Terminal Emulators , June 21, 1976, pp. 36-59. [13] Wyrick, T. , Concepts and Issues Relevant to the Use of Remote Terminal Emulator in Teleprocessing Procurements, Washington, DC: General Services Administration, May 1977. [14] McFaul, E., RTE - A Case Study at the Geological Survey, Summary of the NBS/GSA Federal Workshop on Remote Terminal Emulation, June 21, 1976 , pp. 122-133. [15] Wyrick, T., and Findley, G. , Incorporating Remote Terminal Emulation into the Federal ADP Procurement Process, CPEUG (14th Meeting) , NBS, Washington, DC, October 1978. [16] General Services Administration Handbook, Use and Specifications of Remote Terminal Emulation in ADP System Requesting, Washington, DC: General Services Administration, August 1979. [17] Bailey, L., Experimental Evaluation of an Interactive-Batch System Using a Remote Terminal Emulator, CPEUG 1976 Conference Supplement , NBS Washington, DC, November 1976, pp. 111-122. [18] Duke, M., Testing in a Complex System Environment, IBM Systems Journal , Vol. 14, No. 4, 1975, pp. 353-365. [19] Currie, R., An Experiment in Synthetic Benchmarking to Predict MVS Performance for Broad Mix Systems, Slides from Proceedings of SHARE 48 Conference , SHARE, Inc., Houston, TX. [20] Palmer, S., CS-1100 - Sperry UNIVAC Communications Simulator, CPEUG 1976 , NBS, Washington, DC, November 1976, pp. 197-201. [21] Eyries, F., and Gien, M. , On-Line Performance Measurement in the 309 CYCLADES Network, EURQCON 77 , Venice, Italy, May 1977, pp. 50-55. [22] Shirey, R., Tools for Post-Programming Software Testing in a Dispensed Multi-Vendor Environment, 18th Annual ACM Technical Symposium , June 21, 1979. [23] Analytics, Inc., ASE Computer Operations Manual, Naval Electronic Systems Command, Contract No. N00039-75-C0145, Technical Report 1152-25-TR-02, Washington, DC, December 1977. [24] Giraudeau, P., Description of Services Offered by REX25, ESOPE and SIMAD for the Purpose of Establishing an X.25 Module Dialogue, CCETT , Paris, France, December 1977. [25] Bryant, P., Giesin, F. , and Hayes, R. , Experiments in Line Quality Monitoring, IBM Systems Journal , No. 2, 1976, pp. 124-142. [26] Giles, H., Successful Network Management Hinges on Control, Data Communications , August 1978, pp. 33-41. [27] Leach, J., and Campenni, R. , A S ides tr earn Approach Using a Small Processor as a Tool for Managing Communications Systems, IBM Systems Journal , Vol. 19, No. 1, 1980, pp. 120-139. [28] Trehan, V., Problems in Remote Terminal Emulation, CPEUG 1978 14th Meeting) , NBS, Washington, DC, October 1978, pp. 27-61. [29] Tendoklar, N., Determination of Non-Steady State Conditions in Performance Measurement Runs, CPEUG 1977 (13th Meeting) , NBS, Washington, DC, October 1977, pp. 87-94. [30] Hyman, B., Stability and Workload Definition for Time-Sharing Systems, FIPS TG-13 , Document 70, 1974. [31] Wyrick, T., Benchmarking Distributed Systems: Objectives and Techniques, ICPCI 78 , Gardonne, Italy, June 19, 1978, pp. 13. 310 Application Prototyping: A Caso Study C. Wesley Jenkins Budget Analysis Division Congressional Budget Office Washington, D.C. 20515 Accurate specification of user requirements for interactive systems is especially difficult in an environment where the demand for information is intense, short-fused and largely unpredictable. The Congressional Budget Office was created in 1975 by an Act of Congress. Its primary mandate is to serve the Budget and Appropriation committees of both the Senate and the House of Representatives. The Act also defined a Congressional Budget process specifying a calendar of events and specific completion dates for major activities. This pacing of budgetary actions produces a highly charged environment in which CBO must be able to respond immediately to information needs with information that is both accurate and consistent. In approaching a redesign of some of these highly visible information systems, CBO decided to follow a strategy of prototyping these systems in order to faciliate the involvement of the user, highlight the user's real needs and to demonstrate the feasibility, effectiveness and any shortfalls of the proposed system before risking either a management or design commitment. 1. Introduction The traditional approach to systems analysis seems to have serious limitations when applied to interactive information systems that are in a state of constant change and growth. Communications among the user, analyst and manager tend to be imprecise, a detailed analysis prolongs the process to the annoyance of the users, and specifications are either am- biguous or too voluminous to read. To compound this problem, the user is often requested to 'freeze' his requirements and subsequent attempts at change are resisted. The alternative of an informal analysis only increases the proba- bility that the user will not be satisfied by the new system. Approaching systems analysis and design around a strategy of prototyping offers a tech- nique that minimizes the dangers of a long formal analysis and increases the likelihood of a successful system implementation. The essence of the prototype approach is the construction, test, and demonstration of a representation (model) of the system that is to eventually be developed. This methodology allows the user to interact with this skeleton version of his system and to be more involved in the design. The analysts and programmers can experiment with alternative system and data base designs. Managers can better gauge the expected per- formance and cost of the new system. CBO decided that the potential benefits seemed well worth the initial investment of re- sources. The following case study of the experi- ence at CBO includes a brief description of the background, an overview of the major steps leading up to the decision to prototype and a discussion of the prototype plans. Unfortu- nately, the publication deadline of this journal 311 precludes a description of the outcome of this prototyping experience. 2. Case Study Background 2.1 Instant Systems The appointed director of CBO, Dr. Alice Rivlin, took office in the spring of 1975. In less than six months an effective organization of almost 200 individuals had been pulled together. The Budget Analysis Division was formed to handle three major functions: (1) providing cost estimates on all bills reported, (2) keeping 'score' on the Congressional budget process, and (3) providing five-year cost projections on the Federal budget. It was quickly decided that the last two functions required computerized systems to provide the necessary information in a responsive manner. Within a few short months the necessary information systems had been developed and implemented by a handful of CBO analysts with the assistance of contract person- nel. The systems were installed and run in a service bureau environment. As time had not permitted analytical studies, these systems were considered necess ary stop gap measures to get up to speed in a hurry. 2.2 Nature of Demands Serving the information needs of Congres- sional committees is a unique, and often frus- trating, experience for the individuals involved in data processing support. Every request is a directive that must be answered immediately. Responsiveness is absolutely critical to the credibility and viability of an organization such as CBO. When information is needed, it is often needed immediately or the value of it is lost. Flexibility is also a must, as the type of infor- mation that is in demand fluctuates from Congress to Congress, chairman to chairman, committee to committee, year to year, week to week during the budget process and often hour to hour during the Budget Committee markup sessions. The third major characteristic of the information demands on Capitol Hill is the visi- bility and impact of the information delivered. Accuracy, consistency, integrity and security all are critical data issues. 2.3 EDP Objectives In 1978 the MITRE Corporation completed a study of the data processing systems in the Budget Analysis Division. By this time the 'interim' systems developed in 1975 had become a gigantic collage. For the most part they were still providing the needed information in a timely fashion, but MITRE warned of a further collapse if the systems were not redesigned. They speci- fically recommended that a Data Base Adminis- tration function be established and that the new design should center around the use of a Data Base Management System. The Budget Analysis Division took several positive steps during 1979. The first step was to provide a framework of long-range objectives and strategies to guide the planned redesign and development efforts. These objectives are high- lighted below. Long-Range EDP Objectives o Effective and efficient information ser- vices o Minimize data and system maintenance o Improve system flexibility o Simplify system use o Improve data accountability and manage- ment o Improve system transferability o Improve data integrity o Improve ability to monitor system per- formance 3. Initial Steps Between June 1979 and June 1980 the following steps were taken. Each was significant in the evolution towards using 'application proto- typing' at CBO. 3.1 Data Management Requirements Analysis This task was intended to identify and define the requirements for a more effective program of data management in the Budget Analysis Division. An important objective was to describe the needs from three basic perspec- tives: manager, user and data processing analyst. There were several significant observa- tions made during this study that confirmed the MITRE recommendations for system redesign and improving the methods used to manage the organization's data resources. A primary concern was with the consis- tency of information supplied by the Budget Analysis Division to committees and other users, both on and off Capitol Hill. Management also identified quality assurance as a major consider- ation. The responsibility for updating and accu- racy verification for each data item should be implemented as part of an authorization scheme that would ensure that only the designated indi- vidual would be permitted to alter that data item. Other factors involved in a program of quality assurance should include the ability to maintain audit and activity records in order to monitor usage and determine ineffectiveness, to track the logical and physical growth of the data base, to monitor system performance and to assure that adequate data validation is done. 312 Probably the most imperative requirement was viewed as the demand for timeliness in providing the information which is needed by a Congressional committee. This means knowl- edge of what data is available and how to access it is critical. It also means that the organization of the data base must be convenient and facili- tate the ease of retrieval and report generation. This leads directly to a related management concern for flexibility in data structure. As the committee demands for information vary radi- cally with the issues at hand, the point in the budget cycle and any number of external factors, the need to accommodate these changing views of the data is a primary requirement of manage- ment. Users with data accountability responsibili- ties each expressed concern about the need for better verification and control techniques. There is a particular need for better verification of modeling outputs. A related requirement is for a simulation or test mode when doing data projections so that the new figures can be re- viewed and accepted before the data base is actually updated. In conjunction with these concerns was a heavy emphasis on the require- ment for consistency and the difficulties of accomplishing it when the data is passed so frequently from system to system, machine to machine, and one file to another. This redun- dancy of storage and the complexity of the data combine with a lacking of documentation and standards to create a worrisome situation for the people responsible for the data. The programmer/analysts indicated a need to either have better information on all data files, reports, programs, and procedures or to have independence of the programs and proce- dures from changes in the data files. The programmers spend much of their time at- tempting to locate a program or programs that do this or that function in order to modify it or use it to create a different version. The outcome of this study supported the MITRE recommendations. The concerns that surfaced definitely pointed to the need for a program of data management and the use of such tools as a data dictionary and a DBMS. The ideal environment for data management should pro- vide the tools, techniques and procedures neces- sary for an organization to effectively and effi- ciently monitor and control the allocation and utilization of its data resources. This process often involves the recording and controlling of data about data in a single authoritative source. It does not necessarily depend on physical cen-. tralization of data files, but functional integra- tion is preferrable in order to limit the negative consequences, such as inconsistency, of applica- tion independence. 3.2 Application Requirements Analysis After the successful completion of the data management study, plans were made to begin a thorough analysis of user requirements in the major application areas. The first proposal called for the following type of information to be brought together in a workbook: (1) current environment description, (2) data description, (3) functional requirements, (*) management re- quirements, and (5) standards. This traditional approach was rejected because the paperwork appeared excessive, the resources required were substantial and the users objected to such a prolonged process. An alternative approach was proposed that called for using the structured analysis method- ology. This technqiue promised to provide a rigorous description of the new system with much less paperwork. This alternative was also rejected, however, as both management and users felt that it was a case of overkill for an organization the size of Budget Analysis Divi- sion; especially considering our limited technical support staff. At this point it was decided to ask the Federal Computer Performance Evaluation and Simulation Center (FEDSIM) for advice. A rep- resentative of FEDSIM, Mitchell Spiegel, spent a few weeks evaluating our data processing per- formance problems and considering our known requirements. Mr. Spiegel's recommendations basically supported the MITRE conclusions and indicated a need to begin moving to a new data processing environment. He also confirmed the suitability of a DBMS for the type of data processing requirements that exist at CBO. 3.3 Decision to Prototype Based on the FEDSIM and MITRE con- clusions about the benefits of a DBMS, CBO decided on the following course of action. The user requirements for the major applications would be briefly defined, then an evaluation of DBMS packages would be undertaken. The DBMS that appeared to best meet the CBO require- ments would then be leased for a 60-day period in order to verify the conclusions of the paper evaluation and to conduct performance tests. With the decisions about data base management systems, data dictionary, ad hoc languages, development languages and report generation tools completed, the plan would be to move rapidly into application design and development. As this plan evolved, it became clear that . the 60-day test period offered an ideal opportu- nity to bring up a model or prototype of one of the major systems needing redesign. As the 313 possibilities were discussed, it became clear that this approach provided several advantages. First, it was an excellent way to involve the user, to capture his interest and to obtain spe- cific feedback on certain techniques that were being considered. Second, this approach would allow us to demonstrate the various capabilities proposed to the managers who are actually re- sponsible for the 'go/no go 1 decision. Third, performance statistics and loading tests would provide excellent information for estimating the resource utilization impact and requirement of the proposed system. Forth, the technical team would be able to test alternative system designs and data base structures. In short, application prototyping appeared to be an excellent approach under the circum- stances. CBO decided to proceed with this plan. 3.4 Initial DBMS Evaluation The basic application requirements were documented. Major user concerns were obtained through interviews and any significant data pro- cessing problems in the current EDP environ- ment were described. Management supplied long-range goals and strategies to guide the process. Within a short time, a profile of mandatory requirements had been put together to be used in evaluating the DBMS packages available. CBO started with a list of some twenty DBMS products. The list was quickly reduced to seven contenders, who were invited to present their packages. The information on each con- tender was captured on a standard form that addressed the areas of major concern to CBO. An initial evaluation eliminated four more pack- ages. The remaining three contenders were examined in detail and evaluated within cate- gories that had been assigned weight factors by management. This process resulted in clearly placing one DBMS above all the rest in its ability to meet the CBO mandatory requirements. 4. Prototype With the DBMS selection decided, steps to prepare for the prototyping were started. The choosen DBMS was leased for a 60-day period, training for the participating programmers was arranged and the DBMS was installed at a con- venient service center on Capitol Hill. A de- tailed test plan was written, the application to be prototyped was choosen and the scope of the prototype was defined. 4.1 Test Hypotheses As the test period is of limited duration, it is important to specify in advance the hy- potheses that are going to be tested. In the case of CBO, these hypotheses relate to both the application and the use of the DBMS and data dictionary. (1) Use of a DBMS is in concert with overall BAD EDP objectives and re- sponsive to user requirements. (2) An integrated or active data dictionary provides important support to efficient and effective management of EDP re- sources. (3) Use of a high-level, user-oriented pro- gramming language for the develop- ment of interactive applications is an efficient and effective method for re- ducing system development and mainte- nance costs. (4) The cost analysts in BAD will use a high-level, user-oriented language to query data bases, write simple reports and perform certain types of generic updates to data for which they are responsible. (5) Use of a CRT improves the ease of data entry and reduces the need for hard- copy reports by providing users with a convenient method of viewing this data interactively. It is important to either confirm these positions or document any shortcomings or prob- lem areas during the prototype. Management and technical decisions based on these hy- potheses will have a far-reaching effect on future EDP developments at CBO. 4.2 Prototype Design Several decisions had to be made about the scope of the application prototype before an actual prototype design could be developed. CBO decided to use the five-year projections system and to load the entire data base for the test. The DBMS development language and COBOL would be used in the prototype. The DBMS would be installed under CICS and full- screen, 3270-like devices would be used. It was also decided to load the data dictionary first and to utilize it as an active part of the application prototype. The next major task was the definition of the application functions to be performed. One goal of the prototype is to demonstrate an effective method for accomplishing interactive data updating. This will include adding new records to either the data file or the reference files. The prototype will provide for changing several fields and for record deletion. These data updating functions will provide full-screen 314 capabilities, interactive editing of data entered and concurrent interactive updating of the data files by multiple users. Other techniques of updating that will be tested as part of the prototype include inter- active generic updating and batch updating of the data files with external data. The generic updating is a capability currently used in the projections system. It allows the user to specify a change for certain fields within a selected set of records. The following types of change will be permitted: zero, blank, add a field, and move a field. The batch updating will show the ability of the system to accommodate the addition of data from a sequential file. Another objective of the prototype is to demonstrate selected interactive data retrieval techniques and to display the data retrieved using full-screen terminals. There are two primary approaches to providing the users with this type of capability: one involves guiding the user through menus, tutorial aids and carefully designed parameter selections and the other is through the use of a powerful English-like query language that a non-technical user can utilize to state his own requests. The first approach is similar to techniques currently used, but it be- comes more convenient to the user with full- screen speed and more fleixible with a DBMS providing the retrieval capabilities. The prototype will be designed to provide a menu-driven query capability that offers several retrieval options. Within these options there will be choices on what data the user wishes to display. The prototype will also provide a browse capability that will allow the user more flexibility in selecting retrieval characteristics and to combine these parameters with Boolean operators or range operators. In all of these, appropriate facilities of the full-screen termi- nals will be utilized. The ad hoc query language will also be tested as part of the prototype. The technical team will exercise the ad hoc facility and measure its performance. Non-technical users will experiment with the ad hoc language in order to evaluate its ease of use, ease of learning and usefulness. Another objective of the application proto- type is to test and evaluate the report genera- tion facilities available with the selected DBMS. This will involve producing three or four of the projection reports off the prototype data base. The usefulness and efficiency of the selected Report Generator facility in creating these reports will be evaluated. At least one or two of these reports will also be produced using COBOL. This will provide us with better infor- mation on the performance differences between the two approaches, the relative programming difficulties and costs, and the relative flexibility. Other factors that will be evaluted via the prototype include: o Recovery from a system failure. This will include an analysis of the automatic recovery capability from both an effec- tiveness and efficiency point of view. o Ease and cost of reconstructing the prototype data base. o System utilities to backup and reload the prototype data base. o Flexibility of DBMS to accommodate changes to data structure. This will include adding new fields, adding dif- ferent record types (new logical view), changing fields from non-key to key, changing a logical view, or changing the physical attributes of a field. These will all be analyzed as to their impact on the prototype programs already written. o Data dictionary facilities. o System utilization and performance re- ports. o Levels of security. 5. Conclusion At this moment the technical team is putting down the final design specifications for the prototype. Report and screen layouts have been completed and all the programs to be written have been identified. The actual proto- type test period begins the last week in July; one week from now. We are optimistic that this application prototyping strategy will provide the information needed to assure a successful re- design of CBO's major data processing applications. 315 Operating System Performance Meters 317 Performance Evolution in a Large Scale System Richard S. Brice J. Wayne Anderson Los Alamos Scientific Laboratory Los Alamos, New Mexico This paper documents the evolution of system performance in an operating system from its initial use by a few friendly users in October 1978 to its present state. The system, DEMOS, was developed for the Cray Research Inc. (CRI) Cray-1 computer at the Los Alamos Scientific Laboratory (LASL) by staff members in the LASL Computer Science and Services Division. Some important features of DEMOS architecture are described. It is shown how the robustness in the design permitted major reduction in overhead to be achieved with only minor software changes. Particular emphasis is placed on file system design and performance because much of the system overhead reduction occurred as a result of changes to this system component. Also, some predictions are made regarding performance resulting from proposed modifications. Key words: design. System tuning; message overhead; capabilities; modular 1. Introduction In October 1978, the DEMOS operating system, which was developed at the Los Alamos Scientific Laboratory, became avail- able for use by friendly users at LASL on the Cray-IA computer. Hardware to support I/O included 7 channels, 7 disk control- lers, and 13 Cray Research, Inc. (CRI) DD19 disk drives. Sustained I/O rates were expected to be in the 20-60 Mbit/s range [ 1 ] . This rate approximates the maximum for a single channel/contoller/disk (35 megabits). Initial measurements showed that the system would support only half the minimum range value at an overhead ranging from 50% to 80% of the CPU. What had gone wrong? Could it be fixed? What would the fix cost? In this report, we analyze the DEMOS design and its underlying assumptions. We show how minor changes enabled the system to support the anticipated I/O workload with acceptable overhead, and we quantify the improvement expected from further enhancements . In Sees. 2 and 3 we present brief overviews of DEMOS architecture and its implementation, and the file system design. Detailed descriptions can be found in Refs. 1 and 2. In Sec. 4 we present some measurements, problems, and solutions encountered early in the evolution. In Sec. 5 we present some recent (and as yet unsolved) problems and speculate on performance gains that might be achieved by alternate solutions to the problems. 2. DEMOS Architecture The fundamental computational process in DEMOS is called a task. A task consists of a program and its associated state information, a memory area, and a link table that provides capability-like access to resources in the computing system [3] . 319 In the simplest case the terms "task" and "user job" may be synonymous, although a user job usually consists of many tasks, active either serially or in parallel. One designated task, called the kernel, assumes ownership of the hardware interrupt facility and access to all executable memory when the machine is deadstarted. The kernel contains hardware I/O drivers, interrupt management, the basic intertask communication mechanism, and the link creation/manipulation facility. Other system tasks are not distinguished from user tasks; however, they are able to acquire some special capabilities (links) as a side effect of the order in which they are created at system deadstart. How this happens is explained in the following descriptions of links, task communication, and the switchboard task. 2.1 System Calls A system call allows tasks to communicate with the kernel. To make a system call, tasks place parameters in particular registers (for example, an ID defining the service desired and pointers to data areas) and then execute an inter- rupt instruction. The kernel fields the interrupt, performs the service if appro- priate, and restarts some task. Typical system calls are move data (from one address in memory to another), send/receive message, and create/destroy link. 2.2 Links A link is a qualified pointer to the task that created it. When creating a link (pointer to itself), a task can endow the link with properties describing the link's use and disposition. Link properties include permission for (or restriction from) duplicating the link or giving it to some other task and specification of how the link is to be used. For example, a communications link allows messages to be sent to the creating task and a data link allows access (through a system call) to the creating task's memory. The physical realization of a link also contains other detailed information that allows the creating task to determine, on receipt of a message, which of its links is being used by the message sender. A task creates a link by first con- structing a data area describing the link's desired properties and then executing the appropriate system call. The kernel responds by creating a link, placing its physical realization in the task's link table (inaccessable by the task) , and returning to the task an integer representing the location of the link in the link table. A task uses the link by including this integer in a system call. The link can be used as a message or data path to the creating task. The message or data path is established when the creating task gives the link to another task. When a link is given to a task, the physical realization is removed from the giving task's link table and written into the receiving task's link table. 2.3 Task Communication A task can communicate with another task only if it has a link to that task. The communication is straightforward. A task wishing to send a message constructs the message in its memory area in a format that has been agreed to by the receiving task. The message area includes a header (to be used by the kernel) and a data area. The header includes an integer representing a displacement into the sending task's link table. A system call then causes the kernel to gain control, select the specified link from the sending task's link table, determine which task created the selected link, and copy the data from the message area onto a queue of messages for the task that created the link. The receiving task will receive the message only by issuing a receive message system call. The receiving task can selectively receive messages from specific subsets of the links it has given out by enabling reception only on those links with certain properties. The desired properties are specified when the receiving task issues the receive message system call. A task owning a data link to the memory area of another task can access the other task's memory area through a move data system call. The mechanism is similar to the send message system call; however, the task whose memory is being accessed is not actively involved in the transaction. The data is moved by the kernel. 2.4 Switchboard Task Some means must exist for tasks to exchange the links required to establish initial communication. The switchboard task fulfills this function. During system initialization, following a deadstart, the kernel creates the switchboard task and forges a link to it. The forged link is given to each task as it is created. Since 320 all system tasks are created at deadstart prior to creation of any user tasks, system tasks are able to use the switchboard to their advantage. They do this by creating a number of links to themselves and then giving the links (along with a name by which the link should be referenced) to the switchboard. Properties of some links specify that they may be duplicated; thus, the switchboard may give copies of these links to any task that asks for them by name. Other links given to the switchboard may not be duplicated and can be given only to the first task that asks. Those nonduplicatable links that give system tasks their special privileges are obtained from the switchboard during system initialization, leaving only the nonspecial (duplicatable) links to be obtained by the user tasks. Each user task, when created, is given copies of a few standard links to the system; for example, to the switchboard and to the file system. 3. DEMOS File System The DEMOS file system software consists of four memory resident tasks: the directory manager, request interpreter, buffer manager, and disk manager. As each of the four tasks is discussed below, the design philosophy of the DEMOS file system will emerge. Also, some potential bottlenecks and sources of system overhead will become evident. 3.1 DEMOS Directory Manager DEMOS files have hierarchical names that are implemented through special files called directory files. Data stored in a directory file establishes a correspondence between file names and files. The direc- tory files may be visualized as nodes in a tree. Directories point to other files, which may be directory files or data files. All leaf nodes are file descriptors that contain such information as the file's owner, classification, size, and the location on a disk of the file's data. For each file descriptor, there is a unique path through the directory tree from the system root directory to the file descriptor. The directory manager has access to the system root directory (and thus implicitly to all files) by virtue of a special link obtained from the switchboard at system deadstart time. Each user task, when created, is given a link to the directory manager. This link is the only communication path between user and file system guaranteed by DEMOS. The link allows the user to send messages to the directory manager. Through parameters embedded in such messages, the user may request the directory manager to create or open files, or may inquire about the prop- erties of some file. The directory manager processes requests by using the services of the other three file system tasks. A successful create or open request causes a new link to be created and returned to the user by the directory manager. During processing of the open/create request, a correspondence between the new link and the file it represents is established by the request interpreter to support processing of future transactions on the file. The user task sends messages containing requests for file processing (for example, read, write, and seek) through the newly created link to the request interpreter. In summary, the directory manager serves as the standard user interface to the file system, has access to all files/directories, controls file access through information contained in the directories, and processes user directory requests by obtaining services from the other file system tasks. 3.2 DEMOS Request Interpreter The request interpreter serves as the user interface to the file system for all transactions on open files. It uses services provided by the buffer-manager and disk manager to process user requests. Files are initially created with no data in them and no space on the disk allocated to them. As data is written to the file, blocks are allocated on a disk to hold the data. A block contains 4096 bytes and occupies a sector on the disk. Programs do not have to deal with blocks, however, and may read or write (sequentially or randomly) any number of logically contiguous bytes in the file. The user reads or writes the file by sending the appropriate message to the request interpreter through the link created at file open time. The request interpreter updates current logical file position pointers and then requests buffers from the buffer manager to hold the data. The request to the buffer manager is made through a special link obtained by the request interpreter from the switchboard task at system deadstart. Once the buffer manager has supplied the requested buffers, the request 321 interpreter moves data to/ from the user space from/to the buffers. Implicit in this transaction are data links, supplied by the user and buffer-manager tasks that give the request interpreter access to user and buffer manager memory. A user request that exceeds the size of available buffer memory space is fragmented into transac- tions of acceptable sizes by the request interpreter and the buffer manager as buffer space becomes available. A user request for N disk blocks will be processed as some number K distinct requests within the file system, where 1 <= K <= N. Fragmentation is logically transparent to the user. 3.3 DEMOS Buffer Manager The buffer manager supplies interme- diate storage between a user task's memory and secondary storage (disks on the Cray 1A) . There are several reasons for buffer- ing I/O data. In addition to the usual reasons (for example, CPU-I/O overlap or read ahead/write behind) , the buffer manager allows memory compaction (or reorganization) to proceed in parallel with I/O. Also, the buffer manager implements the interface between byte-at-a-time and block-at-a-time I/O, thus eliminating the need for this service to be replicated in each user task space. The buffer manager also serves as an adjunct to the CPU scheduler by limiting the rate at which user tasks are allowed to complete I/O transactions and, implicitly, their rate of CPU usage. A transaction that is not completely satisfied by the buffer manager is periodically retried by the request interpreter until it is satisfied. This activity is transparent to most users; however, those who have carefully balanced their CPU and I/O requirements to match the hardware capability and to overlap significantly may notice a dramatic reduction in apparent system performance; for example, CPU and I/O usage may have been serialized. A user read request is received by the buffer manager from the request interpreter as a "fill buffers" request. If the desired data are in system buffers, those buffers are returned to the request interpreter. Otherwise, the buffer manager allocates buffers and asks the disk manager to fill them. The filled buffers are returned to the request interpreter. A user write request causes the buffer manager to allocate empty buffers (and possibly do I/O to empty some buffers). The empty buffers are returned to the request interpreter. Filled buffers are returned to the buffer manager by the request interpreter for output to disk. 3.4 DEMOS Disk Manager The DEMOS disk manager maintains the logical-to-physical disk file mapping tables and allocates/deallocates all disk space. Physical disk I/O requests from the buffer manager are queued by the disk manager and sent to the kernel disk driver in a order based on least latency, least seek time next. In the initial version of DEMOS, the disk manager-to-disk driver requests were block (sector) at a time (per device) , because the disk hardware was oriented to this size transaction. Also the kernel disk driver did not implement request queueing. When the kernel disk driver notifies the disk manager that a transaction is complete, the disk manager notifies the buffer manager and attempts to start another transaction. An attempt to extend a file requires the disk manager to do housekeeping associated with disk allocation prior to initiating the user transaction. Disk space allocation maps are retained on the disks, so file extensions require the disk manager to do I/O. The disk manager uses buffer manager services in a way similar to the request interpreter for this purpose. 3.5 User/File System Communication Costs Figure 1 illustrates the logical communication precipitated by a user I/O request to an open file. Paths ml, . . . , m8 represent message paths supported by message links; paths dl and d2 are data paths, dl supported by a data link and d2 implemented in hardware; paths cl and c2 are hardware control paths, cl being implemented by hardware command and c2 by interrupt. Paths 11 and 12 represent link creation/inquiry system calls. The kernel has been omitted from most paths in Fig. 1 for simplicity; for example, each of the paths, mi, requires a send message system call and a receive message system call, each of which inter- rupts to the kernel for implementation. In the simplest case a user I/O request requires 11 system functions, 4 322 | KERNEL | 11 12 ! ! USER i TASK USER BUFFERS 1 1 1 1 1 1 ! dl I | ml ! m8 1 1 1 ! i 1 1 ! REQUEST INTERPRETER! j 1 1 1 I m2 i ! m7 1 1 1 | BUFFER I MANAGER 1 BUFFERS ! 1 m3 l ! m6 1 1 1 d2 I DISK i MANAGER 1 { mA j m5 i ! KERNEL DISK I DRIVER ! | 1 1 1 i i | cl 1 i ! c2 1 ! DISK 1 1 1 Fig. 1. DEMOS logical file system organization. send/receive message pairs, 1 data move, and 2 link related system calls; that is, the data was found in (or moved to) the buffer manager buffers. The simplest case involving a physical disk I/O transaction also requires 11 system functions, 4 send/receive pairs, and 3 disk interrupts (paths cl and d2 are done by hardware) . This case represents I/O between the buffer manager and the disk. Those cases including only the disk manager, kernel disk driver, and disk are considered to be part of physical I/O. A user I/O request that requires physical disk I/O is the sum of the two above, and requires 22 system functions. For transactions involving N disk blocks, paths m4, m5 , dl, and d2 must each be traversed N times, and paths cl and c2, 3N times. The cost in terms of system functions becomes: Cost = 2(6 + 2N) messages + 3N interrupts + N data moves + 2 link calls. (1) A transaction that is broken into K fragments by the buffer manager adds 4 (K-l) send/receives to the cost for communication between the request interpreter/buffer manager and buffer manager/disk manager, along paths m2, m3, m6, m7. Thus the cost of a fragmented request becomes: Cost = 2(6 + 2N + 4(K-1)) messages + 3N interrupts + N data moves (2) + 2 links. Throughout the remainder of this section and the next we will equate the cost of all system functions. Performance data are presented later to support this simplification. The cost (in system functions) for a one-track (18-block) I/O transaction is: Cost (min) = 84 messages + 54 interrupts + 18 data moves + 2 links (3) = 158 system functions. Cost (max) = 228 messages + 54 interrupts + 18 data moves + 2 links (4) = 296 system functions. Clearly, the cost in CPU cycles of a system function and fragmentation intro- duced by the buffer manager are potential sources of serious system overhead. Figure 2 represents overhead during a sustained period of I/O transfer and illustrates potential overhead for a transaction with N =18. In Fig. 2, the overhead is the fraction of time the CPU is busy managing I/O during the elapsed time required to perform the I/O. The x-axis represents times (in microseconds) to perform a system function, and the y-axis represents the overhead fraction. Overhead curves are shown for several values of K. Note that disk latency and positioning times are not included in the calculations. .9 .7 .5 .3 •1 I OVERHEAD FRACTION Fig. 2. 175 53 TDT5 s/SYSTEM FUNCTION Overhead in a 1-track (18-block) I/O transaction. 323 Recall that the overhead in Fig. 2 includes only system function overhead and excludes processing by the file system tasks. File system overhead will be incorporated in the next section. 3.6 CRAY-1A I/O Subsystem Hardware We now describe briefly those features of the Cray I/O hardware needed to understand the discussion of sustainable I/O rates. A detailed description is found in Ref . 4. Figure 3 illustrates the topology for the Cray I/O components used at LASL. Other topologies can be used. Only a single channel is shown in this figure. The controller has two buffers, each a disk block (512 words) in length. This allows data transfer between disk/controller and channel/controller to proceed in parallel. In this configuration, data can flow from main memory to the disk at a sustained rate of 550,000 word/s within a disk cylinder and 500,000 word/s to multiple contiguous cylinders. Data flows between main memory and controller buffer at nearly 4,000,000 word/s. function cost that permits software to keep pace with hardware. To keep matters simple, we assume a task-processing cost equal to that of a system function (measurements confirm this within a factor of 1.5). With this assumption, maximum acceptable cost becomes: COST max (input) = (700 ys)/(22 functions + 3 system tasks) (5) 28 us. For output, the maximum acceptable cost is doubled due to the way the hardware operates. Similarly, the overhead for an unfragmented N block request can be determined. OVERHEAD = 2(6 + 2N) system calls + N data moves 3N interrupts (6) + 2 links + 3 system tasks. Again, equating the costs of the five functions above, overhead becomes The controller signals that I/O is complete as soon as data has been transferred between main memory and controller buffer memory. For sequential sector output requests, this allows a window of approximately 1.5 sector times (1.4 ms) of elapsed time after the controller signals 1/0 complete for the file system to output the next physical disk block. A longer delay causes a missed disk revolution. For input, approximately 0.75 sector times (0.7 ms) of elapsed time are allowed between completion of one sector and a request to input the next without missing a disk revolution. If we assume a value for processing time in each of the file system tasks while satisfying a request, we can determine the maximum value for an acceptable function cost, where acceptable means a system OVERHEAD = (8N + 17)Csf, where Csf is the cost of a system function. Percent overhead, that is, percentage of time the CPU is busy managing the I/O transaction, is then: % OVERHEAD = 100 x (OVERHEAD) /ELAPSED TIME. (7) For a continuous (consecutive sectors and no missed disk revolutions) stream of I/O requests, I/O ELAPSED TIME becomes Tds x N, where Tds is the time for a disk sector to pass under the read/write heads (in our case, 926 ms). MAIN | CHANNEL j ! CONTROLLER | | DISK | MEMORY i ! BUFFER 1 I ! BUFFER 2 i | DISK ! Fig. 3. Cray topology at LASL (1978) 324 Figure 4 is produced using values for N of 1, 18 and a range of values for Csf in Eq. (7). 140 120 PERCENT 100 OVERHEAD 80 60 40 18 "215 40 60 80 100 120 HO SYSTEM CALL COST IN us Fig. 4. Percent overhead for nonf ragmented I/O transactions. When we include a fragmentation factor, K, Eqs. (6) and (7) become: OVERHEAD = (8N + 8K + 9)Csf (8) and % OVERHEAD = 100 x (OVERHEAD) /ELAPSED TIME (9) Figure 5 is a reproduction of Fig. 4 adding values for K of 3 and 6. 140 120 PERCENT 100 OVERHEAD 80 60 40 20 40 60 80 100 120 140 SYSTEM CALL COST IN US Fig. 5. Percent overhead for fragmented I/O transactions for the case N = 18. Clearly Csf and K are important factors in both overhead and sustainable I/O rates. In the next section we present some initial measurements of Csf, K, and I/O overhead and rates. 4. Performance Data Gathering Demos supports both centralized (kernel) and localized (system task) data collection facilities. The kernel task collects information describing all system calls and interrupts in a circular main memory buffer. Any system or user task can extract the data from the buffer. System tasks collect summary information pertinent to their operation in their local memories. These tables can be read by programs running on a computer connected to one of the Cray channels. The kernel buffer data are used to reproduce a chronological trace of very detailed system activity at the expense of writing an enormous amount of data to disk with the attendant high cost overhead. Individual task data is used to determine parameter values associated with more global events, for example, number of physical disk I/O requests, number of active buffers. Both facilities are used to extract the data in the following sections . 4.1 Some Preliminary Results Early measurements gave average values for costs of messages, interrupts, data moves, and file system task processing of 120, 100, 75, and 150 \xs , respectively. Using these values and N=l, Eq. (6) gives: OVERHEAD = 16(120) + 1(75) + 3(100) + 2(120) + 3(150) = 2985 us. Clearly, from Eq. (5) the I/O software cannot keep pace with the I/O hardware; that is, the minimum time between completion of sector i and a software request for sector i+1 is over three sector times. With a stream of consecutive, single-sector requests, only a single sector would be read per disk revolution without some scheme for reading alternate or every n-th sectors. However, the hardware read-ahead feature in the disk controller permits a rate of two sectors of input per disk revolution. Thus, the predicted sustainable rate for single- sector reads becomes Rate (in) = 2/18 x maximum rate = 55,000 word/s. 325 From Eq. (7) the predicted percent overhead is % OVERHEAD = 100 x (2 x 2985)/l6667 = 36% . The dual write-behind features in the disk controller and buffer manager will allow transfer of four blocks per revolution (after the first) for single- block writes. During the first revolution, two blocks are buffered in the controller, a third is queued in the kernel, and some number (<= 15) are queued in the buffer manager. During subsequent revolutions, sufficient time exists for the block queued in the kernel and one block queued in the buffer manager to be written to disk and two more can be buffered in the controller. The predicted output rate is then Rate (out) = 4/18 x maximum rate = 110,000 word/s for single-block requests, and the predicted overhead is % OVERHEAD = 100(4 x 2985)/ 16667 = 72%, Measurements taken in October 1978 gave values of 53,000 word/s at 42% overhead for single-block input and 94,000 word/s at 73% overhead for single-block output. 4.2 A Simple Nonsolution One solution to the problem seemed obvious: large user requests. For large requests (for example, one disk cylinder--180 blocks) Eq. (6) predicts: OVERHEAD = 16027 us, and from Eq. (7) , % OVERHEAD = 100 x 16027/1666667 = 93%. This seems to suggest that at least large swaps or program fetches might be done effectively, because during a full memory swap or load no user program desires to use the CPU. Such swaps and loads are common in our production environment. Measure- ments show that the average swap is between 200,000 and 300,000 words and a significant fraction, 30% to 40%, are over 600,000 words. Large requests are always fragmented due to limited buffer manager buffer space (currently 18 disk blocks). Observed fragmentation factor values are in the range N/3 to N/4 for large N; that is, a 180-block request will probably be fragmented into 45 to 60 small requests within the file system. Equation (8) predicts that the resultant overhead will exceed 100% of the total CPU time. In this case, the I/O software will not be able to keep pace with the hardware, and disk revolutions will be missed. To test the predicted effect of fragmentation and overhead on large 1/0 transfers, a test was run that would first fragment the available buffer memory among a number of small (3-block) requests and then attempt large (180-block) trans- actions. Measurements showed a maximum sustained input rate of 250,000 word/s and output rate of 140,000 word/s. To understand how overhead and fragmentation produce these rates consider the following. Assume each 180-block request is fragmented into 60 equal sized requests of 3 blocks each. Further, note that system tasks are prioritized in increasing order from user to kernel drivers in Fig. 1. A summation of processing times required by the buffer manager, disk manager, and kernel implied by paths in Fig. 1 gives 3,120 \is required to process each 3-block request. This is slightly more than the elapsed time required to read three disk blocks (926 ys/block) . On input, hardware read-ahead covers for the slight excess and allows reading to proceed for a while; the system would eventually fall behind and miss a revolution. However, no time remains for communication between buffer manager and request interpreter. When all 18 blocks (6 requests of 3 blocks each) have been transferred, the buffer and disk manager run out of work. During the resulting missed disk revolution, the request interpreter completes the transaction and produces another set of six fragmented (3-block) requests in the buffer manager buffers. This gives an effective rate of half disk speed or 250,000 word/s. On output, hardware read-ahead is not available. The disk manager falls behind after three transfers to disk and, during the remainder of the revolution, is able to transfer two blocks to the controller buffers. The result is a rate of about five blocks per revolution or 5/18 of maximum hardware rate, that is, 150,000 word/s. 326 The average rate is well below the lower end of the anticipated 1/0 demand range of 20 to 60 Mbit/s and the overhead for these low rates is still unacceptable. A. 3 Initial Effective Solutions A simple solution was not evident for the fragmentation problem, so effort was concentrated on reducing other overhead. One improvement was obvious and simple: increase the bandwidth along paths m4 and m5 in Fig. 1. As a result of this increase, a single message can contain a request for multiple physically-contiguous disk sectors. This did not require implementation of request queueing in the kernel but did permit a reduction in the number of required interrupts from three to one for all but the first block of a request. For relatively large N, Eqs. (6) and (8) become: OVERHEAD = 16 messages + N data moves + N interrupts + 2 links + 3 tasks + (21 + 2N)Csf, (10) OVERHEAD = (4K + 2N + 13)Csf . (11) In addition some small amount of kernel code to manage messages was converted to assembly language, and the cost of a message was reduced from 120 ys to about 75 us. These improvements reduced average system function and overhead cost by about 25%. However, overhead would still be too high for a sequence of small requests to keep pace with the disk, and fragmentation should continue to prevent large requests from keeping pace. Tests confirmed the modest reduction in overhead, a modest increase in sustainable I/O rates, and that fragmentation could still disrupt large requests. Tests that attempted to avoid fragmentation produced more encouraging but still unsatisfactory rates. 4.4 More Complex Solutions Several solutions to the fragmentation and overhead problems were discussed. They included: (a) merge the three file system tasks, (b) increase the number of buffers, (c) modify the buffer manager to avoid fragmentation for large requests, (d) eliminate buffers and buffer manager, that is, 1/0 direct to user space, (e) treat selected 1/0 requests in a special way. Solution (a) was rejected for three reasons; it would take much work, would not address fragmentation, and would not reduce overhead to an acceptable level (only by another 20-25%). Solution (b) was politically unten- able; the system was already far too large. Solution (c) seemed attractive but all proposed schemes appeared artificial and kludgey . Solution (d) was undesirable since the buffer manager satisfied (on the average) 57% of all 1/0 requests and 67% of all small requests without resorting to physical disk I/O. These figures represent percent of total requests, not percent of total data volume. Furthermore, changes to the memory manager, swapper, etc., to lock user tasks with pending 1/0 at a fixed location in memory appeared difficult and counterproductive to system throughput. Solution (e) was chosen as the most desirable. The special treatment consisted of transferring the data between user memory, a kernel buffer, and the disk one block at a time. This required adding a new message path between the request interpreter and the disk manager, a buffer per channel in the kernel, and a small amount of code in each of the tasks. During data moves (done by the kernel), the current address of the task could be checked; thus no task locking was required. Only those user requests not satisfied by the buffer manager and that did not require further disk allocation were candidates for special treatment. Typically, these requests were swaps, program fetches (loads), and periodic snapshots by user codes of large tables. For these requests , fragmentation would be eliminated and overhead could be sharply reduced. Initial task communica- tion to set up the 1/0 transaction costs 14 system functions (now at a cost of slightly less than 100 ys average). Processing during the request costs an interrupt and a data move (about 175 ps); processing to clean up and terminate the request costs six system functions. 327 Figure 6 illustrates the predicted costs for processing an N block request using the new scheme or the standard. The standard curve shown assumes a fragmenta- tion factor of K = N/3. The new scheme is referred to here as buffer bypass, or bypass I/O. The cost for an N block request with no fragmentation is approxi- mately the same as the cost for an N block bypass request; only fragmentation has been eliminated, overhead remains. 4.5 Current Workload Several measured 24-hour periods show that about 60% of all I/O now is treated as bypass I/O. Another 10% to 15% of the I/O workload could be transferred from standard to bypass I/O by allocating a file's disk space when the file is created rather than when it is written. As an intermediate step, the file system could be modified to allocate large blocks of disk space when a file is extended. 14000 12000 COST IN 10000 MICROSECONDS 8000 6000 fragmented bypass 9 18 27 36 45 54 63 N, IN DISK BLOCKS Fig. 6. Relative cost of bypass and fragmented I/O. Figure 7 contains typical measured rates for various sized bypass requests. These measurements show that the system can now support I/O rates at the upper end of the projected range (60 X 10 bit/s). Overhead is about 20% for small requests (N ranges between 9 and 36 blocks) and decreases with increasing values of N. Overhead remains constant at about 15% once N exceeds 5 tracks (90 blocks). The overhead to support an I/O rate of 60 X 10 bit/s will be 30% if the requests are all large bypass requests. This 30% overhead reflects the fact that a program requiring this rate for a single input or output stream will have to merge input/output streams for files on two different channels . The average size of bypass I/O requests is 16 blocks, and for standard 1/0 the average size of a request is 4 blocks. 5. Remaining Problems Figure 7 demonstrates that fragmenta- tion and (as a byproduct) overhead are significantly reduced for large bypass I/O requests. However, rates are unacceptably low for small bypass requests, and both overhead and rates are unacceptable for standard requests. To understand the cause of low rates for small bypass requests, we again sum processing costs of the components . The cost of finishing one bypass request and starting another is 20 system functions and two system task times . The sum of CPU costs for this activity is slightly longer than the maximum window allowed in order to continue a sequence of I/O transactions without missing disk revolutions. Even though the overhead is low for large requests, (N >= 9), the sustained I/O rate will be approximately: Rate = maximum disk rate x Tcount/ (Tcount + 1), RATE (word/s) 500000 400000 300000 200000 100000 18 90 179 TRANSACTION SIZE (blocks) 358 Fig. 7. Observed I/O rates for bypass requests, 328 where Tcount is the number of tracks in the request. This relation only states that a track will be missed between successive requests. The effective rate for 1-track requests is 1/2 x maximum rate and for full cylinder (10-track) requests, 10/11 x maximum rate. 5.1 Some Solutions Several solutions have been discussed that should eliminate the missed revolutions between successive bypass requests. None have been implemented. Some of the solutions are: (a) merge some or all of the file system tasks, (b) implement single sector read-ahead in the kernel , (c) signal output complete prior to transmitting last block to disk, (d) increase the number of kernel buffers per channel. (e) convert more of the kernel to assembly language and further reduce the cost of a system call or interrupt. Solution (a) is attractive because it would eliminate some missed revolutions and would reduce overhead for standard I/O. However, a nontrivial amount of work would be required. Solution (b) would only eliminate missed revolutions for bypass input requests. A combination of (b) and (c) would eliminate missed revolutions for bypass input and output requests. The drawback is that read-ahead unnecessarily ties up the channel and disk. Notification prior to actual I/O completion would allow hardware errors to go unreported to users. Solution (d) is attractive because it is easy to implement and, coupled with (b), would eliminate missed revolutions for all bypass requests and for some standard requests. It would cost a small amount of trivial code and additional buffer space. It has the same drawbacks as solutions (b) and (c). that message, link and interrupt costs could be reduced to about 50 ys with a moderate recoding effort. A result of (b) would be to widen the input window (see Sec. 2.6) to be equal to that of the output window, that is, 1.4 ms. 5.2 Predicted Effects of Modifications Using the predicted cost of system functions, the cost of a standard I/O request becomes Cost (standard) = 10 system calls + N data moves + (3+(N-l)) interrupts + 3 tasks. The predicted cost of a bypass I/O request with the proposed modifications and times is Cost (bypass) = 11 system calls + N data moves + (3+(N-l)) interrupts + 2 tasks. The costs of standard and bypass I/O are both reduced to approximately the same value. Figure 8 shows predicted percent overhead for bypass or standard requests, 21 20 19 V E 18 R H 17 E A 16 D 15 14 13 12 18 27 36 45 54 63 72 A combination of solutions (a), (b), and (e) probably represents the best approach from a performance and reliability standpoint; however, it also represents the most work and departs farthest from the original modular design. Results of (a) and (e) would be to reduce both overhead and processing time between successive requests. One current estimate suggests Fig. 8. Predicted percent overhead for I/O transactions. An important point with the proposed modifications and times is that the processing done between the end of one request and start of another excludes the 329 costs of all but one data move and one interrupt. This places the inter-request processing time well within the 1.4-ms window for both standard and bypass requests. The software should be able to keep pace with the hardware and, as Fig. 8 shows, the overhead becomes excessive only for very small requests. 6.0 Conclusions It continues to be our experience that when things go wrong with the performance of a system there will be numerous educated guesses as to the cause and correction required; most of them will be wrong or ineffective. This paper provides a detailed case study of how one software project profited from a performance measurement and analysis effort that began in early design and was continued through development, testing, and production use of the software. We believe that the results reported here are typical of those for any large-scale software development project that incorporates a comprehensive performance measurement and analysis component. Acknowledgments We thank all members of the LASL DEMOS development staff and in particular Jim Clifford and John Montague for providing insight into DEMOS organization and operation. We also thank the CRI on-site engineers for their help in understanding the details of hardware functions. References 1. M. L. Powell, The DEMOS File System, Proceedings of the Sixth Symposium on Operating System Principles , November 1977, 33-42. 2. F. Baskett, J. H. Howard, and J. T. Montague, Task Communication in DEMOS, Proceedings of the Sixth Symposium on Operating System Principles , November 1977, 23-31. 3. R. S. Fabry, Capability-Based Addressing, CACM , 17, 7 (July 1974), 403-412. 4. Cray Research, Inc., CRAY-1 Computer System Reference Manual , Publication 2240004 (1977). 330 NBS-114A (REV. 9-78) U.S. DEPT. OF COMM. BIBLIOGRAPHIC DATA SHEET 1. PUBLICATION OR REPORT NO. NBS SP 500-65 a.Gov't Accession No* 3. Jfeclptertt's Accession No. .-: : :-:-:-: : .-:-:-.-.-.-.-.-:'.-:'. .-: : : : . . : :-.:". ; :'. :-. . : .".'. .'■■: ; .V; : .y: : : ; : : : :;.;■: .■ -■:-.:- '-.- :'. . : --: :-.'. . : . . .y:-;-. :■ 4. title and subtitle COMPUTER SCIENCE & TECHNOLOGY: Proceedings of the Sixteenth Meeting of the Computer Performance Evaluation Users Group (CPEUG) 5. Publication Date October 1980 6. Performing Organization Code 7. AUTHOR(S) Dr. Harold J. Highland, Editor 8. Performing Organ. Report No. 9. PERFORMING ORGANIZATION NAME AND ADDRESS NATIONAL BUREAU OF STANDARDS DEPARTMENT OF COMMERCE WASHINGTON, DC 20234 l% aroj^ci/task/Wo^ iferif No. 11. Contract/Grant No. 12. SPONSORING ORGANIZATION NAME AND COMPLETE ADDRESS (Street, City, state, zip) Same as No. 9, 13. Type of Report & Period Covered Final soring Agency Code 15. SUPPLEMENTARY NOTES | | Document describes a computer program; SF-185, FIPS Software Summary, is attached. 16. ABSTRACT (A 200-word or less factual summary o( most significant information. It document includes a significant bibliography or literature survey, mention it here,) The Proceedings record the papers that were presented at the Sixteenth Meeting of the Computer Performance Evaluation Users Group (CPEUG 80) held October 20-23, 1980, in Orlando, Florida. With the theme "CPE Trends in the 80's," CPEUG 80 focused on new applications that are expected to grow in the 80's and changes that may occur in traditional areas during the 80's. The program was divided into two parallel sessions and included technical papers on previously unpublished work, case studies, tutorials, and panels. Technical papers are presented in the Proceedings in their entirety. 17. KEY WORDS (six to twelve entries; alphabetical order; capitalize only the Jirst letter of the first key word unless a proper name; # separated b y semicolons) Benchma no ng ; capacity planning; computer performance evaluation; computer performance measurement; computer performance prediction; computer system acquisition; CPE in auditing; installation management; on-line system evaluation; queuing models; simulation; workload definition. 18. AVAILABILITY [K] Unlimited I I For Official Distribution. Do Not Release to NTIS PT1 Order From Sup. of Doc, U.S. Government Printing Office, Washington, DC 20402 I I Order From National Technical Information Service (NTIS), Springfield, VA. 22161 19. SECURITY CLASS (THIS REPORT) UNCLASSIFIED 20. SECURITY CLASS (THIS PAGE) UNCLASSIFIED 21. NO. OF PRINTED PAGES 316 22. Price $8.00 USCOMM-DC NBS TECHNICAL PUBLICATIONS PERIODICALS JOURNAL OF RESEARCH— The Journal of Research of the National Bureau of Standards reports NBS research and develop- ment in those disciplines of the physical and engineering sciences in which the Bureau is active. These include physics, chemistry, engineering, mathematics, and computer sciences. Papers cover a broad range of subjects, with major emphasis on measurement methodology and the basic technology underlying standardization. Also included from time to time are survey articles on topics closely related to the Bureau's technical and scientific programs. As a special service to subscribers each issue contains complete citations to all recent Bureau publications in both NBS and non- N3S media. Issued six times a year. Annual subscription: domestic $1 y. foreign S 1 6.25 Single copy, $3 domestic; $3.75 foreign. NOTE The Jourral was formerly published in two sections: Sec- tion A "Physics and Chemistry" and Section B "Mathematical Sciences." DIMENSIONS, NBS — This monthly magazine is published to in- form scientists, engineers, business and industry leaders, teachers, students, and onsur ers. of the latest advances in science and technology, with p imary emphasis on work at NBS. The magazine highlights and reviews such : ssu^ cs energy research, fire protec- tion, building technology :or srston, pollution abatement, health and safety, and cg net j J -:t performance In addi- tion, it reports the results of Su.eau programs in measurement standards and techniques, properties of matter and materials, engineering standards and services, instrumentation, and automatic data processing Annual subscription, domestic SI I; foreign $ i 3.75. MOMPERIODICALS Monographs — Major contributions to the technical literature on various subjects related to the Bureau's scientific and technical ac- tivities Handbooks — Recommended codes of engineering and industrial practice (including safety codes) <1t^eicpcd in cooperation with in- terested industries, pruicssiondi organizations, and regulatory bodies. Special Publications — Include proceedings of conferences spon- soied by NBS, NBS annual reports, and other special publications appropriate to this grouping such as wall charts, pocket cards, and bibliographies. Applied Mathematics Series — Mathematical tables, manuals, and studies of special interest to physicists, engineers, chemists, biologists, mathematicians, computer programmers, and others engaged in scientific and technical work. National Standard Reference Data Series-Provide:, quantitative data on the physical and chemical properties of materials, com- piled from the world's literature and critically evaluated Developed under a worldwide program coordinated by NBS under the authority of the National Standard Data Act (Public Law 90-396). NOTE: The principal publication outlet for the foregoing data is the Journal of Physical and Chemical Reference Data (JPCRD) published quarterly for NBS by the American Chemical Society (ACS) and the American Institute of Physics (AlP). Subscriptions, reprints, and supplements available from ACS, 1 155 Sixteenth St., NW, Washington, DC 20056. Building Science Series — Disseminates technical information developed at the Bureau on building materials, components, systems, and whole structures. The series presents research results, test methods, and performance criteria related to the structural and environmental functions and the durability and safety charac- teristics of building elements and systems. Technical Notes — Studies or reports which are complete in them- selves but restrictive in their treatment of a subject. Analogous to monographs but not so comprehensive in scope or definitive in treatment of the subject area. Often serve as a vehicle for final reports of work performed at NBS under the sponsorship of other government agencies. Voluntary Product Standards — Developed under procedures published by the Department of Commerce in Part SO, Title 15, of the Code of Federal Regulations. The standards establish nationally recognized requirements for products, and provide all concerned interests with a basis for common understanding of the characteristics of the products. NBS administers this program as a supplement to the activities of the private sectoi standardizing organizations. Consumer Information Series— Practical information, based on NBS research and experience, ccvering areas of interest to the con- sumer. Easily understandable language and illustrations provide useful background knowledge for shopping in today's tech- nological marketplace. Order the tabove NBS publications from: Superintendent oj Docu- ments, Government Printing Of/ice, Washington, DC 20402. Order the following NBS publications — FIPS and NBSIR's — from the National Technical Information Service;, Springjield. VA 22/61 . Federal Information Processing Standards Publications (FIPS PUB) — Publications in this series collectively constitute the Federal Information Processing Standards Register. The Register serves as the official source of information in the Federal Govern- ment regarding standards issued by NBS pursuant to the Federal Property and Administrative Services Act of 1949 as amended. Public Law 89-306 (79 Stat. 1127), and as implemented by Ex- ecutive Order 11717(38 FR 12315, dated May II, !973)and Parte of Title 15 CFR (Code of Federal Regulations). NBS Interagency Reports (NfeSlR) — A special series of interim or final reports on work performed by NBS for outside sponsors (both government and non-government) In general, initial dis- tribution is handled by the sponsor, pubhc distribution is by the National Technical information Services, Springfield, VA 22161, in paper copy or microfiche form BIBLIOGRAPHIC SUBSCRIPTION S! The following current-awareness and literature-survey bibliographies are issued periodically by the Bureau: Cryogenic Data Center Current Awareness Service. A literature sur- vey issued biweekly. Annual subscription' domestic S35; foreign $45. Liquefied Natural Cas. A literature survey issued quarterly Annual subscription: $30. Superconducting Devices and Materials. A literature survey issued quarterly. Annual subscription: $45. Please send subscription or- ders and remittances for the preceding bibliographic services to the National Bureau of Standards, Cryogenic Data Center (736) Boulder, CO 80303. U.S. DEPARTMENT OF COMMERCE National Bureau of Standards Washington, D.C. 20234 OFFICIAL BUSINESS Penalty for Private Use. $300 J PE| W I STATE UNIVERSITY LIBRARIES ADDDD7n53iQ^ l£2L SPECIAL FOURTH-CLASS RATE BOOK