NBS Special Publication 500-64
U.S. DEPARTMENT OF COMMERCE
National Bureau of Standards
NATIONAL BUREAU OF STANDARDS
The National Bureau of Standards 1 was established by an act of Congress on March 3, 1901.
The Bureau's overall goal is to strengthen and advance the Nation's science and technology
and facilitate their effective application for public benefit. To this end, the Bureau conducts
research and provides: (1) a basis for the Nation's physical measurement system, (2) scientific
and technological services for industry and government, (3) a technical basis for equity in
trade, and (4) technical services to promote public safety. The Bureau's technical work is per-
formed by the National Measurement Laboratory, the National Engineering Laboratory, and
the Institute for Computer Sciences and Technology.
THE NATIONAL MEASUREMENT LABORATORY provides the national system of
physical and chemical and materials measurement; coordinates the system with measurement
systems of other nations and furnishes essential services leading to accurate and uniform
physical and chemical measurement throughout the Nation's scientific community, industry,
and commerce; conducts materials research leading to improved methods of measurement,
standards, and data on the properties of materials needed by industry, commerce, educational
institutions, and Government; provides advisory and research services to other Government
agencies; develops, produces, and distributes Standard Reference Materials; and provides
calibration services. The Laboratory consists of the following centers:
Absolute Physical Quantities 2 — Radiation Research — Thermodynamics and
Molecular Science — Analytical Chemistry — Materials Science.
THE NATIONAL ENGINEERING LABORATORY provides technology and technical ser-
vices to the public and private sectors to address national needs and to solve national
problems; conducts research in engineering and applied science in support of these efforts;
builds and maintains competence in the necessary disciplines required to carry out this
research and technical service; develops engineering data and measurement capabilities;
provides engineering measurement traceability services; develops test methods and proposes
engineering standards and code changes; develops and proposes new engineering practices;
and develops and improves mechanisms to transfer results of its research to the ultimate user.
The Laboratory consists of the following centers:
Applied Mathematics — Electronics and Electrical Engineering 2 — Mechanical
Engineering and Process Technology 2 — Building Technology — Fire Research —
Consumer Product Technology — Field Methods.
THE INSTITUTE FOR COMPUTER SCIENCES AND TECHNOLOGY conducts
research and provides scientific and technical services to aid Federal agencies in the selection,
acquisition, application, and use of computer technology to improve effectiveness and
economy in Government operations in accordance with Public Law 89-306 (40 U.S.C. 759),
relevant Executive Orders, and other directives; carries out this mission by managing the
Federal Information Processing Standards Program, developing Federal ADP standards
guidelines, and managing Federal participation in ADP voluntary standardization activities;
provides scientific and technological advisory services and assistance to Federal agencies; and
provides the technical foundation for computer-related policies of the Federal Government.
The Institute consists of the following centers:
Programming Science and Technology — Computer Systems Engineering.
1 Headquarters and Laboratories at Gaithersburg, MD, unless otherwise noted;
mailing address Washington, DC 20234.
2 Some divisions within the center are located at Boulder, CO 80303.
COMPUTER SCIENCE & TECHNOLOGY:
DATA BASE DIRECTIONS—
The Conversion Problem
Proceedings of the Workshop of the
National Bureau of Standards and the
Association for Computing Machinery,
held at Fort Lauderdale, Florida,
November 1 - 3, 1977
John L. Berg, Editor
Center for Programming Science and Technology
Institute for Computer Sciences and Technology
National Bureau of Standards
Washington, D.C. 20234
Daniel B. Magraw, General Chairperson
Working Panel Chairpersons:
Milt Bryce, James H. Burrows,
James P. Fry, Richard L. Nolan
Sponsored by:
acm
National Bureau of Standards
Association for Computing Machinery
U.S. DEPARTMENT OF COMMERCE, Philip M. Klutznick, Secretary
Luther H. Hodges, Jr., Deputy Secretary
Jordan J. Baruch, Assistant Secretary for Productivity, Technology and Innovation
NATIONAL BUREAU OF STANDARDS, Ernest Ambler, Director
Issued September 1980
Reports on Computer Science and Technology
The National Bureau of Standards has a special responsibility within the Federal
Government for computer science and technology activities. The programs of the
NBS Institute for Computer Sciences and Technology are designed to provide ADP
standards, guidelines, and technical advisory services to improve the effectiveness of
computer utilization in the Federal sector, and to perform appropriate research and
development efforts as foundation for such activities and programs. This publication
series will report these NBS efforts to the Federal computer community as well as to
interested specialists in the academic and private sectors. Those wishing to receive
notices of publications in this series should complete and return the form at the end of
this publication.
National Bureau of Standards Special Publication 500-64
Nat. Bur. Stand. (U.S.), Spec. Publ. 500-64, 178 pages (Sept. 1980)
CODEN: XNBSAV
Library of Congress Catalog Card Number: 80-600129
U.S. GOVERNMENT PRINTING OFFICE
WASHINGTON: 1980
For sale by the Superintendent of Documents, U.S. Government Printing Office,
Washington, D.C. 20402 - Price $5.50
TABLE OF CONTENTS
Page
1. INTRODUCTION 3
1.1 THE FIRST DATA BASE DIRECTIONS WORKSHOP 3
1.2 PLANNING FOR SECOND CONFERENCE 4
1.3 DATA BASE DIRECTIONS II 4
1.4 CONCLUSION 5
2. EVOLUTION IN COMPUTER SYSTEMS 7
2.1 QUESTIONS 7
2.2 HARDWARE CHANGES 9
2.3 SOFTWARE CHANGES 10
2.4 EVOLUTIONARY APPLICATION DEVELOPMENT 11
2.5 MIGRATION TO A NEW DBMS 13
3. ESTABLISHING MANAGEMENT OBJECTIVES 19
3.1 OVERVIEW 20
3.2 CONVERSION TO A DATA BASE ENVIRONMENT 24
3.2.1 Impact On the Application Portfolio 25
3.2.2 Impact On the EDP Organization 29
3.2.3 Impact On Planning and Control Systems 35
3.2.4 Impact of Conversion On User Awareness 38
3.3 MINIMIZING THE IMPACT OF FUTURE CONVERSIONS .. 39
3.3.1 Institutionalization of the DBA Function 39
3.3.2 DBMS Independent Data Base Design 40
3.3.3 Insulate Programs From the DBMS 40
3.4 CONVERSION FROM ONE DBMS TO ANOTHER DBMS 41
3.4.1 Reasons for Conversion 42
3.4.2 Economic Considerations 43
3.4.3 Conversion Activities and Their Impact 44
3.4.4 Developing a Conversion Strategy 48
-iii-
3.5 SUMMARY 50
4. ACTUAL CONVERSION EXPERIENCES 53
4.1 INTRODUCTION 54
4.2 PERSPECTIVES 55
4.3 FINDINGS 57
4.3.1 Industrial/Governmental Practices 58
4.3.2 The First DBMS Installation 60
4.3.3 Desired Standards 61
4.3.4 Desired Technology 61
4.4 TOOLS TO AID IN THE CONVERSION PROCESS 62
4.4.1 Introduction 62
4.4.2 Changing From Non-DBMS To DBMS 62
4.4.3 Changing From One DBMS To Another 64
4.4.4 Changing Hardware Environment 66
4.4.5 Centralized Non-DBMS--Distributed DBMS 66
4.4.6 Centralized DBMS--Distributed DBMS 67
4.5 GUIDELINES FOR YOUR FUTURE CONVERSIONS 67
4.5.1 General Guidelines 68
4.5.2 Important Considerations 68
4.5.3 Tight Control 69
4.5.4 Precise Planning/pre-planning 69
4.5.5 Important Actions 70
4.6 REPRISE 73
4.7 ANNEX: CONVERSION EXPERIENCES 73
4.7.1 Conversion: File To DBMS 73
4.7.2 Conversion: Manual Environment To DBMS 79
4.7.3 Conversion: Batch File System To a DBMS 84
4.7.4 Conversion: DBMS-1 To DBMS-2 87
5. STANDARDS 93
5.1 INTRODUCTION 93
5.1.1 Objectives 93
5.1.2 What Is a Standard? 94
5.1.3 Background 94
5.2 POTENTIAL BENEFITS THROUGH STANDARDIZATION ... 97
5.3 SOFTWARE COMPONENTS IN CONVERSION PROCESS 98
-iv-
5.3.1 Scenario 1 98
5.3.2 Scenario 2 99
5.3.3 Scenario 3 100
5.3.4 Scenario 4 100
5.3.5 Miscellaneous Standards Necessary 100
5.3.6 Non-software Components Necessary 100
5.4 RECOMMENDATIONS 101
5.4.1 The Development of a Standard DBMS 101
5.4.2 Generalized Dictionary/directory System 101
5.5 CONCLUSION 103
5.6 REFERENCES 103
6. CONVERSION TECHNOLOGY--AN ASSESSMENT 105
6.1 INTRODUCTION 106
6.1.1 The Scope of the Conversion Problem 106
6.1.2 Components of the Conversion Process 107
6.2 CONVERSION TECHNOLOGY 109
6.2.1 Data Conversion Technology 110
6.2.2 Application Program Conversion 120
6.2.3 Prototype Conversion Systems Analysis 129
6.3 OTHER FACTORS AFFECTING CONVERSION 138
6.3.1 Lessening the Conversion Effort 138
6.3.2 Future Technologies/Standards Impact 145
BIBLIOGRAPHY 150
PARTICIPANTS 161
-v-
PREFACE
In 1972 the National Bureau of Standards (NBS) and the Association
for Computing Machinery (ACM) initiated a series of workshops and
conferences which they jointly sponsored and which treated issues
such as computer security, privacy, and data base systems. The
three-day Workshop, DATA BASE DIRECTIONS--The Conversion
Problem, reported herein continues that series. This Workshop
was held in Fort Lauderdale, Florida, on November 1-3, 1977, and is
the second in the DATA BASE DIRECTIONS series. The first, DATA
BASE DIRECTIONS--The Next Steps, received wide circulation and,
in addition to publication by NBS, was published by ACM's Special
Interest Group on Management of Data and Special Interest Group on
Business Data Processing, the British Computer Society in Europe,
and excerpted by IEEE and Auerbach.
The purpose of the latest Workshop was to bring together leading
users, managers, designers, implementors, and researchers in
database systems and conversion technology in order to provide
useful information for managers on the possible assistance
database management systems may give during a conversion
resulting from an externally imposed system change.
We gratefully acknowledge the assistance of all those who made the
Workshop's results possible.
Director, Center for Programming
Science and Technology
Institute for Computer Sciences and
Technology

-vi-
A MANAGEMENT OVERVIEW
To a manager, conversion answers the question, "How do I
preserve my investment in existing data and programs in the face
of inevitable changes?" Selection of conversion as a solution
depends directly on issues of cost, feasibility, and risk. Since
change is inevitable, prudent managers must consider
preparations that ease inevitable conversions. How does a
manager choose a course of action?
When Mayford Roark, Executive Director of Systems for the
Ford Motor Company, and Keynoter of the Workshop, sought an
analogue for examining DBMS and the conversion problem, he aptly
selected the idea of "mapping a jungle." Reporting on his
experience, he noted that 90% of his computers were changed within
three to five years and major software changes from the vendor
occurred somewhere between five and ten years after acquisition.
These forced changes coupled with the organization's
changing requirements led Roark to a basic point: "evolutionary
change is the natural state of computer systems." In short, the
ADP manager's continual task is to "manage change."
Roark's managers experienced the classic benefits of DBMS:
quicker response to changing requirements, easier new
application development, and new capabilities not possible in the
earlier systems, but Roark summarized his conversion experience
within a DBMS environment in this way:
. hardware changes — having a DBMS was a moderate to major
burden.
. software changes — dependent on the circumstances, having
a DBMS ranged from negligible impact to a major burden.
. evolutionary application change — having a DBMS was a
moderate boon. It proved effective, but very expensive.
Considering even the conversion to a DBMS, Roark scores this
process a moderate burden because of the risks and costs
associated with DBMS.
-vii-
He emphasized this point by describing the major DBMS
need as "easy-to-use, easy-to-apply, and inexpensive
approaches for upgrading" two decades of computer files to a
data base environment.
Given this charge, how did the workshop respond?
Consideration of the conversion to a DBMS led to
several specific caveats intended to control the risk
inherent in such a step.
Though conversion to a DBMS requires careful
preplanning and may not be appropriate for every
application, managers should consider data base
technology an inevitable development thrusting
itself on future data processing installations. A
manager will have no choice but to face this
decision eventually.
The first DBMS application can make or break the
success of the conversion. The new system's users
must have a receptive disposition which results only
from careful preparation, preplanning, and the
application of basic management skills. The initial
application plays an important tutorial role for
everyone in the organization including such subtle
lessons as:
--whether management is truly committed to the new
system by supporting it with adequate resources,
planning for the continuous support of the system,
and applying the necessary managerial cross-
department discipline.
--whether the installation technical staff truly
appreciates user needs, can adjust to user changes,
and has the necessary skills and backing to carry
off the task.
--whether any of the DBMS conversion proponents
have accurate estimates of the costs, the proper
tools to use, and a feasible conversion plan
expressed in terms that satisfy all risk sharers.
-viii-
While the staff must know the new
technology, they must not conclude that the
new technology relieves them of the old
project management controls that all new
systems require. Tight planning, management
control, cost monitoring, contingency
approaches, user review, and step-wise
justifications must be used.
No "final conversion" exists--planning for
the next one begins now! Prepare your
system to permit evolutionary change to
enhanced technology--including improvements
to DBMS.
How will future technology help managers?
Hardware development, particularly the
proliferation of mini- and micro-computers,
networks, and large mass storage, will increase the
need for generalized conversion tools. On the
other hand, dropping hardware costs will make
conversion increasingly more acceptable.
Additionally, special DBMS machines which promote
logical level interfacing will simplify the
conversion process. Major advances in improved
data independence through software design will
also simplify but never eliminate the conversion
problem. Of special concern to managers: user
demand for several data models will continue.
In the next five years, managers can expect to
see more operational generalized conversion tools
but certainly not full automation of the process.
Significantly, standards design and acceptance by
vendors will play a major role in the success of
generalized conversion tools. Commercially
available tools for data base conversion seem likely
in ten years but the conversion of application
programs is not likely to have a generalized
solution in the next five years.
Standards address several manager needs in the
conversion process. A standard DBMS would
considerably ease the future conversions involving a
DBMS. A standard data dictionary/directory would
facilitate all conversions. This latter point
emphasizes that the data dictionary/directory can
stand apart from data base systems and, therefore,
can assist the conversion to a first DBMS.
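The panel's point that a data dictionary/directory describes data independently of any DBMS can be sketched in modern terms. The entry fields below are illustrative assumptions, not the standard the report calls for:

```python
from dataclasses import dataclass

# Hypothetical minimal data dictionary/directory entry. The dictionary
# part records logical meaning; the directory part records location.
@dataclass
class DictionaryEntry:
    name: str          # logical data element name
    description: str   # business meaning, independent of any DBMS
    data_type: str     # logical type, e.g. "character(5)"
    location: str      # directory part: where the element is stored

# Because an entry is defined without reference to any DBMS, the same
# dictionary can describe flat files today and drive a later
# conversion into a first DBMS.
entry = DictionaryEntry("part_no", "catalog part number",
                        "character(5)", "PARTS master file")
print(entry.name, "->", entry.location)
```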
-ix-
A standard data interchange format would ease
considerably the loading and unloading of data and
thus facilitate the development of generalized
conversion tools. Manufacturer acceptance of the
standard format would permit their development of
convertors from the standard form to their system, a
boon to managers either forced to or desirous of
considering a different system.
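The unload/load path through a neutral format that this paragraph envisions can be sketched as follows. The record layout, field names, and the use of CSV as the carrier are illustrative assumptions, not part of any interchange standard the report discusses:

```python
import csv
import io

# Sketch of the panel's idea: each system needs only an "unloader" to
# a neutral interchange format and a "loader" from it, instead of a
# pairwise converter for every source/target pair.

def unload(records, fieldnames):
    """Dump records from a source system into the neutral format."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def load(text):
    """Read neutral-format data into a target system's record list."""
    return list(csv.DictReader(io.StringIO(text)))

source_records = [{"part_no": "A-100", "qty": "12"},
                  {"part_no": "B-205", "qty": "3"}]
interchange = unload(source_records, ["part_no", "qty"])
assert load(interchange) == source_records  # round trip is lossless
```

With N systems, converters to and from the neutral form replace N*(N-1) pairwise converters with 2*N, which is the economic argument for a standard format.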
Similarly, standardization of the terminology
used in data base technology, convergence of
existing DBMS to using common functions, and use by
DBMS of a micro-language or standard set of atomic
functions would assist managers in dealing with
conversion from DBMS to DBMS.
In summary: in the next five to ten years
managers must depend on existing good management
practices rather than wait for automated conversion
tools. Standards, as a management-exerted
discipline, will facilitate conversions but users
can expect reluctant acceptance of standards.
-x-
DATA BASE DIRECTIONS —
The Conversion Problem
John L. Berg, Editor
ABSTRACT
What information can help a manager assess
the impact a conversion will have on a data base
system, and of what aid will a data base system be
during a conversion? At a workshop on the data
base conversion problem held in November 1977
under the sponsorship of the National Bureau of
Standards and the Association for Computing
Machinery, approximately seventy-five participants
provided the decision makers with useful data.
Patterned after the earlier Data Base Direc-
tions workshop, this workshop, Data Base
Directions — The Conversion Problem, explores data
base conversion from four perspectives: manage-
ment, previous experience, standards, and system
technology. Each perspective was covered by a
workshop panel that produced a report included
here.
The management panel gave specific direction
on such topics as planning for data base conver-
sions, impacts on the EDP organization and appli-
cations, and minimizing the impact of the present
and future conversions. The conversion experience
panel drew upon ten conversion experiences to com-
pile their report and prepared specific checklists
of "do's and don'ts" for managers. The standards
panel provided comments on standards needed to
support or facilitate conversions and the system
technology panel reports comprehensively on the
systems and tools needed — with strong recommenda-
tions on future research.
Key words: Conversion; Data Base; Data
Description; Data Dictionary; Data Directory;
DBMS; Languages; Data Manipulation; Query.
-1-
1. INTRODUCTION
Daniel B. Magraw
GENERAL CHAIRMAN
Biographical Sketch
Daniel B. Magraw is Assistant Commissioner,
Department of Administration, State of Minnesota.
For nearly ten years he has been responsible for
all aspects of the State of Minnesota information
systems activities. His more than thirty years'
experience in systems divides almost equally
between the private and public sectors.
A frequent contributor to professional
activities, he was one of the founders and is a
past president of the National Association for
State Information Systems. He taught courses in
Systems for 22 years in the University of
Minnesota Extension Division and he has been a
frequent speaker on many matters relating to
information systems and has been deeply involved
with both Federal and state data security and
privacy legislation.

He was keynote speaker at the 1975 Data Base
Directions Conference.
1.1 THE FIRST DATA BASE DIRECTIONS WORKSHOP
In late October, 1975, a Workshop entitled "Data Base
Directions: The Next Steps" was held in Fort Lauderdale, Florida.
Resulting from a proposal brought to Seymour Jeffery at the
National Bureau of Standards by Richard Canning and Jack Minker,
the workshop was sponsored jointly by the Association for
Computing Machinery and NBS. The product of the intensive two and
a half day effort was a series of panel reports which, as
subsequently edited, were issued under the title of the workshop
as NBS Special Publication 451.
-3-
As early as December, 1975, suggestions were made to
ACM and NBS concerning the desirability of one or more
future conferences on the same general topic. These
suggestions were based on the belief that data base systems
will grow increasingly in importance and pervasiveness and
were supported by perceptions, even prior to issuance of NBS
SP 451, that the workshop had more than met expectations.
The report was issued in 1976; it was generally thought
to be a valuable contribution to several audiences including
top management, EDP management, data base managers, and the
industry. It has had an unusually wide circulation for
reports of this nature.
1.2 PLANNING FOR SECOND CONFERENCE
In February, 1977, ACM and NBS decided in favor of a
second data base workshop and invited me to serve as General
Chairman. An initial planning group was established
consisting of Dick Canning and Jack Minker representing ACM,
John Berg representing NBS, and the chairman. The planning
group entitled the Conference "Data Base Directions II--The
Conversion Problem."
Four working panel subjects were selected. Panel
chairmen were recruited and became members of the planning
group. Subject matter coverage for each panel was specified
by the planning group. Each panel chairman then selected
members of his working panel. In addition, the planning
group specified that the format and procedure for the
workshop should follow closely that of the first workshop.
As in the 1975 workshop, attendance was by invitation only.
Work was done by each panel prior to arriving at the
workshop.
1 .3 DATA BASE DIRECTIONS II
The workshop was held November 1-3, 1977, in Fort
Lauderdale, Florida. There were approximately 75 in
attendance. Mayford Roark, Ford Motor Company, gave an
excellent keynote address in plenary session, reproduced
herein. Final instructions were given as to objectives of
the workshop, and all panels were in full operation by 10:30
a.m., November 1. From then until the closing plenary
session, each of the four panels met separately with the
ultimate purpose of developing a consensus among its members
on which to base the panel report.
-4-
Each working panel chairman was expected to guide the
group discussion and to assure preparation of a good rough
draft report prior to the closing plenary session. Though
each panel chairman organized his panel in a form best for
his subject, each followed a general pattern. A recorder
was selected from among each panel's members to maintain
"minutes" in visual form on flip chart sheets. These were
displayed so that the principal discussion and consensus
points were visible.
One of the panels had done extensive drafting of report
segments prior to the beginning of the workshop. Each of
the other three accomplished the objective of rough draft
preparation by developing detailed outlines. Portions of
the outlines were assigned to individual panel members or to
two person teams for draft preparation. When time
permitted, drafts were reviewed by others on the same panel
prior to their incorporation into the final draft.
Following the completion of the workshop, the drafts
were put into "final" form and circulated to panel members
for final comment prior to submission to the proceedings
editor.
Communication among the four panels was maintained
mainly through the panel chairmen and members of the
planning group who circulated among the panels. At the
closing plenary session, each working panel chairman
presented a twenty minute summary of his panel's findings
and responded to questions or comments from the floor.
1.4 CONCLUSION
The mix of participants with their differing
perspectives, experiences, and technical expertise provides
a base of knowledge in data base systems that may be
unparalleled. The output from that group should be of
material value, especially to those decision-makers across
the country carrying responsibility for data base futures.
It is hoped that this publication can have an even greater
impact than did Data Base Directions I because it is based
on two more years of extensive experience and because it
attempted to be more direct and pointed in its advice to the
decision-makers. The real value of Data Base Directions
II, however, will be directly proportional to the assistance
that this document provides to the several classes of users.
-5-
EVOLUTION IN COMPUTER SYSTEMS
Mayford L. Roark
KEYNOTER
Biographical Sketch
Mayford L. Roark is Executive Director of
Systems for the Ford Motor Company in Dearborn,
Michigan. He has been in charge of the corporate
systems function at Ford since 1965, as Assistant
Controller and later as Director of the Systems
Office before assuming his present position in
1973. He joined the Ford Division of Ford Motor
Company in 1952 as Senior Financial Analyst, and
managed various financial departments at Ford from
1955-1965.

Mr. Roark was a Budget Examiner at the U.S.
Bureau of the Budget from 1947 to 1952.
Previously he was with the U.S. Weather Bureau and
the Colorado Department of Revenue.
He studied at the University of Colorado,
receiving the B.A. "Magna Cum Laude" degree
(Economics), and the M.S. (Public Administration).
He is a member of Phi Beta Kappa.
2.1 QUESTIONS
When I was asked to address this group, with its theme
of "The Conversion Problem," I felt some puzzlement about
the thrust of the meeting. Was there a concern that data
base systems might be something of a special burden for the
organization faced with a conversion problem? Or, rather,
was there the hope that the data base system might be
something like "Seven League Boots," for making otherwise
tough conversions into happy and speedy journeys? Or, was
there a lingering horror that the data base systems
themselves might turn out to be conversion nightmares as
improved DBMS resources become available?
-7-
As the working outlines began to arrive through the
mail, it became clear that all of these questions were
concerns. As Jim Fry's statement put it, "The basic problem
we are addressing is the need to migrate data and
applications due to the introduction of new, or change in
existing, requirements, hardware, and/or software systems."
In short, it appears that your intent is to map a jungle.

I won't try to preempt the work of the individual
panels by offering any detailed map of my own. Rather, as a
frequent traveler through this jungle, my traveler's notes
may be of some help to your individual survey parties. My
comments will take the form of generalizations and
impressions. As you survey a topic in depth, you may well
conclude that some of these were inaccurate. This has
certainly been the usual result in geographic history, as
early detailed surveys refine the rough findings of early
travelers.
As an initial generalization, let me categorize the
conversion problems into four families of change:

Hardware Changes,
Software Changes,
Evolutionary Application Development,
Migration to a New DBMS.
I should like to discuss each of these families of
conversion problems, and to relate my own impressions as to
their frequency of occurrence and their significance with
respect to data base management systems. In an effort to be
as specific as possible, I will try to rate the impact of a
DBMS in each case against a "BB scale"--that's shorthand for
"Boon or Burden." If a DBMS, in my view, is a major boon or
help for a given class of conversion problem, I shall score
it as plus 2. If it is a moderate boon, that will be plus
1. If DBMS is more likely to be a burden than a boon, I
shall score it as minus 1 or minus 2. Now let me proceed to
a discussion of these families of conversion problems.
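Roark's rating scheme is simple enough to state as a lookup. This small sketch is a restatement for clarity, not anything produced by the workshop; the label for a zero score paraphrases his later remark that a DBMS is "not much of a factor" in some cases:

```python
# Hypothetical encoding of Roark's "BB scale" (Boon or Burden):
# a DBMS's impact on a class of conversion problem is scored from
# plus 2 (major boon) down to minus 2 (major burden).
BB_SCALE = {
    2: "major boon",
    1: "moderate boon",
    0: "not much of a factor",
    -1: "moderate burden",
    -2: "major burden",
}

def rate(family, score):
    """Render one family of conversion problem with its BB score."""
    return f"{family}: {score:+d} ({BB_SCALE[score]})"

# Two scores Roark reports later in the address:
print(rate("hardware change, new supplier", -1))
print(rate("hardware change, compatible family", 0))
```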
2.2 HARDWARE CHANGES
One of my chores at Ford is to aid with the "prior
review" of others, a review of any computer acquisition
involving a change of processor or an added rental of more
than $10,000 a year. I have been doing this for a dozen
years now, during which time I have signed off on about
3,000 computer projects. If Ford is typical of the
universe, I would guess that 90 percent of all data
processing computers are likely to be replaced within 3 to 5
years from their acquisition. This would not hold for
minicomputers used in dedicated, industrial control
applications; these may remain in operation without
significant change for a decade or more. Data processing
computers do change fairly rapidly, however. For most of
us, 3 to 5 years is long enough to saturate the capacity of
whatever computer we have and to move on to something more
powerful. We are also faced with a dazzling succession of
new products every few years, always with substantial
improvements in cost effectiveness. So, for most of us, I
think the 3 to 5 year cycle is likely to continue for
awhile.
The extent of conversion trauma from a hardware upgrade
depends on the nature of the change. If the change involves
shifting to another supplier, an event that occurred in
something like 5 percent of our hardware changes, the
conversion effort can be an extended affair requiring a
major effort over a period of a year or more. In such
cases, a DBMS is likely to be something of a burden. In all
probability, the new equipment will require a shift to a
different DBMS. In any event, the logic requiring
conversion will be somewhat more complex and difficult to
handle if a DBMS is present.
One of our divisions recently completed a conversion
from Honeywell to Burroughs, a change made necessary because
of reorganization unrelated to the system function. Some of
the application programs to be converted had been developed
in Honeywell's IDS environment; they were converted to
Burroughs DMS II. IDS is a network system; DMS II is a
hierarchical structure, so one might guess that we would have
had problems. In fact, I am assured, both by our own people
and by the Burroughs conversion team, that this transition
went fairly smoothly. On the basis of that once-in-a-
lifetime experience, I would have to score the DBMS in this
case as minus 1. Perhaps it would have been minus 2 if the
application programs involved had been larger and more
complex.
-9-
Hardware conversion can also be a serious trauma when
migrating to new products of an incumbent supplier.
Suppliers do present us with new families of hardware from
time to time that require structural changes in our
application programs and supporting software. Hopefully,
this kind of change will be less frequent in the future than
it has been in the past. Even so, I suspect we are likely
to be faced with something of this kind every 10 years or
so. Already, one hears strange rumblings about the kinds of
changes likely to unfold 2 or 3 years hence in a new product
line called 'The New Grad." So far, DBMS has not figured
importantly in any of our conversions within the hardware
products of a given supplier. On the basis of guesswork and
a skeptical disposition, I would have to assume that the
presence of a DBMS for a major hardware family transition
would be similar to that where a change in supplier is
involved, possibly minus 1 or minus 2 on the BB scoring
scale.
The remaining hardware conversions, which represent the
vast majority of all hardware upgrades, involve moving up
within a compatible family of products offered by a single
supplier. In these cases, DBMS would not be much of a
factor, so we might score its presence as zero on the BB
scale.
2.3 SOFTWARE CHANGES
Despite IBM's assurance that MVS is here to stay, our
experience would suggest that we can look for "major"
software upgrades or enhancements from any supplier every 5
or 10 years. At Ford, we are somewhat preoccupied at the
moment with the MVS problem at our large data center in
Dearborn. The completion of this conversion will require a
major effort extending over a two-year period. There are
three substantial groups of data base systems that will be
affected by the conversion; these involve, respectively,
IMS, TOTAL, and SYSTEM 2000. I have checked with each of
the systems groups involved to see how they assess the
impact of DBMS at this point, when we are about midway in
the conversion effort. The managers working with IMS agree
that its presence has added substantially to the difficulty
and complexity of the conversion task--something close to
twice as much effort as would have been involved with non-
DBMS applications. The IMS conversion involves more than a
new operating system, however. It requires the transition
to a new DBMS called IMS-VS. One manager sees the new
package as offering many attractive new features. Another
manager sees no incremental benefits at all for his
particular applications; but he has no choice, his present
IMS software will not be supported under MVS.
-10-
The managers using TOTAL and SYSTEM 2000 see no special
conversion problem at all in going to MVS.
This sampling of experience adds up to a very mixed
bag. As the warnings say about some drugs, a major software
conversion "may be hazardous to your health"--but again, it
may not. I think we have to score the presence of a DBMS in
such a situation with a range from 0 to minus 2 on the BB
scale.
We do have numerous other software changes, of course,
on an almost continuous basis--new releases to old operating
systems, new software packages to handle new functions, and
the like. There is nothing in our experience to indicate
that DBMS is much of a factor one way or another in these
minor changes.
2.4 EVOLUTIONARY APPLICATION DEVELOPMENT
Now we move into the territory of what the systems
function is all about, and what DBMS also is all about.
Before getting into the particulars about this family
of change, we ought to be clear on one central point.
"Evolutionary change" is the natural state of computer
systems, just as it is for biological systems. Perhaps I
should not push this analogy too far because there is one
dramatic difference — evolutionary change works more rapidly
in computer systems, where even 5 years can produce dramatic
changes in structure, outputs, and even objectives.
During the past 5 years, the workload at our major
computer centers at Ford has been growing at close to 20
percent annually. I might explain here that we attempt to
measure workload in BIPS (Billions of Instructions
Processed) for purposes of capacity planning. So, when I
say that growth has been close to 20 percent a year, I mean
that the BIPS have been growing at close to 20 percent
annually.
As best as I can judge, about half of this growth
results from new applications and about half from new
adaptations of existing applications.
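The compounding behind that 20 percent annual growth figure is worth a moment's arithmetic. A minimal sketch (only the 20 percent rate comes from the text; everything else is illustrative):

```python
import math

# At a steady 20 percent annual growth rate, workload roughly
# doubles every four years and multiplies 2.5x over five years.
def years_to_double(annual_rate: float) -> float:
    """Doubling time for a steady annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)

five_year_factor = 1.20 ** 5          # five years at 20 percent
print(f"5-year growth factor: {five_year_factor:.2f}")      # ~2.49x
print(f"doubling time: {years_to_double(0.20):.1f} years")  # ~3.8 years
```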
The new adaptations, which are often described by the
rather condescending term "maintenance," include a lot of
things. One is simply the effect of volume. If we sell
more cars, other things equal, we will process more BIPS.
Another is changing requirements. In our industry, every
model year is, in a sense, a new game. The products may
change, terminologies may change, and data requirements may
-11-
change. One of the biggest sources of changing requirements
in the automotive industry is government regulation. The
work of the regulators never ceases, nor does the growth in
requirements for additional data elements in the burgeoning
computer records that we must maintain as evidence of
compliance with all sorts of directives from Washington.
Some of these changing requirements are massive things,
like OSHA regulations or like corporate fuel economy
standards. Others seem almost trivial until they are
examined in the context of required changes in computer
files. The industry is currently being asked to adopt as an
international standard a 16-character vehicle identification
number. Our existing identification numbers, with 11
characters, turn up in more than 50 separate computer files
in North America alone. The cost of converting these files
and their related application programs to the new
international standard is estimated at $3 million.
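The mechanics of that identification-number change can be pictured as a field-widening pass over each fixed-width file. A hypothetical sketch, assuming a left-justified, blank-padded field; the record layout and offsets are invented for illustration:

```python
# Hypothetical sketch: widening an 11-character identification field
# to 16 characters in a fixed-width record. A real conversion would
# drive the offsets from each file's actual record layout.
OLD_ID_LEN, NEW_ID_LEN = 11, 16

def widen_record(record: str, id_start: int) -> str:
    """Left-justify the old ID in a wider field, padding with blanks."""
    old_id = record[id_start:id_start + OLD_ID_LEN]
    rest = record[id_start + OLD_ID_LEN:]
    return record[:id_start] + old_id.ljust(NEW_ID_LEN) + rest

print(widen_record("ABCDEFGHIJK123", 0))  # "ABCDEFGHIJK     123"
```

Multiplied across more than 50 files, plus every program that reads them, the $3 million estimate becomes easier to believe.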
In all fairness, the government is not the only source
of changing requirements; our engineers and product planners
are pretty good changers themselves. Our service parts
files include roughly twice as many parts today as 10 years
ago, when our basic inventory control system was developed.
Our organization people contribute more than a fair share of
changing requirements. Every time there is a corporate
reorganization, we find it necessary to go through a
restructuring of the hundreds, or even thousands, of
computer files and application programs that are affected by
the realignment.
And even systems people are contributors to the
evolutionary process, only we usually call the changes
"efficiency improvements." Almost every systems activity
has its own more or less continuous effort aimed at cleaning
up programs that run inefficiently, that are hard to
maintain, or that otherwise need overhaul.
To make another sweeping generalization, I would judge
that each of our systems activities annually rewrites
somewhere between 5 percent and 25 percent of its
accumulated code.
What a fantastic opportunity for data base management
systems! I would score the DBMS impact on this area of
evolutionary application development as plus 2 and proceed
to the next topic except for one problem. The DBMS benefits
can be extremely difficult to realize in practice, and the
price of the cure is sometimes worse than the pain of the
ailment.
-12-
Many of our files most subject to change are huge
affairs. Some of these files run into the billions of bytes
of data. For systems of this size, processing costs may be
counted in 6 or 7 figures annually. An added overhead
burden in a range of 30 percent to 40 percent can amount to
several hundreds of thousands of dollars in added annual
cost. One of our activities last year made an analysis of
overhead associated with one of its large data base systems
and found that 70 percent of the instructions being executed
resulted from DBMS overhead. This activity is in the
process of restructuring its DBMS with the objective of
reducing processing requirements by a full 40 percent.
Several years ago, another of our activities launched a
large-scale data base system in which nearly half of the
file requirements were required for pointers. Processing
requirements in such a case can far exceed expectations
based on any prior experience. This sort of overhead can
mean a loss in processing productivity, something that has
to be weighed against any benefits in programming
productivity.
The best score I can give DBMS, therefore, for its
impact on "evolutionary application development" is plus 1.
It sometimes works well, but it can also be awfully
expensive.
Somewhere in the deliberations of this or related
groups, there ought to be some consideration of what can be
done to simplify DBMS technology. If this is not possible,
perhaps there could be some guides or standards to protect
the systems designer with a modest problem from unwittingly
stumbling into a huge solution out of all proportion to his
need. Sometime in the future, I would like to go through an
exercise like this again and conclude, without any
reservation, that DBMS is an unqualified boon for the
evolving system regardless of its size and complexity. We
are still in the very early stages of DBMS art.
2.5 MIGRATION TO A NEW DBMS
I have saved this family of conversion problems until
last. In a sense, it overlaps all three of the categories I
discussed earlier. Migration to a new DBMS may be prompted
by a change of hardware; it may be forced by a change of
software; or it may be undertaken with a view to realizing
the benefits we discussed under "evolutionary applications
development." Still, the migration to a new DBMS is a major
event that deserves consideration in its own right, whether
the migration is from a non-DBMS environment or is the rare
change from one DBMS to another.
-13-
There is a logical inconsistency in assigning any
rating at all to this family of conversion problems. It is
like asking, "What impact does DBMS have upon itself?" Yet,
at the risk of sounding completely illogical, I am going to
assign a BB rating of minus 1 to this category of conversion
problems on the ground that a DBMS is a high-risk
undertaking entirely apart from whether it is related to
changes in hardware, software, or applications.
Moving to a new DBMS is not unlike the process of
getting married: it takes a lot of desire, commitment,
sacrifice, and investment. It involves moving to a wholly
new lifestyle. Once there, the return to the old lifestyle
may be difficult or impossible without wrenching adjustment
problems.
I have several times had the experience of encountering
a spokesman for some other organization who tells me, "We've
made the policy decision that all future development work in
our organization will be based on DBMS." This makes me
shudder. It is a little like saying, "We've decided that
everyone should get married." I must add, however, that in
every instance cross examination has established that,
policy or no policy, the organization in question still does
a lot of its computer business outside the DBMS environment.
I expect this will be true of nearly all of us for many,
many years to come, or at least until the DBMS technology
can be simplified to the point where it no longer requires
total desire, total commitment, total preparation, and heavy
investment.

Not too long ago, we compiled a catalog of all the
computer systems applications in use at our North American
divisions and affiliates. The listing came to something
like 50,000 application programs. Some of these resulted
from major projects requiring scores of man-years to
develop. Others were relatively simple. The full range of
applications represents a wide spectrum of data needs and
complexity. DBMS technology today does not really address
this whole range of developmental requirements. Perhaps it
never should. In any event, migration to a new DBMS system
must be taken as one of life's climactic events to most
systems people. We all need more guidance about when to
undertake such a migration and how to stay out of trouble
when we do.
These, then, are the major problems to be found in this
jungle of systems conversions. To summarize, my BB scorings
have suggested that a DBMS can constitute a moderate-to-
heavy burden for major hardware conversions and major
software conversions. The presence or availability of DBMS,
however, can be a moderate-to-strong boon for evolutionary
-14-
application development. Finally, the migration to a DBMS
system can be a long and tedious journey, especially where
existing systems of great size and complexity are involved.
DBMS, in short, still offers attractive visions of a
world in which systems might respond quickly to changing
requirements, in which new creative applications might be
easily spawned from existing data files, and where data
bases, in the words of Dan Magraw at the earlier conference
of two years ago, can "move the DBM's into the area of
decision making."
These visions ought not to be taken lightly. In
preparing for this meeting, I checked back with our managers
who have been in the DBMS mode for 5 years or more. What
benefits did they really get? Their consensus might be
summarized as follows:
1. They, indeed, have been able to respond more quickly
to changing requirements, although not always as
easily as they once hoped.
2. New applications development has been easier. The
managers see a productivity improvement factor in a
range of 10 percent to 20 percent.
3. Most important, the managers believe they have
capabilities that previously did not exist at all.
Let me expand on these capabilities for a moment.
Those 50,000 computer programs I mentioned are a long-time
accumulation in response to numerous problems and
opportunities that were perceived by our division and staffs
over a period of two decades.
Our typical division, which might correspond to a
moderate-size company, has its own accumulation, usually a
range of 1,500 to 3,000 programs. If the division is not
yet in a DBMS environment, each of those programs comes with
its own set of files. In the last two years, we have been
greatly influenced by the concepts of "top-down design" and
"structured programming." For systems of recent vintage,
these approaches may have provided a logical structure that
would give some accessibility to files. For older systems,
with what our people call "spaghetti programs,"
accessibility is something else--fragmented and scattered
files, with all the access keys hidden in "spaghetti code."
As functional entities for the purposes they were first
created, these old programs may serve well. Even where we
want to launch a new application that will need to draw on
the data in these files, the problem is not overwhelming.
-15-
We can, and do, write programs to get at the needed
data files, wherever they exist.
But suppose we do not want a new system. All we want
is an in-depth analysis of a difficult problem on which 5 or
10 different files have something useful to say. This is a
sort of problem that gives computer people a bad reputation
as being slow-moving and unresponsive. It can be extremely
difficult to extract data from 5 or 10 different sources,
all with different maintenance cycles and data control
procedures, and produce anything but a mess.
We do a lot of computer-based analysis at Ford because
we have found that the data resources in our computer files
can open up all sorts of insights and understanding that
would otherwise be lost. I wish we could do even more, but
this form of analysis can be very time consuming and
frustrating in the non-DBMS environment. We have been
working on one such exercise involving non-DBMS file sources
for more than three months, all because of the relative
inaccessibility of data. Last week we decided to skip a
promising analysis altogether because it would have taken
more than a month to extract and organize the needed data
from a variety of source files.
So, when our managers talk about new capabilities from
DBMS, they are talking about one of the computer's most
important potenti al s--the power to carry analysis to
entirely new levels of understanding.
The benefits cited by our managers are impressive
testimonials. Why, then, has the data processing world been
so slow in converting to DBMS? I have seen no studies that
would provide an accurate measure as to the proportion of
data processing oriented to DBMS. In the absence of any
sure data, I am going to offer the opinion that the
proportion is no greater than 10 percent to 20 percent.
I have never expected that all computer systems would,
or should, move to DBMS. In the early Seventies, however,
as we first began to realize the potential of this
technology, most of us, even the conservatives I believe,
would have expected that something approaching half of all
computer files would acquire a DBMS format by the end of
1977. Even at Ford, where I believe the impact of DBMS has
probably been greater than average, the progress has seemed
slower than I would have expected. This painfully slow
progress points up perhaps the greatest conversion problem
of all--how can we move to a DBMS environment with existing
systems?
-16-
Most of our DBMS user divisions made their first moves
to data base at least five years ago. A couple of these
divisions went through a conversion trauma, eventually
recovered, and never undertook a subsequent DBMS project.
Once was enough, they concluded.
The other divisions have continued to extend their data
bases in areas of new systems development. But, this leaves
a very large accumulation of computer files more or less
untouched by the new technology. One manager, possibly our
most enthusiastic data base advocate, believes that about 40
percent of his divisional data is now in DBMS format after
some five years of development. He guesses that this figure
may reach 70 to 80 percent in another five years.
We have still another group of divisions with heavy
maintenance workload who have elected not to try the DBMS
approach at all.
The problem is not unlike that of our Detroit skyline.
We recently completed a magnificent new Renaissance Center
along the riverfront, with some of the most beautiful hotel
and office structures to be found anywhere. This is
exciting, but there is still a long way to go to bring the
rest of the city up to Renaissance Center standards. The
accumulation of history is still a huge obstacle to those
who want the best of things right now. Omar Khayyam, who
summed up so many things in language that a sophomore can
understand, expressed this frustration perfectly:
... could thou and I with fate conspire
To grasp this Sorry Scheme of things entire,
Would we not shatter it to bits--and then
Re-mould it nearer the Heart's desire!
All of us, however, have to find ways of working with
what we have inherited. We might all wish we could somehow
get rid of the old mess and start all over again.
Unfortunately, that old mess represents an investment in the
hundreds of millions of dollars for my company and in many
tens of billions of dollars for all of us collectively.
One of our divisions several years ago tackled this
rebuilding problem in what seemed to me an innovative kind
of way. This division had identified 15 overlapping files
that had evolved over the years, with inventories and other
data related to parts. As might be expected, there was much
redundancy, and it was difficult to reconcile one file to
another. A complete overhaul of the applications programs
did not seem feasible, but the division hit on the idea of a
DBMS master file to serve as a sort of front end to these
-17-
application programs. There was a saving of more than
$100,000 annually in data preparation and data control.
This limited effort brought the division into the DBMS
environment and laid the basis for solid evolutionary
development, including subsequent application revisions to
exploit more fully the potential of the data base
environment.
If there is one thing I would particularly want to see
come out of this conference, then, it would be some easy-
to-use, easy-to-apply, and inexpensive approaches for
upgrading this great accumulation of computer files we have
all been working on for the last two decades or so. We have
all had tantalizing but too-brief experiences with data
bases as they can be. The question before us now is, what
can we do to make those benefits available wherever we need
them?
Where does an organization with an accumulation of
1,500 to 3,000 programs begin, without going out of business
for a year or two? What tools can it use to map out its
data resources? How can it go about restructuring these
files without scrapping its investment in application
programs? And, even if the files can be rebuilt, what about
the necessary remodeling of the interface points within the
old programs? I hear that some of you came up with answers
to these questions. Perhaps you can point the way to this
new data-rational world we all seek.
The realization of this promise is still a long way
off. The real world of tomorrow will come when the DBMS can
contribute to the evolutionary process of applications
development and to the full analytical use of computer
resources, without exacting an extortionate price; when the
DBMS can aid in the evolutionary process of hardware and
software change; and when the decision to go to a DBMS is no
longer a high risk affair requiring an all-out commitment
through one of the most difficult conversion problems to be
found in systems.
I have no doubt that something very much like this
world of tomorrow will appear one of these days, in great
part because groups like this one pressed on relentlessly in
the definition of problems and the search for creative
solutions.
-18-
3. ESTABLISHING MANAGEMENT OBJECTIVES
Richard L. Nolan
CHAIRMAN
Biographical Sketch
Richard L. Nolan is a researcher, author, and
consultant in the management of information
systems. As Chairman of Nolan, Norton &
Company, Inc., and a former Associate Professor at
the Harvard Business School, Dr. Nolan has
contributed to improving the management of data
processing in complex organizations. His
contributions include major publications in the
areas of:
The four stages of EDP growth
Management accounting and control of data
processi ng
Managing the data resource function
Computer data bases
Dr. Nolan's experience with the management of
computer systems includes earlier associations
with the Boeing Company, the Department of
Defense, and numerous other large U.S. and
European public corporations.
PARTICIPANTS
Marty Aronoff
Richard Canning
Larry Espe
Gordon Everest
Robert Gerritsen
Richard Godlove
Samuel Kahn
Gene Lockhart
John Lyon
Thomas Murray
Jack Newcomb
T. William Olle
Michael Samek
Steven Schindler
Richard Secrest
Edgar Sibley
-19-
3.1 OVERVIEW
"Assimilation of computer technology into organizations
is a process that has unique characteristics for which
management does not have a substantial base of
experience to draw upon for guidance. Perhaps the most
important unique characteristic is the pace of
penetration of the technology in the operations and
conventional information systems." [Richard L. Nolan,
"Thoughts About the Fifth Stage," Data Base, Fall
1975.]
The pace of the assimilation of computer technology
into the data processing organization is represented by the
S-shaped "Data Processing Learning Curve."
The Data Processing Learning Curve is approximated by
the growth of the data processing budget and reflects the
staged evolution of the data processing environment along
four growth processes:
Growth process #1. The portfolio of computer
applications programs. The programs and procedures which
are used by the organization in its business activities.
The Applications Portfolio represents the cumulative end
product of the data processing organization.
Growth process #2. The data processing organization and
technical capabilities. The organization structures and
technical capabilities found within the data processing
department which are required to develop and operate
application systems. These include:
Data Processing Management Structure
Hardware and Software Resources
Systems Development and Operations Organizations
Growth process #3. Data processing planning and management
control systems. The set of organization practices used to
direct, coordinate and control those involved in the
development and operation of application systems, including:
Data Processing Planning
Project Management
Top Management Steering Committees
-20-
Chargeout
Performance Measurement
Figure 2-1. The Data Processing Learning Curve.
[Chart: assimilation of computer technology occurs in four
stages -- Stage I: Initiation, Stage II: Contagion, Stage
III: Control, Stage IV: Integration -- traced across four
growth processes: building the applications portfolio,
building the DP organization, building DP management
planning and control, and developing user awareness.]
Growth process #4. The user. The members of line and staff
departments who must use the applications systems in order
to perform their jobs.
The nature of Data Base Management Systems (DBMS)
dictates that they interface with each of the four growth
areas. First, the DBMS acts as the data manager for all
types of application systems. Second, the DBMS introduces a
-21-
new technology to be assimilated by the organization, and it
causes conversions. The management considerations
identified by the panel can be summarized into four key
concepts:
KEY CONCEPT #1: DBMS CONVERSIONS ARE A MATTER OF TIMING
Conversion from a non-data base to a data base
environment is a part of the natural evolution of
data processing within an organization. A data
processing department which has matured and
progressed to a Stage III environment is typically
faced with high maintenance costs and an inability
to respond to ad hoc inquiries and requests from
user management for integrated reports. This
apparent crisis forces the data processing
department to use data base technology to
restructure the applications portfolio. In other
words, conversion to a DBMS is primarily a question
of how soon the data base environment should begin
to be constructed, not whether the data base
environment should be implemented.
KEY CONCEPT #2: CHOOSE THE DBMS CONVERSION
APPLICATION CAREFULLY

The initial application used for conversion to data
-22-
base technology represents an important learning
experience for the entire organization. As such,
the initial application should:
-be a non-trivial application
-demonstrate the "power" of the DBMS facilities
-be simple to avoid overextension caused by
attempting to do too much, too fast
KEY CONCEPT #3: TREAT THE INITIAL AND SUBSEQUENT
DBMS CONVERSIONS SIMILAR TO OTHER SYSTEMS PROJECTS
Although a data base conversion introduces a new
technology to the organization and requires the
involvement of all areas in the systems environment,
the risk exposure of this conversion effort can best
be minimized by managing the conversion as any other
large project would be managed. The same planning
and justification procedures should be used. The
same project management mechanisms should be
exercised throughout the project life cycle.
Because of the major impact caused by a data base
conversion, special efforts should be made to
coordinate conversion activities with steering
committees, senior management, and user areas.
KEY CONCEPT #4: PLAN AND STRUCTURE FOR FUTURE DBMS
CONVERSIONS NOW; DBMS CONVERSIONS WILL BE A WAY OF
LIFE
Certainly, once a DBMS is successfully installed,
the conversion of applications to that DBMS will
continue. However, the mature organization should
also plan on converting to another DBMS at some
point in time. The second DBMS may be just an
enhanced version of the first DBMS, or it may be a
totally new software package. In either case, it is
certain that a mature data processing organization
will want to take advantage of new DBMS facilities
and efficiencies; therefore, DBMS conversions will
become a way of life.
To prepare for these continued conversions, the data
processing organization can take several steps to minimize
their impact:
. minimize the application system processing logic,
program code and data base design dependencies on
the features of a particular DBMS
-23-
. institutionalize the Data Administration function
. fully document all business system functions on an
integrated dictionary
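One way to read the first of these steps is to keep application logic behind a thin data-access interface, so that program code never depends on the call conventions of a particular DBMS. A minimal sketch, assuming an invented interface and an invented stand-in back end; a real back end would wrap the native calls of whatever DBMS is installed:

```python
# Sketch: isolating application logic from a particular DBMS behind
# a small access interface. InMemoryStore is a stand-in; swapping the
# DBMS means writing one new PartStore, not touching the applications.
from abc import ABC, abstractmethod

class PartStore(ABC):
    @abstractmethod
    def get_part(self, part_no: str) -> dict: ...

class InMemoryStore(PartStore):
    def __init__(self, rows: dict):
        self._rows = rows
    def get_part(self, part_no: str) -> dict:
        return self._rows[part_no]

def reorder_quantity(store: PartStore, part_no: str) -> int:
    # Application logic sees only the interface, never the DBMS.
    part = store.get_part(part_no)
    return max(0, part["reorder_point"] - part["on_hand"])

store = InMemoryStore({"AX-100": {"on_hand": 40, "reorder_point": 100}})
print(reorder_quantity(store, "AX-100"))  # 60
```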
The overall DBMS conversion philosophy developed by the
Establishing Management Objectives panel can be summarized
as follows:

Appreciate the technology, but recognize that DBMS
conversion is a management problem.
The panel approached the topic of data base conversion
in a chronological manner. As such, the following sections
are organized to reflect management considerations during
the life-cycle of data base conversion efforts:
• CONVERSION TO A DATA BASE ENVIRONMENT
• MINIMIZING THE IMPACT OF FUTURE CONVERSIONS
• CONVERSION FROM ONE DBMS TO ANOTHER DBMS
3.2 CONVERSION TO A DATA BASE ENVIRONMENT
A certain level of maturity is necessary before
conversion to a data base environment is feasible. In
general, data base technology is not appropriate for data
processing organizations that are not ready for the
conversion.

The following sections discuss the impact of conversion
to a DBMS on the four data processing growth processes,
namely:
Applications Portfolio
Data Processing Organization
User Awareness
. Data Processing Planning and Management Control
Systems
-24-
3.2.1 Impact On the Application Portfolio

Conversion to a data base environment will generally
necessitate a substantial restructuring of the
organization's application portfolio to take advantage of
the enhanced capabilities of the DBMS. The several
potential approaches range from an evolutionary approach,
in which new application systems are developed using data
base technology, to a revolutionary approach, in which new
development is suspended until existing application systems
are converted to the new environment. Regardless of the
approach selected, an order of priority for conversion must
be developed, indicating an approach for each application
in the portfolio. With an evolutionary approach, the
entry-level application is of critical importance! These
topics are discussed in greater detail in the following
sections.
Approaches To Application Conversion
Two basic approaches exist for conversion of the application
portfolio
to a data base environment: revol utionary and evolutionary .
Actually, these two approaches represent opposite ends of a
continuum of approaches. No single approach is universally
best. In fact, more than one approach may be operable
within a given conversion; i.e., certain application systems
may be converted on a revolutionary basis while other
systems are converted in an evolutionary manner.
In the revolutionary approach to conversion, sometimes
called resystemization , one rewrites and restructures
existing appl i cation systems as necessary to operate under
the new DBMS. Generally, one should avoid exclusive use of
this approach for the following reasons:
Risks overextension caused by attempting too much,
too fast.
Delays in one sub-project may impact others.
Insufficient resources may be available for
developing new systems during the conversion.
At the opposite extreme of the revolutionary approach
is the evolutionary approach in which all new systems are
developed under the new environment. Existing systems are
not converted but rather are replaced at the end of their
normal life cycle. This approach reduces the risk of
overextension and the impact of delays in sub-projects.
However, there are disadvantages to an evolutionary
approach:
-25-
Complex interfaces with existing, conventional
systems are generally entailed.
Local inefficiencies and redundancy typically
result.
Current organizational deficiencies and constraints
may be perpetuated.
Just as the conversion of the entire applications
portfolio may be approached in an evolutionary or
revolutionary manner, so may the conversion of a single,
existing application system. In other words, the entire
application system may be converted to the new environment
at one time, or the conversion may take place in phases.
The latter approach, in which the reporting and update
functions are converted gradually, using bridges, has the
advantage of early availability of both cross-functional
data and the new features of the DBMS. Furthermore, greater
flexibility in scheduling the conversion is provided.
However, the gradual approach has the disadvantage of
redundant development and data storage, and requires
increased management to provide and control the conversion
bridges.
One temporary measure that may be employed to avoid or
postpone conversion is to extract data from existing master
files in order to build a transient, integrated data base.
This data base is not maintained but instead is recreated on
a cyclic basis. The data base is used for cross-functional
reporting and analysis. This approach provides early
availability of cross-functional data and lends itself to a
specialized interrogation language. At the same time, there
are certain disadvantages to this approach:
Availability of data is achieved at the expense of
redundancy and reloading.
Problems of timeliness and consistency may be
created.
Basic application limitations are perpetuated.
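The cyclic extract-and-rebuild scheme described above can be sketched in modern terms. The following is a hypothetical illustration only (the report predates such tooling); the file names, record layouts, and common key are invented, not taken from the report:

```python
# Sketch of the "transient data base" measure: data is extracted
# from existing master files into an integrated store that is never
# updated in place, only recreated on a cyclic basis, and is used
# solely for cross-functional reporting.

def rebuild_reporting_base(master_files):
    """Recreate the transient data base from scratch, merging
    application master files on an assumed common key."""
    base = {}
    for source, records in master_files.items():
        for rec in records:
            key = rec["customer_id"]            # hypothetical shared key
            base.setdefault(key, {})[source] = rec
    return base

# Two hypothetical application master files sharing a customer key.
orders = [{"customer_id": 1, "total": 250}, {"customer_id": 2, "total": 90}]
billing = [{"customer_id": 1, "balance": 40}]

snapshot = rebuild_reporting_base({"orders": orders, "billing": billing})

# A cross-functional query against the read-only snapshot.
print(snapshot[1]["orders"]["total"], snapshot[1]["billing"]["balance"])
```

Because the snapshot is rebuilt rather than maintained, it can lag the operational files, which is exactly the timeliness problem noted in the disadvantages above.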
Maintenance Moratoriums. There are never sufficient
resources, nor is it appropriate, to permit continued
maintenance and enhancements of application systems during
conversion to the data base environment. Moreover, the
conversion requires a relatively stationary target. Thus, a
moratorium on maintenance (or more accurately on enhancement)
may be declared during conversion.
The declaration of a maintenance moratorium must be the
result of agreement among user, senior, and data processing
management. Senior management support is necessary for the
conversion. However, if senior management is the primary
motivator behind the conversion, there will be some degree of
user resistance to a maintenance moratorium. On the other
hand, if the moratorium is driven by user management, senior
management will tolerate a moratorium on maintenance only so
long as it does not interfere with normal business functions.
In either case, user and data processing management must
jointly determine the scope and duration of the moratorium
and the circumstances under which it may be modified or
canceled.
A common device for invoking moratoriums is a steering
or priorities committee. Composed of data processing and
user management, the steering committee is responsible for
approving projects and establishing priorities. The
steering committee does not manage, nor does it relieve
management of its business responsibilities. Rather, it
provides a forum for discussion and has power derived from
its membership and sponsorship.
Analysis of Opportunities. Certain application systems
offer better opportunities for conversion than others.
The following types of applications represent good
opportunities for conversion:
An application system using many different master
files and/or many internal sorts, indicating the
need to represent complex data structures and to
support multiple paths between data.
An application with a requirement for on-line
inquiry and/or update of interrelated data. A DBMS
would still be applicable, although not required, if
the data were not interrelated.
An application system with chronically heavy
maintenance backlogs, suggesting redundant data
and/or inflexibility with respect to its data
structures.
An application system requiring a broader view of
data (either more detail or greater cross-functional
breadth) .
An application which crosses functional or
organizational boundaries (e.g., project control).
An application which cannot support basic business
needs.
An application which provides data used by other
systems.
Certain types of applications represent poor
opportunities for conversion. For example:
A purchased application which is maintained by a
third party supplier.
An application which uses historical data and which
is processed infrequently.
A recently installed application system which is
effective in satisfying user needs.
An analysis of the characteristics of the existing
applications based on the above considerations will yield a
preliminary ordering for conversion of the application
portfolio. As the conversion is planned in more depth the
preliminary ordering will be revised and refined to reflect
such factors as precedence relationships regarding
conversion, level of effort required, and the availability
of resources.
Selecting the Entry-level Applications. In converting
to a data base environment, a key decision is selecting the
entry-level application. In an ideal world, the initial
application would be selected as the vehicle for making
mistakes and learning how to convert and how to manage the
conversion. It would have a low profile and not present any
risk to the business. However, the realities of the world
will force initial conversion of a real system which has
visibility, contains some element of risk, and which must be
completed quickly. The factors listed below should be
considered in identifying the best opportunity for
developing technical competence while simultaneously
reducing risk and visibility:
The application should be representative and non-trivial.
It should be a good DBMS application (though not
necessarily the best) .
It represents a relatively low risk to the business.
It provides sufficient opportunity for learning.
It is either an old system or technically obsolete.
It provides eventual visibility as a vehicle for
management controls.
It is "owned" by a vocal, important, but neglected
(by data processing) segment of the business.
3.2.2 Impact On the EDP Organization. This section discusses
the following topics relating to the impact of conversion on
the data processing organization:
Organizational considerations.
Technical aspects: tools and methodologies.
. Data processing personnel skill requirements.
Organizational Considerations . Converting to a data
base environment generally entails reorganization of the
data processing function in order to provide the technical
and administrative means for managing data as a resource. A
key organizational consideration is the need to establish a
Data Base Administration (DBA) function within data
processing. Conversion will also impact the applications
development and computer operations functions within data
processing.
Data base administration. The Data Base Administration
(DBA) function is responsible for defining, controlling, and
administering the data resources of an organization. The many
responsibilities of the DBA function are not discussed in
detail here since they are covered extensively in the
literature. However, some of the major responsibilities
include the following:
. Data base definition/redefinition. DBA must have
primary responsibility for defining the logical and
physical structure of the data base, not merely
consulting responsibility.
Data base integrity. DBA is responsible for
protecting the physical existence of the data base
and for preventing unauthorized or accidental access
to the data base.
Performance monitoring. DBA must monitor usage of
the data base and collect statistics to determine
the efficiency and effectiveness of the data base in
satisfying the needs of the user community.
Conflict mediation. DBA must mediate the
conflicting needs and preference of diverse user
groups that arise because of data sharing.
Many alternatives exist for locating the DBA function
within the overall corporate structure. Three such
alternatives include the following:
Within the data processing organization. In order
to avoid an application orientation or an emphasis
on computer efficiency, DBA should, in general, not
report to Applications Development or Computer
Operations, respectively. Rather, DBA should report
to the highest full-time data processing executive.
Corporate level. When located at the highest
corporate level, DBA can take a broad view of data
as a corporate resource. Furthermore, DBA is in a
position to resolve conflicts between user areas.
When DBA resides at this location, some of the more
technical aspects of the DBA function are typically
performed within the data processing organization.
. Matrix organization. This structure is patterned
after the aerospace industry where a given project
draws upon all functional areas. In this case, the
DBA staff would report functionally to DBA but would
also report directly to a project manager. This
organizational strategy has the advantages of
recognizing the integration required for a data
base, putting DBA at an equal level with other
functional areas, and increasing communication
during application development.
Within the DBA function, the two basic organizational
strategies are functional specialization versus application
area specialization.
Functional Specialization. This strategy organizes
DBA according to functions performed, such as data
base design, performance monitoring, data
dictionary, and so on. This approach has the
disadvantage of ensuring that no one person is
knowledgeable about all aspects of DBA support for a
particular application system.
Application Area Specialization. In this approach,
one person within DBA is responsible for performing
all DBA functions for a particular application area,
including both application development and
operation. This approach has the disadvantage of
developing expertise within functional areas of DBA
more slowly. Furthermore, unless controlled,
activities within DBA may become fragmented.
However, this approach results in an interesting and
challenging job and facilitates attracting and
keeping capable personnel.
Applications development. Conversion to a data base
environment will affect the applications development
function within the data processing organization in several
ways. The most fundamental impact upon applications
development will be the change from an applications
orientation to a data orientation. Conversion to a data
base environment should also broaden the scope of the
application developers. Specifically, the developers need
to understand the basic business processes and to develop
application systems that cross organizational boundaries.
The application development methodology will have to be
modified by delimiting the relative responsibilities of both
DBA and applications development. Moreover, the basic
approach to application development may be revolutionized as
a result of conversion to a DBMS. Specifically, instead of
a rigorous approach to application development, the DBMS may
permit an iterative or convergence approach. With this
approach, user requirements are not defined in detail before
developing the application system. Rather, user
requirements are defined at a more general level and a
system is quickly built using the DBMS. When presented with
the system outputs, the user specifies any required changes,
which are then incorporated into the system. This process
is repeated until the application system satisfies user
needs. Note that this approach to application development
requires a DBMS in which data base definition, creation, and
redefinition and report writing are quickly and easily
accomplished.
Computer operations. Conversion to a data base environment
will impact the Computer Operations function in two ways.
First, many of the responsibilities of computer operations
function will be transferred to the newly established DBA
function. Second, the characteristics of the application
systems may change. Specifically, the DBMS may facilitate
the development and operation of on-line applications as
opposed to the more traditional batch systems.
Consequently, the computer operations function may have to
reorganize to operate within this more dynamic environment.
Technical Aspects -- Tools and Methodologies . Because
the subject of DBMS selection has been covered adequately in
the literature, it was not addressed by this panel.
However, the following are some of the tools and
methodologies typically required in making effective use of
the DBMS after its installation:
Data dictionary/directory. A tool for organizing,
documenting, inventorying, and controlling data. It
provides for a more comprehensive definition of data
than is possible in the DDL facility of most
commercial DBMS's. As such, it is essential for
management of data as a resource.
Data base design and validation tools. Used to
facilitate the design process and to validate the
resultant design prior to programming. Included in
this category are such tools as hashing algorithm
analyzers and data base simulation techniques.
Performance monitoring tools. Useful in analyzing
and tuning the physical data base structure. These
tools provide statistics on data base usage and
operation.
Application development tools. Used to facilitate
the development of application systems, including
such tools as terminal simulators which operate in
batch mode and test data base generators.
Data base storage structure validation utilities.
Used to verify that a stored data base conforms to
its definition or to assess the extent of damage of
a damaged data base. Examples include a "chain
walker" utility.
Query/report writer facility. Enables users to
access the data base and extract data without having
to write a procedural program in a conventional
programming language.
Data base design methodology. Needed to standardize
the approach to data base design and to provide
guidance in using the data base design, modeling,
and monitoring tools.
Application development methodology. Specifies the
standardized approach to developing application
systems; i.e., the activities to be performed during
the development process and the corresponding roles
and responsibilities of each of the various project
participants. Of particular importance is the need
to define the points in the development process at
which DBA and applications development functions
must interface and the relative responsibilities of
each with respect to application development.
Documentation methodologies. Needed by DBA to
document data definitions uniformly and to document
data base design decisions.
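The "chain walker" utility mentioned in the list above lends itself to a small illustration. The sketch below is a modern, hypothetical rendering; the record layout (a mapping from record id to its "next" pointer) is invented, not the storage format of any DBMS of the period:

```python
# Illustrative "chain walker": follow a set's pointer chain and
# report where a stored data base fails to conform to its
# definition (a dangling pointer or a cycle in the chain).

def walk_chain(records, head):
    """Return the record ids reached from head, raising if the
    chain points at a missing record or loops back on itself."""
    seen, rid = [], head
    while rid is not None:
        if rid not in records:
            raise ValueError(f"dangling pointer to {rid}")
        if rid in seen:
            raise ValueError(f"cycle detected at {rid}")
        seen.append(rid)
        rid = records[rid]          # follow the "next" pointer
    return seen

intact = {"A": "B", "B": "C", "C": None}
print(walk_chain(intact, "A"))      # the whole chain is reachable

damaged = {"A": "B", "B": "X"}      # record "X" was lost in a crash
try:
    walk_chain(damaged, "A")
except ValueError as err:
    print(err)
```

A production utility would of course walk every chain in the data base and assess the extent of damage rather than stop at the first broken link; this sketch shows only the core traversal check.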
Data Processing Personnel Skill Requirements. The
impact of conversion to a data base environment on skill
requirements will be considered in this section.

Types of skills. The following types of skills are needed
in a data base environment:
. Data Base Administration. DBA should be staffed
with individuals who are strong technically,
interface well with people, and collectively are
knowledgeable about the DBMS itself, the tools
necessary to support it, the application development
process, and the corporation and its data.
. Logical data base design. Within DBA there is a
need for individuals possessing the ability to
recognize and catalog data elements, to group
related data elements, to identify relationships
between groups, and to use the data description
language.
. Physical data base design. Within DBA there is a
need for individuals knowledgeable with respect to
organization techniques, data compression, trade-
offs in data base design, simulation, and modeling
techniques.
DML programming. DBA should include individuals
with knowledge of the DML and its associated host
language, data base navigation, and the currency
concept.
Acquisition and training. Obviously, the required skills
may be developed internally or acquired externally. Hiring
the required personnel has the advantage of bringing
experience and new ideas into the data processing
organization. However, individuals knowledgeable with
respect to DBMS are scarce and hence expensive. Moreover,
individuals brought in from the outside typically have
little, if any, knowledge of the business.
Developing skills internally has the advantage of
building DBMS skills on top of knowledge of the business.
Furthermore, control can be exercised over what is learned
and when. Finally, it is generally less expensive and
disruptive than hiring.
When skills are developed internally, there are several
possible approaches to training:
In-house. In this approach, staff personnel
possessing the necessary skills teach these skills
to others by means of courses or joint projects.
This approach may fit well with initial application
development and has no cash cost. Moreover, the
mere act of having to teach their skills to others
enhances the knowledge and understanding of the
teachers themselves. There are several
disadvantages to this approach: it requires the
time of the most capable personnel when they may be
more effectively used elsewhere; it cannot be used
where the required skills do not exist internally;
and, as a closed system, it excludes differing points of
view.
Vendor. This approach utilizes the courses offered
by DBMS and support software vendors. Vendor
courses may be a relatively inexpensive approach
particularly when courses are bundled as part of the
purchase/lease price. Furthermore, the internal
staff are likely to benefit from the expertise of
the vendor. However, the courses may be only a
thinly-disguised sales pitch.
Other approaches. Additional approaches to training
include:
-independent educational organizations
-colleges or universities
-videotape/cassette courses
Turnover. Conversion to a data base environment may result
in employee turnover. The new DBMS may be perceived by the
staff as being threatening and, hence, may be resisted.
This resistance to change may be overcome somewhat by
involving the staff in the series of decisions leading to
the acquisition of a DBMS. If required skills are obtained
through hiring, the existing employees are likely to resent
the high salaries paid to the new employees. Finally, as
the skills of the staff increase, so does their market value,
and it becomes increasingly expensive to retain the staff.
These three factors -- resistance to change, resentment of
new hires, and increased employee market value -- tend to
increase turnover following conversion to a DBMS
environment.
On the other hand, certain factors tend to decrease
turnover. Specifically, conversion to a data base
environment involves new opportunities for individual growth
and excitement such as new technology, new hardware and
software, and major development efforts. Properly
exploited, these factors can increase job satisfaction and
correspondingly decrease turnover.
3.2.3 Impact On Planning and Control Systems. With the
exception of the chargeout mechanism, conversion to a data
base environment will not affect the basic mechanisms for
planning and control. However, recognize that conversion is
itself a process to be managed. This entails applying the
justification procedures for conversion, planning the
conversion, establishing review and approval checkpoints,
and monitoring progress.
Planning the Conversion. Planning for the conversion
requires the involvement of senior, user, and data
processing management:
Attempts to convert to a data base environment
without senior management support run a high risk
of failure. If senior management has not formally
authorized DBMS studies or incorporated DBMS
planning into corporate plans, the probability of
successful conversion is remote.
Conversion will have a significant impact on user
departments in the form of disruption of normal data
processing services, restructuring of application
systems, and a change in orientation on the part of
users from ownership to sharing of data.
Consequently, user involvement in planning the
conversion is critical.
Conversion to a DBMS generally affects the
structure, system development methodology, personnel
skill requirements, and hardware/software
configuration of the data processing function. The
lead time necessary to develop the appropriate
infrastructure for operating in a data base
environment must be appreciated and planned for
accordingly.
Given senior management support for the conversion, one
strategy for obtaining the required involvement in the
planning process is to establish a steering committee for
the data base as mentioned in the earlier section on
Maintenance Moratorium. This steering committee contains
representatives from both user departments and from data
processing and is responsible for controlling the evolution
of the data base. As such, the Data Base Steering Committee
is subordinate to the data processing Strategic Steering
Committee, which is concerned with the evolution of the
entire data processing function within the enterprise.
Given the appropriate participation, a necessary first
step in converting from a non-data base to a data base
environment is the development of an architectural plan for
the data base. This plan describes the intended structure
of the target data base. Conceptually, a data base
represents a model or image of the organization which it
serves. In order for the data base to represent an accurate
image of the organization, it is necessary for the data base
structure to reflect the fundamental business processes
performed in the organization. Consequently, the designers
of the data base must first understand the key decisions and
activities required to manage and administer the resources
and operations of the enterprise. This typically entails a
cross-functional study of the enterprise in order to
identify the business processes and information needs of the
various user departments.
The architectural plan permits planning and scheduling
the migration of application programs, manual procedures,
and people to a data base environment. This implementation
plan must incorporate review and approval checkpoints that
enable management to control and monitor the conversion
process.
Controlling the Conversio n. The actual conversion to a
data base environment is effected by a project team composed
of representatives from user departments, applications
development, and data base administration. At formally
established checkpoints during the conversion, the data base
steering committee reviews the progress of the project.
Items reviewed and analyzed include the following:
. Projected benefits vs. actual benefits
Data quality (i.e., completeness, timeliness, and
availability)
Projected operating and development costs vs. actual
costs
Actual costs of collecting, maintaining, and storing
data vs. benefits realized
Project performance (i.e., performance of the
project team against the conversion schedule and
budget)
Based on the review, the data base steering committee
takes the appropriate approval action (e.g., go/no go) with
respect to the conversion activity.
Chargeback Considerations . The costs of operating in a
data base (i.e~ shared data) environment are extremely
difficult to charge back to individual users in an equitable
manner. At best, complex job accounting systems can only
approximate actual resource usage. Moreover, the chargeback
algorithm must not be dysfunctional with respect to its
impact on the various user departments. Conversion to a
data base environment frequently requires that a user
department supply data which it does not itself use. The
chargeback algorithm must reward, not penalize, such
behavior on the part of the user department.
Some considerations in developing an appropriate
chargeback algorithm include the following:
. Consider capitalization of the costs of conversion
instead of treating such costs as current expense in
order to avoid inhibiting user departments from
undergoing the conversion.
. User departments typically have little control over
the costs of conversion. Consequently, consider
treating such costs as unallocated overhead, since
allocation will have little effect on the decisions
or efficiency of the user departments.
. Because ongoing costs of collecting, maintaining,
and storing data are difficult to associate with
individual users, consider developing percentage
allocation factors for these costs based on periodic
reviews of data base usage. Alternatively, consider
treating these costs as overhead.
Resource usage for retrieval and processing purposes
is easier to approximate, and such costs should be
charged directly to the users.
Consider incorporating a reverse charging mechanism
into the chargeback algorithm in order to compensate
users who supply data which they do not use.
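The considerations above can be combined into a worked example. The sketch below is purely illustrative: the rates, the storage pool, the credit factor, and the department names are all invented, and a real job accounting system would be far more elaborate. It shows direct charging of retrieval usage, percentage allocation of storage costs, and the reverse (negative) charge that credits a department for data it supplies but does not use:

```python
# Hedged sketch of a chargeback algorithm: retrieval usage is
# charged directly, storage costs are allocated by fixed
# percentages from a periodic usage review, and a reverse charge
# credits departments that supply data they do not themselves use.

def chargeback(usage_units, storage_share, supplied_records,
               unit_rate=2.0, storage_pool=1000.0, supply_credit=0.1):
    charges = {}
    for dept in usage_units:
        charges[dept] = (
            usage_units[dept] * unit_rate             # direct retrieval cost
            + storage_share[dept] * storage_pool      # allocated storage cost
            - supplied_records[dept] * supply_credit  # reverse charge credit
        )
    return charges

bills = chargeback(
    usage_units={"sales": 100, "personnel": 10},
    storage_share={"sales": 0.8, "personnel": 0.2},
    # personnel supplies data it does not itself use
    supplied_records={"sales": 0, "personnel": 5000},
)
print(bills)
```

With these invented figures, the supplying department's reverse charge can exceed its usage charges, yielding a net credit, which is the rewarding (not penalizing) behavior the text calls for.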
3.2.4 Impact of Conversion On User Awareness. The most
fundamental impact of conversion to a data base environment
is the required change in orientation on the part of users.
No longer are files and applications "owned" by a particular
user department. Rather, data must be viewed as a corporate
resource to be shared by all user departments. The
requirement for sharing constrains the freedom of a user to
change arbitrarily and unilaterally the definition of the
data.
Sharing of data will impact users in a second way.
Following conversion to a data base environment, users may
be required to supply data which they themselves do not use.
As already discussed, the chargeback algorithm must be
structured to reward such behavior. Furthermore, suppliers
of data in general must be infused with a sense of
responsibility (not ownership) for the data in order to
maintain data quality.
Conversion to a data base environment may impact the
user community in other ways:
Planning the conversion. User participation in
developing both the architectural plan and
implementation plan is necessary in order to obtain
user commitment and to ensure that the resultant
data base satisfies user needs.
Disruption of operations. Normal data processing
services are likely to be severely disrupted as a
result of such factors as limited availability of
personnel and maintenance moratoriums. Furthermore,
the conversion may disrupt and strain user
operations as new and old applications are operated
in parallel.
Resolution of inconsistencies. Creation of the data
base typically entails merging of application-
oriented files. During this process,
inconsistencies in both data definitions and data
values are identified. These inconsistencies must
then be resolved by the relevant user departments.
Structure of user department. The structure of a
user department may no longer be effective following
conversion; e.g., the user department may be
designed around a particular application system.
Restructuring application systems during conversion
may precipitate user reorganization.
New organizational roles. Conversion may cause new
organizational roles to evolve in user departments.
For example, in order to provide coordination
between Data Base Administration and the user
departments, a "user data administrator" may evolve.
The user data administrator serves as the focal
point for participation and involvement on the part
of the user department both during and subsequent to
the conversion.
Systems analysis. By providing such tools as a
high-level query language and/or a generalized
report writer, the DBMS enables non-technical users
to access the data base directly; i.e., users are
less dependent upon programmers to satisfy simple
requests for information. This increased
availability of data may result in the migration of
the systems analysis function from data processing
to user departments.
Personnel skill requirements. Conversion may impact
the skill requirements of user personnel. For
example, conversion to a data base environment may
also result in operating certain application systems
online, requiring that the user department acquire
or develop terminal operations skills.
3.3 MINIMIZING THE IMPACT OF FUTURE CONVERSIONS
The initial conversion to a data base environment is
not likely to be the only data base-related conversion that
an enterprise will undergo. Rather, conversions of one form
or another are likely to be a way of life. However, there
are certain measures that the data processing organization
can take to minimize the impact of future conversions,
including:
Institutionalization of the Data Base Administration
function.
Insulation of programs from a particular DBMS.
DBMS independent data base design.
3.3.1 Institutionalization of the DBA Function. A well-
established DBA function will minimize the impact of future
data base conversions. Specifically, the DBA function
should take action as follows:
Maintain data definitions and relationships in an
up-to-date data dictionary.
Document the structure and contents of all data
bases independently of the data description language
of the DBMS.
Develop methodologies and standards for data base
design which are independent of any particular DBMS.
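The actions listed above can be made concrete with a small sketch of a DBMS-independent data dictionary entry. The entity names, element types, and relationship labels below are hypothetical, chosen only to show definitions and relationships recorded in neutral terms rather than in any DBMS's data description language:

```python
# Minimal sketch of DBMS-independent data definitions: each entry
# records a description, its data elements, and its relationships
# to other entities, with no reference to a particular DBMS's DDL.

dictionary = {
    "EMPLOYEE": {
        "description": "One record per person on the payroll",
        "elements": {"emp_no": "identifier", "dept_no": "reference"},
        "relationships": [("WORKS_IN", "DEPARTMENT")],
    },
    "DEPARTMENT": {
        "description": "An organizational unit",
        "elements": {"dept_no": "identifier", "name": "text"},
        "relationships": [],
    },
}

def related_entities(name):
    """List entities reachable from `name` via recorded relationships."""
    return [target for _, target in dictionary[name]["relationships"]]

print(related_entities("EMPLOYEE"))
```

Because nothing here mentions a specific DBMS, the same entries could later be translated into whatever data description language a future conversion requires.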
3.3.2 DBMS Independent Data Base Design. The recent work of
the ANSI/X3/SPARC Study Group on DBMS introduced the idea of
a conceptual data model. The topic has also been pursued
extensively in research papers. In practical terms it
amounts to designing and documenting the data base in a
form independent of any particular DBMS. In translating the
conceptual model to the data description language of the
DBMS selected for implementation, the design decisions which
are predicated on the characteristics of the DBMS are more
clearly distinguishable from the natural structure of the
data. Should the DBMS be changed subsequent to
implementation, it is possible to focus more clearly on the
structural conversions required of the data base and the
applications accessing the data. As a point of
interest, use of the conceptual data model is appropriate
whether a DBMS or conventional files are to be used.
3.3.3 Insulate Programs From the DBMS. Two sets of
circumstances may motivate an organization to attempt to
insulate its application programs from any one DBMS. On the
one hand, an organization may be unwilling to commit
completely to the use of a particular DBMS. Rather, it may
desire to keep its options open with respect to converting
to a different DBMS at a later date. Alternatively, a large
multi-division corporation may desire to develop common
application systems for the divisions; yet it may find that
the data processing organizations within autonomous
divisions have installed different DBMS's.
It is possible to build an interface between
application programs and the DBMS in order to isolate the
programs from the DBMS. That is, the programs do not
interact directly with the DBMS. Rather, standard program
requests for DBMS services are translated into the required
DML statements either at compilation time or at execution
time. Thus, application programs are insulated from changes
in the DBMS as long as an interface module can be developed
to translate program requests into the DML statements of the
new DBMS. Similarly, a single application will execute
under any number of DBMS's as long as the appropriate
interface modules exist. Furthermore, the multi-division
corporation retains the flexibility of standardizing on a
single DBMS at a future date. The negative aspects of this
approach include reduced system efficiency and the
possibility of ending up with a pseudo-DBMS whose
capabilities represent the lowest common denominator of the
various DBMS's for which interfaces are built or planned.
Nevertheless, a number of corporations worldwide have
adopted or are adopting this approach.
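The insulation technique just described can be sketched as an adapter layer. The two "DBMS" classes and their DML verbs below are invented stand-ins, not real products; the point is only that the application issues one standard request while a per-DBMS interface module performs the translation:

```python
# Sketch of a program/DBMS insulation layer: application programs
# issue standard requests, and an interface module translates them
# into the DML of whichever DBMS is installed.

class Dbms1:
    def find_calc(self, key):        # hypothetical DBMS-1 DML verb
        return f"dbms1:{key}"

class Dbms2:
    def obtain(self, key):           # hypothetical DBMS-2 DML verb
        return f"dbms2:{key}"

class Interface:
    """Translates the standard GET request into the installed
    DBMS's own DML call."""
    def __init__(self, dbms):
        self.dbms = dbms

    def get(self, key):
        if isinstance(self.dbms, Dbms1):
            return self.dbms.find_calc(key)
        return self.dbms.obtain(key)

# The application program is unchanged when the DBMS is swapped.
def application(iface):
    return iface.get("PART-7")

print(application(Interface(Dbms1())), application(Interface(Dbms2())))
```

The lowest-common-denominator risk noted above shows up directly in this structure: the `Interface` can only expose requests that every underlying DBMS can satisfy.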
3.4 CONVERSION FROM ONE DBMS TO ANOTHER DBMS
As a data processing organization goes through the
experiential learning necessary to assimilate data base
technology, the functions and features of the data base
management system package currently installed will tend to
be more highly utilized. Users will have positive
experiences with the facilities offered by the DBMS and will
subsequently place greater burdens on those facilities.
Also, the technical capabilities of the DBMS will be
increasingly utilized by the data processing staff in order
to meet user requirements.
In short, the tendency to use the full functions of the
DBMS over time will place a strain on the capabilities of
the DBMS. This is manifested by either decreasing systems
processing efficiency or increasing effort necessary to
develop systems which meet user needs. These increased
costs are recognized by both users and data processing
personnel who then initiate a search for increased DBMS
capabilities and, thus, begin a data base conversion effort.
This second type of data base conversion can be
characterized by either a complete change in data base
management system packages or an upgrade in the version of
the DBMS currently installed. This section discusses the
impact of the conversion from one data base system
environment to another on each of the four growth processes
previously discussed. The section is organized as follows:
Reasons to go through the conversion.
Economic considerations of the conversion effort.
Conversion activities and their impacts.
Developing a strategy for the conversion.
3.4.1 Reasons for Conversion. As has been previously
discussed, the most prevalent reason to undertake a
conversion from one DBMS to another (a DBMS-1 to DBMS-2
conversion) is to install a better DBMS. A better DBMS is
usually defined as having:
Improved functions (more complete).
Better performance.
Improved query capability.
Development of richer data structures.
More efficient usage of the computer resource
through decreased cycles and/or space.
Improved or added communication functions.
Availability of transaction processing.
Distributed processing capability.
Another major reason to undertake a DBMS-1 to DBMS-2
Conversion is to standardize DBMS usage within the company.
Many large corporations are finding that the DBMS selections
made several years ago to meet specific application needs
have resulted in the installation of several DBMS packages
within the data processing organization. The impact of
multi-DBMS usage in a single data processing environment is
major. For example:
Application programs are constrained to the design
and processing characteristics unique to each DBMS.
Data files are structured to be accessed by a single
DBMS.
Design and programming personnel develop the skills
necessary to implement systems associated with a
single DBMS.
The multi-DBMS environment results in a substantial
investment in data processing personnel technical skills and
reduces the potential for integrating applications that
operate on different DBMS's.
For these reasons many companies are now developing
standards for data base management system usage. These
standards usually require application systems to be
developed under a single DBMS. Exceptions may exist where
the application to be developed is stand-alone in nature
with a
application to be developed is stand-alone in nature with a
low potential for integration with other systems.
The last major reason for DBMS-1 to DBMS-2 conversion
is that such a conversion is dictated by a hardware change.
Many of the commercially available DBMS's are offered by
large mainframe vendors. As such, a move from one hardware
vendor to another will necessitate a change in DBMS usage.
This can become quite a complex effort in that the source
code and data base storage structures of all programs will
require changes. If there is a history of hardware
conversions in the company, the wise data processing manager
should select a DBMS that is not hardware dependent.
3.4.2 Economic Considerations. A key concept introduced in
the previous section is that the data base conversion
effort should be analyzed and managed as a high-risk
systems development project. The same concept applies to a
DBMS-1 to DBMS-2 conversion, which should be justified on
the same basis as any other systems development project.
An economic justification should include the costs and
benefits associated with the data base conversion. The
economic justification is particularly important if the
major reason for the data base conversion is either a
better DBMS or standardization of DBMS usage rather than a
hardware change.
The economic justification for a DBMS-1 to DBMS-2
conversion should be based on a succinct articulation of the
COSTS and BENEFITS directly associated with the conversion
effort. Costs should be identified on an incremental basis
and be classified into three categories:
One-time conversion costs.
Incremental costs for each planned application to be
converted to the new DBMS.
On-going DBMS support costs.
Benefits should likewise be identified on an
incremental basis within the same time frames as the
associated costs. Benefits are divided into two categories:
Discernible/definable cost savings in development,
maintenance, and operations.
Intangible cost savings.
A more complete description of the types of economic
considerations to be addressed is contained in DATA BASE
DIRECTIONS: THE NEXT STEPS, National Bureau of Standards
Special Publication 451.
Above all, the justification for a data base conversion
should be developed and communicated to management in the
same manner that any other project is justified.
3.4.3 Conversion Activities and Their Impact. The impact of
a DBMS-1 to DBMS-2 conversion effort can be felt on each of
the four growth processes previously discussed. Many of the
types of impacts are the same as those previously identified
in a conversion to data base technology. Users and data
processing personnel should recognize that many of the same
experiential learning processes occur in subsequent data base
conversions as they do in the initial effort.
Impact On Application Portfolio. The impact of
subsequent data base conversions on application portfolios
occurs in three areas:
Application programs.
Data bases.
Catalogued modules.
Application programs. Because application programs are
buffered from actual data storage structures by the DBMS,
the unique characteristics of each DBMS will have a major
impact on application programs in the following areas:
. DBMS "call" structures.
. Program's view of data and mappings (model,
structure, content).
. Application program logic.
. Data communications.
Data bases. Physical data storage structures and logical
data relationships are implemented via unique DBMS utilities
and are patterned after distinct DBMS requirements. As
such, data bases developed under one DBMS are not readily
accessible by other DBMS packages. Specifically, data bases
are impacted by the vagaries of data base management systems
in the following ways:
. Data base definitions in both the DBMS and in the
Data Dictionary.
Data content and storage format.
Use of data base design and simulation aids.
Conversion aids.
Catalogued modules. Though processed just as any other
program, catalogued modules differ from application programs
in their function and method of development. The specific
types of catalogued modules which are impacted by a change
in DBMS are:
Catalogued queries.
Catalogued report definitions.
Catalogued transaction definitions.
Impact On the Data Processing Organization. As
previously discussed in the section on conversion to the
data base environment, the major impact on the data
processing organization structure is the implementation of
the Data Base Administration organization. Because this
change has already been integrated into the data processing
environment during the initial conversion, the conversion
from one DBMS to another will not have a major impact on
it. Only procedural fine-tuning will be required as the
functions of the DBA change. However, it should be
recognized that a substantial learning curve will likely
exist as the new DBMS technology is assimilated by DBA
personnel. Other organizational structure changes in the
data processing environment as a result of the DBMS-1 to
DBMS-2 conversion will be minimal.
The major organizational impact throughout both data
processing and users areas is likely to be in the technical
and functional education required before, during, and after
the conversion effort. Data processing and user personnel
in all areas of systems development and operation will have
to be trained on the new aspects of the DBMS. Training
programs for all people should be identified and initiated
in advance of the conversion implementation.
Another major area of impact on the data processing
organization from a data base conversion is the
modifications in documentation necessary to accommodate the
new DBMS environment. Changes in documentation will occur
in the following areas:
DBMS functional and technical support (reference)
documentation .
Functional and technical descriptions of any
application systems converted onto the new DBMS.
Physical and logical descriptions of any data bases
converted.
. User-oriented descriptions of application systems
processing characteristics.
. System development methodology documentation that
references particular aspects of data base or
application development.
Impact On Data Processing Management Control Systems.
As previously discussed, Data Processing Management Control
Systems comprise those sets of procedures regularly used to
control both systems development and operations functions.
The conversion from one DBMS to another is not going to
modify the conceptual framework used to control the data
processing environment. However, specific changes will
affect the mechanics of control:
Data processing budgeting and user chargeback. The
chargeback algorithm used to charge users for data
processing services is likely to change because of
modifications in:
DBMS overhead (cycles).
DBMS space requirements.
methods of implementing logical relationships.
ownership of data items.
differences in efforts required to design and
implement application systems.
differences in methods used to structure ad-hoc
queries and periodic reports.
methods of charging end-user cost centers for the
one-time costs of conversion. The time-frame of
allocating these charges can also be important (one
lump sum vs. periodic payments).
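The chargeback modifications above can be illustrated with a toy computation. Everything here is invented for the example (the rates, the usage fields, and the choice to amortize conversion costs as periodic payments rather than one lump sum); a real installation would substitute its own algorithm.

```python
# Illustrative sketch of a user chargeback computation that folds in
# DBMS overhead (cycles), space requirements, and an amortized share
# of one-time conversion costs.  All rates are hypothetical.

def monthly_charge(cpu_seconds, pages_stored, conversion_share,
                   cpu_rate=0.05, page_rate=0.002, amortize_months=12):
    """Charge a user cost center for DBMS usage plus a periodic
    payment toward its share of one-time conversion costs."""
    usage = cpu_seconds * cpu_rate + pages_stored * page_rate
    conversion = conversion_share / amortize_months
    return round(usage + conversion, 2)

# A center that used 1000 CPU seconds, stored 5000 pages, and owes a
# $1200 share of conversion costs, paid off over a year:
print(monthly_charge(1000, 5000, 1200))
```

Changing `amortize_months` to 1 models the lump-sum alternative the text mentions, which is exactly the kind of time-frame decision the chargeback algorithm must make explicit.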
Systems development methodology.
There will be changes in the time frame and types of
effort required in systems development.
Conceptual approach to developing systems may change
due to total effort or time frame required to
generate sample reports on test data bases.
Design procedures in the methodology not likely to
change if DBMS facilities are similar; only jargon
should change in documentation.
Data processing performance measurement.
Systems development and operations standards by
which data processing personnel are regularly
measured should change due to new functions and
especially to a new learning curve.
Computer operations performance monitors and
standards will change due to new processing
technology.
Integrity control.
Adequate data base backup should be carefully
analyzed and managed during the conversion process.
Operational restart/recovery procedures will change
due to new DBMS functions or utilities.
Processing of data exceptions may differ.
Security control.
Differences in methods of data access security
should be recognized.
Data manipulation restrictions may vary from DBMS-1
to DBMS-2.
Privacy control.
Where appropriate, special care should be taken that
all privacy disclosures are logged during a data
base conversion per recent government regulations.
Impact On User Areas. A keynote of data base
conversions is that the conversion should be as transparent
as possible to user areas. This axiom holds that the
processing impact on user areas should be held to a minimum
and that the necessary technical capabilities to support the
conversion should reside in the data processing area.
The impact of the data base conversion effort may,
however, be readily apparent to users in the following areas:
Functional improvements (e.g., new query language).
Increased data content (e.g., "while we are
changing, let's add ...").
Increase in sharing of data will highlight data
inconsistencies, validations, and format errors.
Possible planned disruption of services during
conversion period.
Data ownership changes.
User mental images or expectations may change.
Archival data capabilities may change (e.g., meeting
the needs of IRS, EEO, etc.).
3.4.4 Developing a Conversion Strategy. The well-managed
data processing installation should carefully articulate a
data base conversion strategy and plan before initiating any
conversion effort. Specifically, the
Since this step seeks to create a data structure of the
source data without system-dependent information, one can
consider the mapping between the input and the output of the
reformat process to be generally one-to-one. While this
step looks simple functionally, its actual application and
implementation can be quite complex. For example, an
application program may use the high order bits of a zoned
decimal number for its own purposes, knowing that these bits
are not used by the system. Such specifications of
nonstandard item encodings present a difficult problem in
data conversion.
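The zoned-decimal difficulty described above can be made concrete with a small decoding sketch. The byte values are a standard EBCDIC zoned-decimal layout; the idea of an application stashing flags in the zone bits comes from the text, but the specific bytes below are invented for illustration.

```python
# A minimal sketch of the item-level decoding a reformat step must
# perform: decode an EBCDIC zoned-decimal field by masking off the
# zone (high-order) nibbles, where an application may have hidden
# its own flag bits.

def decode_zoned_decimal(raw: bytes) -> int:
    """Interpret raw EBCDIC zoned-decimal bytes as an integer.
    Only the low nibble of each byte is a digit; the zone nibble
    of the final byte conventionally carries the sign (0xD means
    negative)."""
    digits = [b & 0x0F for b in raw]   # strip zone/flag bits
    value = 0
    for d in digits:
        value = value * 10 + d
    sign_zone = raw[-1] >> 4           # high nibble of last byte
    return -value if sign_zone == 0xD else value

# 0xF1 0xF2 0xD3 encodes -123.  A nonstandard flag bit in a zone
# nibble (e.g. 0xB2 in place of 0xF2) still decodes to digit 2,
# but a generalized converter that copies bytes blindly would
# carry the hidden flag along or destroy it.
print(decode_zoned_decimal(bytes([0xF1, 0xF2, 0xD3])))
```

This is why nonstandard item encodings are hard for a generalized tool: the zone bits mean one thing to the system and another to the application, and only the application knows which.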
The load process is the counterpart of the unload
process and needs no further clarification. Note, however,
that the use of a common data form provides additional
benefits, such as easing the portability problem.
The restructuring process undoubtedly represents the
most complex process of a generalized data conversion
system. The languages for this mapping process can differ
widely (for example, some procedural and other
nonprocedural) and the models used to represent the data in
the conversion system are also quite divergent. (For
example, some use network structures; others use
hierarchical structures). More will be said on this topic
later in this section.
Let us now turn to discuss the issue of implementation
briefly. Generally, there are two techniques: an
interpretive approach or the generative approach. In the
interpretive approach, the action of the system will be
driven by the descriptions written in the system's languages
via the general interpreter implemented for the particular
system. In the generative approach, the data and mapping
descriptions are fed into the compiler(s) which generates a
set of customized programs executable on a certain machine.
Later in this section we'll discuss the merits of each of
these approaches.
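The interpretive/generative distinction can be shown in miniature. The mapping description format below is invented for the example; the point is only that the interpreter walks the description for every record, while the generator compiles it once into a customized program.

```python
# Toy contrast of the two implementation techniques, assuming a
# mapping description of the form {target_field: source_field}.

mapping = {"EMP_NAME": "name", "EMP_DEPT": "dept"}

# Interpretive: a general interpreter consults the description at
# run time for every record processed.
def interpret(record, mapping):
    return {tgt: record[src] for tgt, src in mapping.items()}

# Generative: the description is fed to a "compiler" that emits a
# customized program (here, Python source), which then runs
# against the data with the description folded in.
def generate(mapping):
    body = ", ".join(f"{tgt!r}: record[{src!r}]"
                     for tgt, src in mapping.items())
    code = f"def convert(record):\n    return {{{body}}}"
    namespace = {}
    exec(code, namespace)   # build the custom program once
    return namespace["convert"]

record = {"name": "Smith", "dept": "D21"}
convert = generate(mapping)
print(interpret(record, mapping) == convert(record))
```

The trade-off this exposes is the classic one: the interpreter is simpler and more flexible, while the generated program avoids re-reading the description on every record, which matters at data-conversion volumes.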
Turning our attention to the tools that have been
developed for data conversion, we shall first discuss
currently available tools and then the research and
development work in progress.
Available Conversion Tools. Currently available tools
have limited capabilities. Because it is impossible in this
short report to provide an exhaustive survey of all the
vendor-developed conversion tools, we will highlight the
spectrum of capabilities available to the user by providing
examples from specific vendor shops.
The repertoire of vendor conversion tools begins at the
character encoding level of data conversion with the
provision of hardware/firmware options and continues through
the software aids for conversion and restructuring of data
bases.
Depending on a diversity of conditions, the need to
develop software tools varies from vendor to vendor.
Probably the most prevalent vendor conversion tool is a
COBOL foreign file processing aid. This type of facility
allows the direct reading or writing of a particular class
of files such as EBCDIC tapes or Honeywell Series 200/2000
files within COBOL. Although a relatively widespread
facility, its capabilities are nevertheless limited. For
example, some do not handle unlabeled tapes, while others
cannot process mixed mode data types. Aside from the work
of Behymer and Bakkom [DT11], which was aimed toward
achieving a general conversion bridge with a particular
vendor, to our knowledge there are no vendor supported
generalized file translation tools.
In contrast to the above, data base restructuring tools
have been developed that have their main application in a
data base environment. One example of a conversion aid is
the I-D-S/I to I-D-S/II migration aid provided by
Honeywell. Because of the large volumes of data involved
and the fact that the user shops cannot afford to shut down
the whole data processing environment, a coexistence
approach was adopted. The first step is to reformat the
I-D-S/I data base into the I-D-S/II format, making the
necessary data type conversions and pointer mechanism
adaptations. This step allows the data base to be
processed by I-D-S/II, but not optimally. Additional steps
in the migration chain include the generation of the
"Prior & Header" chain pointers and fields which I-D-S/II
requires but which are not allocated in the first step, and
the restructuring of the data base to take advantage of the
more sophisticated capabilities of I-D-S/II.
Some data base restructuring tools specific to a
particular DBMS have been developed by DBMS users. One
example of this type of tool is REORG [R11], a system
developed at Bell Laboratories for reorganization of UNIVAC
DMS-1100 data bases. REORG provides capabilities for
logical and physical reorganization of a data base using a
set of commands independent of DMS-1100 data management
commands. A similar capability has been developed at the
Allstate Insurance Company.
In addition to the above, there are also software
companies and vendors who will do a customized conversion
task on a contractual basis.
Data Conversion Prototypes and Models. Over the past
seven years, a great deal of research on the conversion
problem has been performed, with the results summarized in
Figure 6-4. The University of Michigan, the University of
Pennsylvania, IBM, SDC, and Bell Laboratories initiated
projects, as well as a task group of the CODASYL Systems
Committee. In many cases, interaction and cross-
fertilization between these groups led to some consensus on
appropriate architectures for data conversion. The
individual achievements of these groups are discussed below:
The CODASYL Stored-Data Description and Translation
Task Group. In 1970, the CODASYL Systems Committee formed a
task group (originally called the Stored Structure
Description Language Task Group) to study the problem of
data translation. The group presented its initial
investigation of the area in the 1970 SIGMOD (then SIGFIDET)
annual Workshop in Houston [SL1]. In 1972, the group was
reformulated as the Stored-Data Description and Translation
Task Group and presented a general approach to the
development of a detailed model for describing data at all
levels of implementation [SL4,DT2]. The most recent work of
the group specifies the data conversion model and presents
an example language for describing and translating a wide
class of logical and physical structures [SL8]. The
stored-data definition language allows data to be described
at and distributed to the access path, encoding, and device
levels.
The University of Michigan. The nonprocedural approach
to stored-data definition set forth by Taylor and Sibley
[SL3,6] provided one of the major foundations for the
development at the University of Michigan (see Figure 6-4)
of data translators. In concert with Taylor's language,
Fry, et al. [DT1] initiated a model and design for a
generalized translation. The translation model was tested
in a prototype implementation of the Michigan Data
Translator in 1972 [UT2,4], and the results of the next
implementation, Version I, were reported by Merten and Fry
[DT4].
In 1974, the work of the Data Translation Project of
the University of Michigan focused on the data base
restructuring problem. Navathe and Fry investigated the
hierarchical restructuring problem by developing several
levels of abstractions, ranging from basic restructuring
types to low level operations [R6].
Figure 6-4
Historic Context of Data Conversion Efforts
Figure 6-4 (continued)
Historic Context of Data Conversion Efforts
Later, Navathe proposed a methodology to accomplish these
operations using a relational normal form for the internal
representation [DT12]. Version II of the Michigan Data
Translator was designed to perform hierarchical
restructuring transformations, but the project did not
implement it. Instead, the research was directed into the
complex problem of restructuring network type data bases.
To address this problem, Deppe developed a dynamic data
model--the Relational Interface model--which simultaneously
allowed a relational and network view of the data base
[UR3]. This model formed the basis of the Version IIA
design and implementation of generalized restructuring
capabilities [UT8,9,10]. Another component necessary for
the development of a restructurer was the formulation of a
language in which to express the source to target data
transformations. This language, termed Translation
Definition Language (TDL), evolved through each translator
version beginning with a source-to-target data item "equate
list" in the Version I Translator to the network
restructuring specifications of Version IIA. While the
initial version of the TDL was quite simplistic, the current
version, the Access Path Specification Language [DT16,R9],
provides powerful capabilities for transforming network data
bases.
The University of Pennsylvania. Concurrent with the
work at the University of Michigan, the University of
Pennsylvania (see Figure 6-4) also took the data
description approach and developed a stored-data definition
language (SDDL) for defining storage of data structures and
devices, and a translation description language (TDL)
[SL2,DT2]. The three levels of data base structures, the
logical, storage, and physical, are described using the
SDDL. In order to describe the source-to-target mappings,
a first order calculus language was used. Following from
this work, Ramirez [DT3,6] implemented a language-driven
"generative" translator which created PL/1 programs to
perform the conversion. One of the first reports on the
utilization of generalized translation tools was provided
by Winters and Dickey [TA1]. Using the translator
developed by Ramirez, they installed it on their system and
applied it to converting their IBM 7080 files.
IBM Research, San Jose. In 1973, another major data
translation research endeavor was initiated at the IBM
Research Laboratory in San Jose, California. Researchers in
this project (initially Housel, Lum, and Shu, later joined
by Ghosh and Taylor) adopted the general model as specified
in Figure 6-1 but made several innovations. First, in the
belief that programmers know well the structure of the data
in a buffer being passed from a DBMS to the application
program, the group concentrated its effort on designing a
data description language appropriate for describing data at
this stage. Second, regardless of the data model underlying
any DBMS, the data structure at the time it appears in the
buffer of an application program will be hierarchical. The
general architecture, methodology, and languages reflecting
these beliefs are reported in Lum et al. [DT14].
In addition, the group in San Jose felt that, while it
is desirable to have a file with homogeneous record types,
it is a fact of life that many of today's data are still in
COBOL files in which multiple record types frequently exist
within the same file. As a result, the group concentrated
on designing a data description language which can describe
not only hierarchical records (in which a relational
structure is a special case) but also most of the commonly
used sequential file structures. This language, DEFINE, is
described by Housel et al. [SL7].
The philosophy of restructuring hierarchies is further
reflected in the development of the translation definition
language CONVERT, as reported by Shu et al. [R2]. This
language, algebraic in structure, consists of a dozen
operators, each of which restructures one or more
hierarchical files into another file. The language
possesses the capability of selecting records and record
components, combining data from different files, applying
built-in functions (e.g., SUM and COUNT), and creating
fields and varying selection on the basis of a record's
content (a CASE statement).
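These CONVERT capabilities can be illustrated in Python. This is not CONVERT itself; the operator names, the nested-dict file layout, and the data are invented to show the algebraic style the text describes, where each operator takes one or more hierarchical files and yields another file.

```python
# Illustration of CONVERT-style operators over a hierarchical file,
# modeled here as a list of nested dicts.  All names and data are
# hypothetical.

dept_file = [
    {"dept": "D21", "emps": [{"name": "Smith", "sal": 100},
                             {"name": "Jones", "sal": 150}]},
    {"dept": "D34", "emps": [{"name": "Brown", "sal": 120}]},
]

def select(file, predicate):
    """Record selection on the basis of record content."""
    return [rec for rec in file if predicate(rec)]

def total(file, list_field, item_field, out_field):
    """An operator in the spirit of the built-in SUM: collapse a
    repeating group into a single aggregate field."""
    return [{**{k: v for k, v in rec.items() if k != list_field},
             out_field: sum(e[item_field] for e in rec[list_field])}
            for rec in file]

# Compose operators algebraically: keep multi-employee departments,
# then replace the employee group with its salary total.
payroll = total(select(dept_file, lambda r: len(r["emps"]) > 1),
                "emps", "sal", "total_sal")
print(payroll)
```

The composability shown here (the output of one operator feeding the next) is the essential property of an algebraic restructuring language.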
A symmetric process occurs at the output end of the
translation system. Sequential files are created to match
the need of the target loading facility. The specification
of this structure is again made in DEFINE.
A prototype implementation, originally called EXPRESS
but renamed XPRS, is reported in [DT15].
System Development Corporation. Another restructuring
project, reported by Shoshani [R3,4], was performed at the
System Development Corporation in 1974-1975. In order to
avoid the complexities of storage structure specification
(i.e., indexes, pointer chains, inverted tables, and the
like) they chose to use existing facilities of the systems
involved. In particular, they advocated the use of query
and load (generate) facilities of data base management
systems. However, when such facilities do not exist,
reformatters from the source (e.g., index sequential file)
to a standard form and from the standard form to the output
file had to be used. Given that data bases can be
reformatted to and from a standard form, they concentrated
on the problem of logical restructuring of hierarchical data
bases in this form.
The language used in the above project for specifying
the restructuring functions (called CDTL--Common Data
Translation Language) was designed to be conceptually
simple. For the most part, it provides functions for
specifying a mapping from a single field (or combination of
fields) of the source to a single field of the target. For
example, while a DIRECT would specify a one-to-one mapping
of source items to target items, a REPEAT would specify the
repetition of a source item for all instances of a lower
level in the target hierarchy. In both cases, only the
source and target fields need to be mentioned as parameters.
In addition, there are more global operations, such as the
INVERSION operator, which causes parent/dependent record
relationships to be reversed. The system also supported
extensive field restructuring operators, where individual
field values could be manipulated according to prescribed
language specifications. Since most of these operators are
local, there is the possibility that they could be used in
combinations that do not make sense globally. Therefore, a
further component of the system was built to perform
"semantic analysis," which checks for possible
inconsistencies before proceeding to generate the target
data base.
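The DIRECT and REPEAT mappings can be rendered as a toy sketch. The operator names follow the text, but the Python realization and the data layout are invented; real CDTL specifications named only fields, not code.

```python
# A toy rendering of CDTL-style field mappings.  The source record
# and target hierarchy are hypothetical.

source = {"dept": "D21", "emps": [{"name": "Smith"},
                                  {"name": "Jones"}]}

def direct(src_field, tgt_field, source, target):
    """DIRECT: one-to-one mapping of a source item to a target item."""
    target[tgt_field] = source[src_field]

def repeat(src_field, tgt_field, source, target, group):
    """REPEAT: repeat a source item for all instances of a lower
    level in the target hierarchy."""
    for instance in target[group]:
        instance[tgt_field] = source[src_field]

# Build a target hierarchy, then apply the two mappings.  Note that
# each call names only the fields involved, as the text describes.
target = {"employees": [dict(e) for e in source["emps"]]}
direct("dept", "department", source, target)
repeat("dept", "dept_code", source, target, "employees")
print(target)
```

The local character of such operators is also visible here: nothing in `direct` or `repeat` prevents a globally nonsensical combination, which is why the system needed a separate semantic-analysis pass.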
Bell Laboratories. The Bell Labs data translation system
ADAPT (A Data Parsing and Transformation system), currently
under development, is a generalized translation system
driven by two high-level languages [DT17]. With the first
language one describes the physical and logical format and
structure of the data and provides various tests and
computations while parsing the source data and generating
the target data. The second language is used to describe
the transformations which are to be applied to the source
data to produce the target data. Extensive validation
criteria can be specified to apply to the source and target
data.
Two processing paths are available within the ADAPT
system: a file translation path and a data base translation
path (see Figure 6-3). A separate path for file translation
responds to real-world considerations: many types of
conversions do not require the capabilities and associated
high overhead involved in using a data base translation
path.
Related Work. Additional research effort examines the
development and acceptance of a standard interchange form.
An interchange form would increase the sharing of data bases
and provide a basis for development of generalized data
translators. The Energy Research and Development
Administration (ERDA) has been supporting the
Interlaboratory Working Group for Data Exchange (IWGDE) in
an effort to develop a proposed data interchange form. The
proposed interchange form [GG2] has been used by several
ERDA laboratories for transporting data between the
laboratories. Additional work on development of interchange
forms has been pursued by the Data Base Systems Research
Group at the University of Michigan [UT14].
Navathe [R10] has recently reported a technique for
analyzing the logical and physical structure of data bases
with a view to facilitating the restructuring specification.
Data relationships are divided into identifying and
nonidentifying types in order to draw an explicit schema
diagram. The physical implementation of the relationships
in the schema diagram is represented by means of a schema
realization diagram. These diagrammatic representations of
the source and target data bases could prove to be very
useful to a restructuring user.
6.2.2 Application Program Conversion. So far, we have
concentrated on the data aspects of the conversion problem;
it is necessary to deal as well with the problems of
converting the application programs which operate on the
data bases. Program conversion, in general, may be
motivated by many different circumstances, such as hardware
migration, new processing requirements, or a decision to
adopt a new programming language. Considerable effort has
been devoted to special tools such as those to assist
migration among different vendors' COBOL compilers, and
general purpose "decompilers" that have been developed to
translate assembly language programs to equivalent software
in a high level language. While progress has been made
developing special purpose tools for a limited program
conversion situation, little progress has been made in
obtaining a solution to the general problem of program
conversion. With this fact in mind, this section focuses on
the modifications to application programs that arise as a
consequence of data restructuring/conversion.
Problem Statement. There are three types of data base changes
which can affect application programs:
alterations to the data base physical structure, for
example, the format and encoding of data, or the
arrangement of items within records
changes to the data base logical structure, either:
a. the deletion or addition of access paths to
accommodate new performance requirements, or
b. changes to the semantics of data, for example,
modification of defined relationships between record
types or the addition or deletion of items within
records
migration to a new DBMS, perhaps encompassing a data
model and/or data manipulation language different
from the one currently in use
The actual impact of these changes on application programs
is a function of the amount of data independence provided by
the Data Base Management Systems. Data independence and its
relationship to the conversion problem are discussed
elsewhere [GG3]. We assume here incomplete data independence
and that therefore some degree of program conversion is
required in response to data base schema changes. In fact,
whereas most commercial data base management systems provide
application programs with insulation from a variety of
modifications to the physical data base, protection from
logical changes--particularly at the semantic level--is
minimal. Examples of semantic changes likely to have a
profound effect on application programs include:
Changes in relationships between record types, such
as changing a one-to-many association to a many-to-
many association or vice-versa.
Deletion or addition of data items, record types, or
record relationships.
Changing derivable information ("virtual items") to
explicit information ("actual items") or vice-versa.
Changes in integrity, authorization, or deletion
rules.
Various properties of data base application programs
greatly complicate the conversion problem. For instance,
many data base management systems do not require that the
record types of interest (or possibly even the data base of
interest) be declared at compile time in the program; rather
these names can be supplied at run time. Consequently, at
compile time, incomplete information exists about what
data the program acts on. Other troublesome problems occur
when programs implicitly use characteristics of the data
which have not been explicitly declared (e.g., a COBOL
program executes a paragraph exactly ten times because the
programmer knows that a certain repeating group only occurs
ten times in each record instance). Complexity is
introduced whenever a data manipulation language is
intricately embedded in a host language such as COBOL. The
interdependence between the semantics of the data base
accesses and the surrounding software greatly complicates
the program analysis stage of conversion. Because of these
considerations, substantial research has been devoted to
alternatives to the literal translation of programs. In
particular, some currently operational tools utilize source
program emulation or source data emulation at run time to
handle the problem of incomplete specification of semantics
and yet still yield the effects of program conversion.
Current Approaches. In this section, we discuss two
main techniques currently employed in the industry. These
techniques are commonly used but unfortunately not
documented in the form of publications.
DML Statement Substitution
The DML substitution conversion technique, which can be
considered an emulation approach, preserves the semantics of
the original code by intercepting individual DML statement
calls at execution time, and substituting new DML statement
calls which are correct for the new logical structure of the
data base. Two IBM software examples which provide this
type of conversion methodology are 1) the ISAM compatibility
interface within VSAM (this allows programs using ISAM calls
to operate on a VSAM data base), and 2) the BOMP/DBOMP
emulation interface to IMS. This program conversion
approach becomes extremely complicated when the program
operates on a complex data base structure. Such a situation
may require the conversion software to evaluate each DML
operation against the source structure to determine status
values (e.g., currency) in order to perform the equivalent
DML operation on the new data base. Generalization of this
approach requires the development of emulation code for the
following cases: maintain the run time descriptions and
tables for both the original and new data base
organizations, intercept all original DML calls, and utilize
old-new data base access path mapping description (human
input) and rules to determine dynamically what set of DML
operations on the new data base are equivalent to each
specific operation on the source data base.
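The interception-and-substitution idea just described can be sketched in modern terms. The Python fragment below is an illustration only; the DML verb name, rule table, and currency handling are hypothetical stand-ins, not taken from any of the IBM products mentioned above. An original DML call is intercepted at run time and an emulation routine, which maintains currency status, performs the equivalent operation against the new data base organization.

```python
# Minimal sketch of DML statement substitution (all names hypothetical).
# Old-style calls are intercepted at run time and replaced by operations
# that are correct for the new logical structure of the data base.

def emulate_get_next(new_db, state):
    """Emulate a sequential GET NEXT against a keyed target store,
    maintaining the 'currency' status value the old program relies on."""
    keys = sorted(new_db)
    pos = state.get("currency")              # current position, if any
    nxt = keys[0] if pos is None else next((k for k in keys if k > pos), None)
    if nxt is None:
        return None                          # end of data base reached
    state["currency"] = nxt                  # update currency for later calls
    return new_db[nxt]

# Mapping rules: old DML verb -> routine emulating it on the new data base.
RULES = {"GET_NEXT": emulate_get_next}

def intercept(call, new_db, state):
    """Intercept an original DML call and run its substitution rule."""
    return RULES[call](new_db, state)

new_db = {2: "rec-b", 1: "rec-a", 3: "rec-c"}
state = {}
print(intercept("GET_NEXT", new_db, state))   # rec-a
print(intercept("GET_NEXT", new_db, state))   # rec-b
```

Even this toy version shows where the overhead comes from: every source DML statement pays for a table lookup, status bookkeeping, and an extra layer of procedure calls.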
Although a conceptually straightforward approach, it
has several drawbacks. The drawbacks can be categorized as
degraded efficiency and restrictiveness. Efficiency is
degraded primarily because each source DML statement must be
mapped into a target emulation program, which uses the new
DBMS to achieve the same results. The increased overhead in
program size and/or processing requirements can be
significant.
The drawbacks of restrictiveness come about because the
emulation approach inhibits the utilization of the increased
capabilities of the new DBMS and/or data structure through
the modeling of the old program semantics. Additionally,
dependence upon the semantics of the source programs limits
the sets of permissible new data structures to sets that
must support all of the semantics of the source programs;
i.e., the source program is to continue to execute in the
same manner. Note that the rules that must preserve
semantics can be quite complex, even for limited changes in
data structure. Therefore, just the limited task of
determining if a change in data structure (given no semantic
change in the data model) will support a set of source
programs will be an extensive task.

Bridge Program

The second method in use today is sometimes referred to
as the Bridge Program Method. In this technique, the source
application program's access requirements are supported by
reconstructing from the target data base that portion of the
source data base needed. Data reconstruction is done by
means of "bridge programs." The source program is then
allowed to operate upon this reconstructed portion of the
source data base to effect the same results that would occur
if the source data base were not modified. Of course, a
reverse mapping is required to reflect an update, and each
simulated source data base segment must be prepared before
it is needed by the application program.
This approach suffers from the same types of
disadvantages inherent in the emulation approach.
Efficiency problems for complex/extensive data bases and
programs performing extensive data accessing can make this
method prohibitively expensive for practical utilization.
This technique is generally found as a "specific software
package" developed at a computer installation rather than as
a standard vendor supplied package.
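The bridge-program idea can be sketched as follows. This is a deliberately tiny, hypothetical example (the record layouts, names, and data are invented): the conversion split a single name field into two, and a bridge routine reconstructs the old source layout on demand so the unmodified application logic can keep running.

```python
# Minimal sketch of a bridge program (all names hypothetical).

# Target data base: the conversion split NAME into two fields.
target_db = {101: {"first": "ADA", "last": "LOVELACE"},
             102: {"first": "ALAN", "last": "TURING"}}

def bridge_fetch(key):
    """Reconstruct, from the target data base, the record segment in the
    old source layout before the application program needs it (this is
    the reverse mapping of the data conversion)."""
    rec = target_db[key]
    return {"id": key, "name": rec["first"] + " " + rec["last"]}

# The unmodified source-program logic operates on the rebuilt segment.
old_style = bridge_fetch(101)
print(old_style["name"])    # ADA LOVELACE
```

The efficiency problem noted above is visible even here: every access by the old program triggers a reconstruction, and an updating program would additionally need the reverse mapping applied on the way back.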
Current Research . Differing from the emulation and
bridge program approaches, current research aims towards
developing more generalized tools to automatically or semi-
automatically modify or rewrite application programs. The
drawbacks of the existing approaches described above can be
avoided by rewriting the application programs which would
take advantage of the new structure and semantics of a
converted data base and by using a general system to do the
conversion rather than using ad hoc emulation packages and
bridge programs.
Research on application program conversion is still in
its infancy. Consequently, very few published papers on
this subject exist. This section describes a handful of
works in the order of the dates of publication. Mehl and
Wang [PT6] presented a method to intercept and interpret
DL/1 statements to account for some order transformations of
hierarchical structures in the context of the IMS system.
Algorithms involving command substitution rules for various
structural changes have been derived to allow the correct
execution of the old application programs. This approach
works only for a limited number of order transformations of
segments in a logical IMS data base. Since it is basically
an emulation approach, it has the drawbacks discussed in the
previous section.
A paper by Su [PT12] gives a general model of
application program conversion as related to data base
changes resulting from a data base transformation. An
attempt was made to identify the tasks required for the
automatic or semi-automatic conversion of application
programs due to data base changes. The paper stresses two
main points: 1) the need for extensive analysis of an
application program, including the analysis of program logic,
data variable relationships, program-subprogram structure,
execution profile, etc.; and 2) the use of data base
translation operators to determine what transformations are
required to account for the effects of these operators. The
idea of a common language to describe the operations of
source queries and the data translation statements is also
proposed.
An approach to the transformation of DBTG-like programs
in response to a data base restructuring was proposed by
Schindler [PT10]. The approach is based on the concept of
code templates, which are predefined sequences of host
language--DML statements (roughly analogous to assembly
language macros). Application programs can be written as
nested code templates. The code templates are devised so
that each one corresponds to an operator in the relational
algebra. An application program is then mapped into a
relational expression, transformations are performed on the
expression to accommodate the data base restructuring, and a
new program is generated by mapping the transformed
expression back into code templates. This approach suggests
that a level of logical data independence may be achieved
through current programming technology.
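The program-as-relational-expression part of the template scheme can be illustrated with a small sketch. The functions and data below are hypothetical stand-ins for code templates, each corresponding to a relational-algebra operator; a "program" is then just a nested expression of these operators, which is the form a restructuring tool would rewrite.

```python
# Hedged sketch of code templates as relational-algebra operators
# (all names and data hypothetical).

def select(rows, pred):          # template for relational SELECT
    return [r for r in rows if pred(r)]

def project(rows, cols):         # template for relational PROJECT
    return [{c: r[c] for c in cols} for r in rows]

emp = [{"name": "A", "dept": 1, "salary": 50},
       {"name": "B", "dept": 2, "salary": 70}]

# An application "program" written as a nested template expression;
# a restructuring step would transform this expression, then map it
# back into concrete host-language--DML code.
expr = lambda rows: project(select(rows, lambda r: r["dept"] == 2), ["name"])
print(expr(emp))    # [{'name': 'B'}]
```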
The work by Su and Reynolds [PT15] studied the problem
of high-level sublanguage query conversion using the
relational model with SEQUEL [Z5] as the sublanguage, DEFINE
[SL7] as the data description language, and CONVERT [R2] as
the translation language. Algorithms for rewriting the
source query were derived and hand simulated. In this
study, query transformation is dictated by the data
translation operators which have been applied to the source
data base. The purpose of this work was to study the
effects of the CONVERT operators on high-level queries.
Only restricted types of SEQUEL queries were considered.
This work demonstrates that a general program conversion
system should separate the data model and schema dependent
factors from the data model and schema independent factors;
an abstract representation of program semantics and the
semantics of data translation operators need to be sought so
that data conversions at the logical level (especially the
type which changes the data base semantics) and the DBMS
level can be attempted.
the DBMS
Two independent works carried out about the same time by
Su and Liu [PT13] and Housel [PT14] take a more general
approach to the application program conversion problem. The
former work is based on the idea that the same data
semantics (a conceptual model) can be modelled externally by
various existing data models (relational, hierarchical, and
network) using different schemas. Application programs are
mapped into an abstract representation which represents
program semantics in terms of the primitive operations
(called access patterns) that can be performed on data
entities and associations. Transformation rules are then
applied on the abstract representation based on the types of
changes introduced by the data translation operators. The
transformed representation is then mapped into another
intermediate representation (called access path graphs)
which is dictated by the external model and specific schema
used for the target data base. This representation is then
modified by an optimization component and used for the
generation of target programs. This work stresses that the
semantics of both the source and target data base be made
explicit to the conversion system and be used as a basis for
application program analysis and transformation. The
conversion methodology described is for program conversion
to account for data conversion at the logical level as well
as the DBMS level.
Housel extends the work on application migration
undertaken at the IBM San Jose Laboratory. This work uses a
common language for specifying the abstract representation
of source programs as well as for specifying the data
translation operations. The language is a subset of CONVERT
with some of Codd's relational operators [GG4]. The
operators of the language are designed to have a simple
semantics and convenient algebraic properties to facilitate
program transformation. They are designed to handle data
manipulation in a general hierarchical structure called a
"form" as well as relational tables. In this system,
program transformation is dictated by the data mapping
operations applied to the source data base. The proposed
model assumes that the inverse of these data mapping
operators exists; i.e., the source data base can be
reconstructed from the target data base by applying some
inverse operators on the target data base. More precisely,
it is assumed that M'(T) = S where S is the source data
base, T is the target data base, and M is the mapping
function. Thus, program conversion is done by substituting
the inverse M'(T) into the specification language statements
(the abstract representation of the source program) for each
reference to the source data base. This process is followed
by a simplification procedure to simplify the resulting
statements (the target abstract representation of the
program). The author points out that the assumption on the
existence of M'(T) restricts the scope of the conversion
problem handled by the proposed approach.
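The invertibility assumption M'(T) = S can be made concrete with a toy example. In this hypothetical sketch (the mapping and data are invented for illustration), M keys flat source records by identifier, M_inverse plays the role of M', and a program reference to the source S is replaced by M_inverse(T), exactly the substitution step described above.

```python
# Sketch of the M'(T) = S assumption (all names hypothetical).

def M(source):
    """Forward data mapping: key the flat source records by id."""
    return {r["id"]: r for r in source}

def M_inverse(target):
    """Inverse mapping M': reconstruct the source from the target."""
    return sorted(target.values(), key=lambda r: r["id"])

S = [{"id": 1, "v": "x"}, {"id": 2, "v": "y"}]
T = M(S)
assert M_inverse(T) == S          # the M'(T) = S assumption holds here

# Program conversion: a statement referring to S is rewritten to
# refer to M_inverse(T) instead.
count_source = lambda src: len(src)
print(count_source(M_inverse(T)))   # 2
```

A mapping that discards information (e.g., one that drops a field) has no such inverse, which is precisely the restriction on scope that the author notes.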
Presently, the Data Base Program Conversion Task Group
(DPCTG) of the CODASYL Systems Committee is investigating
the application program conversion problem. The group is
looking into various aspects of the problems including
decompilation of COBOL application programs, semantic
changes of data bases and their effects on application
programs, program conversion techniques and methodologies,
etc.
To date, the work on application program conversion is
still very much at the research stage and more progress has
to be made before we can start actual implementation. The
problems of automatic application program conversion are
multitudinous and extremely complex. Current research
indicates that program conversion is possible for some types
of data conversion, but the complexity of program conversion
depends on how drastically the data has been modified.
Further research needs to be undertaken to determine what
can be done automatically, what can be done
semi-automatically, and what cannot be done at all. A fully
automatic tool is hard to achieve. Building semi-automatic
systems or systems which provide aids for manual conversion
would be a more realistic goal.
Current Research Directions. The current research has
uncovered several problems which need to be investigated
further before the implementation of a generalized
conversion tool can be attempted. The following issues are
believed to be important for future research:
Semantic Description of Data Base and Application Programs.
Based on the work by Su and Liu [PT13] and the
study of the DPCTG group, it is quite clear that a program
conversion system would need more information about the
semantic properties of the source and target data bases than
the information provided by the schemas of the existing
DBMS. Semantic information of the data bases is an
important source for determining the semantics of
application programs, which is the real bottleneck of the
application program conversion problem. Future research
needs to be conducted to 1) model and describe the semantics
of application programs, 2) study the meaningful semantic
changes to data bases and their effects on application
programs, and 3) derive transformation rules for program
conversion which account for the meaningful changes.
Several existing works on data base semantics
[M11,SM1,2,3,DL29] may provide a good basis for future works
on this subject.
Equivalency of Source and Target Programs. Data
conversion may alter the semantic contents of the source
data base. A converted application program may or may not
perform the identical operation on the target data as the
source program does on the source data. For example, it may
not retrieve, delete, or update the same data as the source
program, because some records may be deleted and relations
may have been changed in data conversion. It is not clear
at all how we can prove, in general, that a target program
generated by a conversion system still preserves the
original intent of the source program. Naturally, if the
source data can be reconstructed from the target data
without losing the original data relations and occurrences,
we can establish the equivalence of the source and target
programs based on the same effects they have on the source
and target data.
Decompilation. Program conversion via decompilation is a
technique whereby a data base application program is
transformed into an operationally equivalent higher order
language or an abstract representation and then returned to
a usable language level in a converted form. The
transformation to a higher order language level is a
decompilation process, and the process of returning the
program to an appropriate lower language level is a
compilation process. The concept is that the decompilation
process should produce a functionally equivalent model of
the program that retains the same "intent" while being
unrelated to the first order environmental conditions (the
DBMS, data model, and data structure) under which it was
compiled. The changed environmental conditions can then be
easily incorporated into the program during the process of
compiling the program back into a form appropriate to the
new system.
Some researchers think that this would be the preferred
method to effect DML/host language program conversion. It
should avoid many of the efficiency/restriction drawbacks
inherent in current automated methods, while being more cost
effective and less error prone than current manual methods
(e.g., program rewrite).
One likely disadvantage to this method is that in order
to use it to convert existing data base application programs,
the programs may have to first be manually altered to place
DML related code in a structured format. This disadvantage
is to be expected because of the ambiguity inherent in the
organization of DML/host language programs. However, the
development of structured programming templates designed for
DML related code should provide a means for creating
programs that are convertible by the decompilation method.
Structured templates might also provide the needed insight
toward the development/selection of an appropriate high
level language into which programs can be compiled. Some
initial concepts of data base program templates have been
proposed by the University of Michigan [PT10].
Conversion Aids. A system which provides assistance to
conversion analysts would seem to be a practical tool and a
feasible task. Given the information about data changes and
semantics of the data, a system can be built to analyze
application programs to 1) identify and isolate program
segments which are affected by the data changes, 2) detect
inefficient code in the programs, 3) produce a program
execution profile [GG1] which gives an estimate of the
computation time required at different parts of the program,
and 4) detect, in some cases, the program code which depends
on the programmer's assumption of data values, ordering of
records, record or file size, etc. The data obtained in 1,
together with some on-line editing and debugging aids, would
speed up the manual conversion process. The data obtained
in 2 and 3 would be useful for producing more efficient
target programs, and the data obtained in 4 would help the
conversion analyst to eliminate the "implicit semantics" in
programs which makes the program conversion task (manual or
automatic) extremely difficult. A more complete cross-
referencing than that usually produced by today's compilers
can assist the conversion analysts in identifying
ramifications of changes to programs. An example of such a
product is the Data Correlation and Documentation system
produced by PSI-TRAN Corporation. One technique, sometimes
used during a conversion process that has been initiated by
a data base structure change, is to alter the names of
affected data base items in the DDL only and use errors
generated by the compiler to locate those program segments
needing changes. A more complete cross-referencing system
would be a much better tool, if it were available.
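The first of these aids, isolating the program segments affected by data changes, can be sketched as a crude cross-referencer. Everything in the fragment below is hypothetical (the changed item names and the toy "program" lines are invented); it only illustrates the kind of analysis such an aid would perform.

```python
# Sketch of a conversion aid: flag program lines that reference data
# items whose definitions changed (all names hypothetical).

changed_items = {"CUST-NAME", "ORDER-DATE"}   # items altered in the DDL

program = [
    "MOVE CUST-NAME TO PRINT-LINE",
    "ADD 1 TO REC-COUNT",
    "IF ORDER-DATE > LIMIT-DATE GO TO REJECT",
]

def affected_lines(lines, items):
    """Return (line number, text) pairs that mention a changed item."""
    hits = []
    for n, text in enumerate(lines, start=1):
        if any(item in text.split() for item in items):
            hits.append((n, text))
    return hits

for n, text in affected_lines(program, changed_items):
    print(n, text)
```

Note that this catches only explicit references; the "implicit semantics" discussed above (e.g., a loop bound that silently encodes a repeating-group size) would escape such a scan, which is why the fourth class of detection is the hard one.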
Optimization of Target Program. As the result of data
conversion, multiple access paths to the same data may
occur. This is because redundant data may be introduced or
new access paths may be added in the course of data
conversion. In this situation, a conversion system will
have the choice of selecting a path to generate the target
program. The efficiency of the program during execution
time may depend on the selection of optimized access paths
during program conversion. Also, for reasons of achieving
generality, some program conversion techniques proposed
[PT13,14] convert small segments of programs or the
equivalent of DML statements separately. It is necessary to
do a global optimization or simplification to improve the
converted program. Techniques for program optimization
related to program conversion need to be investigated.
6.2.3 Prototype Conversion Systems Analysis. This section
analyzes the state of the art of generalized data conversion
systems. It summarizes what has been learned in the various
prototypes. The prototypes have yielded encouraging
results, but some weak points have also emerged. A section
below lists some questions that remain to be answered and
comments on additional features that will be necessary to
enhance usability. The following section on Analysis of
Architecture analyzes some implementation issues which can
affect the cases where a generalized conversion system can
be applied.
Where Do We Stand. The prototype systems described in
Section 5.7.2 have been used in a few conversions. While
some of these tests were made on "toy files," a few of the
tests involved data volumes from which realistic performance
estimates can be extrapolated. This section will summarize
the major tests that were done with each of the prototypes.
The Penn Translator The translator developed by Ramirez
at the University of Pennsylvania [DT3,6] processes single
sequential files to produce single sequential target files.
Facilities exist for redefining the structure of source file
records, reformatting and converting accordingly.
Conversion of the file can be done selectively using user-
defined selection criteria. Block size, record size, and
character code set can be changed, and some useful data
manipulation can be included.
The translator was used in several test runs on an
IBM/370 Model 165. The DDL to generated PL/1 code expansion
ratio was 1:4, so coding time was reduced.
A further test of the Penn Translator was conducted by
Winters and Dickey [TA1]. An experiment was conducted
comparing a conventional conversion effort against the Penn
Translator (slightly modified). Two source files stored on
IBM 1301 disks under a system written for the IBM 7080 using
the AUTOCODER language were converted to two target files
suitable for loading into IMS/VS. Much of the data was
modified from one internal coding scheme to another. The
conversion required changing multiple source files to
multiple target files.
The conventional conversion took seven months versus
five months for the generalized approach, a productivity
improvement of roughly thirty percent. Time for adapting
the translator, learning the DDL, and adapting to a new
operating system is included in the five month figure.
Without these, an estimate of three months was made for the
conversion using the generalized approach.
The SDC Translator The translator described in [R3,4]
was implemented during 1975-1976. The translator could
handle single, hierarchical files from any of three local
systems--TDMS, a hierarchical system which fully inverts
files; DS/2, a system which partially inverts files; and
ORBIT, a bibliographic system which maintains keys and
abstracts. Data Bases were converted from TDMS to ORBIT,
from TDMS to DS/2, and vice-versa, and from sequential files
to ORBIT. TDMS files were unloaded using an unload utility.
Target data bases were loaded by target system load
utilities.
The total effort for design and implementation was
about three man-years. The system was implemented in
assembly language on an IBM/370 Model 168, and occupied
about 40 K-bytes, not including buffer space which could be
varied. The largest file tested was on the order of 5
million characters and the total conversion time was about 1
minute of CPU time per 2.5 megabytes of data.
The work was discontinued in 1976.
The Honeywell Translator The prototype file translator
developed at Honeywell by Bakkom and Behymer [DT11]
performed file conversions (one file to one file) among
files from IBM, Honeywell 6000, Honeywell 2000, Honeywell
8200 sequential and indexed sequential files. Data types of
fields could be changed as well as field justification and
alignment. New fields could be added to a record and fields
could be permuted within a record. File record format
(fixed, variable, blocked, etc.) could be changed, and a
compare utility was available for checking the consistency
of files with different field organizations and encodings.
Tests of up to 10,000 records were run. Performance of 15
milliseconds per record was typical (Honeywell Series 6000
Model 6080 computer). The prototype has been used in a
conversion/benchmark environment but has not been offered
commercial ly .
The Michigan Translator Version IIB, Release 1.1 of the
Michigan Translator was completed for the Defense
Communications Agency in October 1977 [UT16]. It offers
complete conversion/restructuring facilities for users of
Honeywell sequential, ISP, or I-D-S/I files. Up to five
source data bases of any type may be merged, restructured or
otherwise reorganized into as many as five target data
bases, all within a single translation. Data Base
description is accomplished by minor extensions to existing
I-D-S DDL statements. Restructuring specification is easily
indicated via a high level language. Tests performed to
date included a conversion of a 150,000 record I-D-S/I data
base with a total elapsed time of 24 hours (500 milliseconds
per record). A given translation can be broken off at any
point to permit efficient utilization of limited resources
and also protect against system failures. The user is
provided with the capability of monitoring translation
progress in real time.
XPRS Test cases with the XPRS system have focussed on
functionally duplicating earlier real conversions done by
conventional methods. Several cases have been programmed.
Each case involved at least two input files. Generally,
there was a requirement to select some instances from one
file, match with instances in another file, eliminate some
redundant or unwanted data, and build up a new hierarchical
structure in the output. In several cases there was a need
for conditional actions based on flags within the data. In
all cases, the XPRS languages were found to be functionally
adequate to replicate the conversion. A productivity gain
of at least fifty percent in total analysis, coding, and
debugging time was achieved. Test runs were conducted on
several thousand records. Performance was deemed adequate
in that XPRS can restructure data at least as fast as it can
be delivered from direct access storage. No detailed
performance comparisons were made comparing XPRS-generated
programs with custom written programs.
Questions Remaining To Be Answered. Given that several
prototype data translation systems are operational in a
laboratory environment, there is little question
concerning the technical feasibility of building generalized
systems. The remaining questions pertain to the use of a
generalized system in "real world" data conversions
involving a wide variety of data structures, very large data
volumes, and significant numbers of people. Three major
questions to be resolved are:

1. Are the generalized systems functionally complete enough to be used in real conversions, and if not, what will it take to make them functionally complete?
2. Can the people involved in data conversions use the
languages? What additional features are necessary
to enhance usability?
3. Overall, what is the productivity gain available with the generalized approach?
Within the next year, prototype systems will be
exercised on a variety of real-world problems in data
translation, and concrete answers to these questions should
be available. The systems being further tested for cost-
effectiveness are the Michigan Data Translator, the IBM XPRS
system, and the Bell Laboratories ADAPT system.
To date, preliminary results have been promising. A
significant sample size on which to do analysis of
productivity gain should be available at the end of the year
of testing.
A number of factors must be taken into account in measuring the cost-effectiveness of the generalized data translator versus the conventional conversion approach. These factors include:
ease of learning and using the higher level
languages which drive the generalized translators;
availability of functional capability to accomplish
real-world data conversion applications within the
generalized translators;
overall machine efficiency;
correctness of results from the conversion;
ability to respond in timely fashion to changes in
conversion requirements (conversion program
"mai ntenance" ) ;
debugging costs;
ability to provide "bridge back" of converted data
to old applications;
ability to provide verification of correctness of
data conversion;
capabilities for detection and control of data
errors.
The languages used to drive generalized data
translators are high-level and non-procedural; they provide
a "user-friendly" interface to the translators. Since the
languages are high-level, programs written in them have a
better chance of being correct. Experience to date with
DEFINE and CONVERT, the languages which drive XPRS, has
shown that users can learn these languages within a week; it
has also shown that some practice is necessary before users
start thinking about their conversion problem in non-
procedural rather than procedural terms.
In early test cases, the languages which drive
generalized data translators have been found to be
functionally adequate for many common cases. In those cases
lacking a feature, a "user hook" facility is often provided.
However, forcing a user to revert to a programming language
hook defeats the purpose of the high level approach, and
interfacing the hook to the system requires at least some
knowledge of system interfaces. Thus, high level languages
must cover the vast majority of the cases in order to
succeed; otherwise, users will perceive little difference
over conventional approaches.
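A sketch of the hook idea, with an invented specification syntax: the declarative specification covers the common transformations, and one field escapes to site-written code.

```python
SPEC = {                 # target field -> (source field, action)
    "name": ("NAME", "strip"),          # handled by the high level language
    "grade": ("GRADE", "hook"),         # beyond the language; user hook
}

def user_hook(field, value):
    """Site-written escape code for the one unsupported transformation."""
    if field == "grade":
        return {"A": 4, "B": 3, "C": 2}[value]
    return value

def convert(rec, spec=SPEC):
    out = {}
    for target, (source, action) in spec.items():
        value = rec[source]
        out[target] = (value.strip() if action == "strip"
                       else user_hook(target, value))
    return out
```

Writing `user_hook` requires exactly the knowledge of system interfaces the text warns about, which is why the declarative part must cover the vast majority of cases.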
Facilities for detecting and controlling data errors in
the generalized systems are very important, and most of the
prototypes do not yet do a complete job in this area.
However, the generalized packages offer an opportunity for
generalized, high level methods for dealing with data errors
during conversion, and it could well be that once these
error packages are developed, they will contribute to even
larger productivity gains than have been experienced to
date.
The high-level language approach to driving generalized
translators should provide the ability to respond to changes
in conversion requirements with relative ease. Since large
conversions often take one or more years, it is not unusual
for the target data base design to change or for new
requirements to be placed on the conversion system. In
other words, in a large conversion effort, the programs are
not as "one shot" as is commonly believed. In large
conversions, the savings in conversion program maintenance could be significant.
Generalized systems can also be used to map target data
back to the old source data form, assuming the original
conversion was information-preserving. This capability
provides a means for verifying the correctness of the data
conversion. In addition, this capability can be used as a
"bridge back" to allow users to continue to run programs
which have not yet been converted against the data in the
old format. Using a generalized system in this way allows
phased conversion of programs without impacting user needs
during the conversion period.
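The verification use of the reverse mapping can be sketched as a round-trip check. The two record formats and mapping functions are invented; the check only works when the forward conversion is information-preserving, as the text notes.

```python
def forward(rec):
    # Invented conversion: two source fields fused into one target field.
    return {"name": rec["first"] + " " + rec["last"]}

def backward(rec):
    # The inverse mapping; also usable as a "bridge back" for old programs.
    first, last = rec["name"].split(" ", 1)
    return {"first": first, "last": last}

def conversion_verified(source):
    """Map every converted record back and compare with the source."""
    converted = [forward(r) for r in source]
    return [backward(r) for r in converted] == source
```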
In an environment where a generalized translator is
used regularly as a tool for conversion, costs associated
with the debugging phase should be decreased. Common,
debugged functional capabilities will be utilized, whereas
it is unusual in the conventional approach for common
conversion modules to be developed. Thus, each new
conversion system requires debugging.
Usability. The usability of generalized data translation
systems must also be evaluated. Experience to date
indicates that the languages are easy to learn and use.
However, it would be wrong to think that these prototypes
are mature software products or that they can be used in all
conversions. This section discusses some of the unanswered
questions with respect to usability of the current data
conversion systems.
One question concerns the level of users of the
generalized languages. Current prototypes have been used by
application specialists and/or members of a data base
support group. The systems have not yet been used by
programmers, and the question remains whether programmers
(as opposed to more senior application specialists and
analysts) will be able to use the systems productively.
There is no negative data on this point; the systems have
not been used widely enough.
At present, all the systems require a user to describe explicitly the source data to be accessed by the read step using a special data description language. These data description languages are generally easy to learn and use; they resemble statements in the COBOL Data Division. However, the writing of the description is a manual process which can be tedious because a person may have to describe a file with hundreds of fields. Ideally, a data conversion system should be able to make use of an existing data description, such as those existing in a data dictionary or a system-COBOL macro library. As evidenced by the Michigan Data Translator [DT16], it is reasonable to expect that such an interface will be available as data conversion systems evolve. Note, however, that a data dictionary or COBOL macro library link may not necessarily solve the problem.

Data in current systems is not always fully enough defined to be converted. This is especially true with non-data base files. In these files, data definition often is embedded in the record structures of the programs, and a full definition depends on a knowledge of the procedural program logic. Even with existing data bases, some fields and associations may not be fully defined within the system data base description. Thus, the user can expect a certain amount of manual effort in developing data definitions. If existing documentation is incomplete, this can be a time consuming task, though it probably must be done regardless of whether a generalized package is used or not.
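A minimal sketch of such a data description, resembling a COBOL record layout reduced to name/width/converter triples (the fields and widths are invented):

```python
# Invented fixed-width layout, playing the role of a data description.
LAYOUT = [            # (field name, width in characters, converter)
    ("emp_no", 5, int),
    ("name", 10, str.strip),
    ("salary", 7, int),
]

def read_record(line, layout=LAYOUT):
    """Drive the read step from the declarative layout table."""
    rec, pos = {}, 0
    for name, width, conv in layout:
        rec[name] = conv(line[pos:pos + width])
        pos += width
    return rec
```

Hand-writing such a table for a file with hundreds of fields is exactly the tedium the text describes; generating it from a data dictionary or COBOL macro library would remove the manual step.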
Another area where a user may have to expend effort is
in the unload step of the data conversion process. The data
description languages used to drive the read step have a
limited ability to deal with data at a level close to the
hardware (e.g., pointers, storage allocation bit maps,
etc.). Generally one assumes that a system utility program
can be used to unload source data and remove the more
complex internal structures. Another alternative is to run
the read step on top of an existing access method or data
base management system with the accessing software removing
the more complex, machine dependent structures. While
acceptable alternatives in a great many environments,
including most COBOL environments, some cases may exist
where neither approach will work. For example, a
load/unload utility may not exist, or a file with embedded
pointers which was accessed directly by an assembly language
program might not be under the control of an access method.
For these cases, the user is faced with complexity during
the unload step. The complexity associated with accessing
the data would appear to be a factor for either the
conventional methods or for the generalized approach.
However, in cases such as those above, some special purpose
software may have to be developed. Note that some research [SL8] has examined the difficulty of extending data description languages to deal directly with these more complex cases and has concluded that providing the data description language with capabilities to deal with more complex data structures greatly complicates the implementation and has an adverse effect on usability. Thus, special purpose unload programs will continue to be required to deal with some files.

Analysis of Architectures. This section discusses some of the different approaches that have been taken in implementing the prototype data conversion systems. The objective is to analyze some of the performance and usability issues raised by the prototypes.
Two approaches have been used in the prototypes--a generative approach in the Penn Translator and XPRS, and an interpretive approach in the Michigan Data Translator. In the generative approach, a description of the input files, output files, and restructuring operations is fed to a program generator. From these descriptions, special purpose programs are generated to accomplish the described conversion. In both the Penn Translator and XPRS, PL/1 is the target language for the generator. The generated PL/1 programs are then compiled and run. In the interpretive approach, tables are built from the data and/or restructuring description. These tables are then interpreted to carry out the data conversion.
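The architectural difference can be sketched on a toy description. Neither function is the actual Penn, XPRS, or Michigan code; Python source merely stands in for the generated PL/1.

```python
# A toy conversion description: target field name and item type conversion.
DESCRIPTION = [("id", int), ("amount", float)]

def interpret(rec, desc=DESCRIPTION):
    """Interpretive approach: walk the description table for every record."""
    return {name: conv(rec[name]) for name, conv in desc}

def generate(desc=DESCRIPTION):
    """Generative approach: emit a special purpose program once, then run it."""
    body = ", ".join("'%s': %s(rec['%s'])" % (n, c.__name__, n)
                     for n, c in desc)
    source = "def convert(rec):\n    return {%s}" % body
    namespace = {}
    exec(source, namespace)      # "compile and run" the generated program
    return namespace["convert"]
```

The generated program pays the table-walking cost once, at generation time, which is why the text expects generation/compilation to dominate.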
In data conversion systems, as in other software, an
implementation based on interpretation can be expected to
run considerably more slowly than one based on generation
and compilation. Initial experience with prototype data
translators has shown that there is much repetitive work,
strategies for which can be decided at program
compilation/generation time. Also, there is a good deal of
low level data handling, such as item type conversions.
Thus, those implementations based largely on an interpretive
approach run more slowly, and the ability to vary bindings
at run time does not appear to be necessary. Interpretation
was chosen in the prototypes for ease of implementation, and
in the future it can be expected that a compilation-based
approach or a mixture of compilation with interpretation
will be the dominant implementation architecture. However,
for medium scale data bases, the machine requirements of the
interpretive data conversion prototypes are not
unreasonable, and overall productivity gains are still
possible.
Performance measurements with conversion systems based on the generative approach indicate that generalized systems can be quite competitive with customized programs. In one case, the program generated by the data conversion system ran slightly faster than a "customized" program which had been written to do the same job. However, this example could well be the exception and it would be naive to expect this in general. The reason generalized packages can be competitive is that they often have internal algorithms which can plan access strategies to minimize I/O transfer and/or multiple passes over the source data. "Customized" conversion programs written in a conventional programming language often are not carefully optimized, since the expectation is that the programs will be discarded when the conversion is done.
A second architectural difference involves the use of
an underlying DBMS or not. In both the Penn Translator and
XPRS, the generated PL/1 program, when executing, accesses
sequential files, performs the restructuring, and writes
sequential files. On the other hand, the Michigan Data
Translator functions as an application program running on a
network structured data base management system. Thus, the
interpreter makes calls to the underlying DBMS to retrieve
data during restructuring and puts restructured data into
the new data base.
The two approaches offer different tradeoffs. For
example, the Michigan Data Translator can make use of the
existing extraction capabilities of a DBMS and perform
partial translations easily. In addition, since it operates
directly within the network data model, a user does not have
to think of "unloading" data to a file model and then
reloading it back; rather, the user describes a network to
network restructuring much more directly.
On the other hand, when converting non-data base data to a data base, the use of an underlying DBMS as part of a data translator implies a second order data conversion problem--the non-data base data must be converted into the DBMS of the data conversion system, which may or may not be difficult. It can be difficult, for example, when the data model of the data being converted differs significantly from the data model of the DBMS upon which the conversion system is based. Also, the use of an underlying DBMS may require more on-line storage, whereas the file oriented conversion systems can be made to run tape-to-tape. This can be important in very large data base conversions.
In the future one can expect that data conversion systems will offer a variety of interfaces to accommodate various kinds of conversion situations. For example, it is possible to interface the "file-oriented" conversion systems to run as application programs on top of existing data base management systems. It is also possible to develop "reader programs" to load non-data base data into conversion systems based on a DBMS. In addition, more automated interfaces to data dictionary packages can be expected in order to improve usability and obviate the need for multiple data definitions.
One possible performance problem with generalized
conversion systems lies in the unload phase. For reasons of
usability, generalized conversion systems usually rely on an
unload utility program to access the source data, thus
isolating the conversion package from highly system specific
data. A potential problem with this approach is that the
unload package may not make good use of existing access
paths or may tend to access the source data in a fashion which assumes that the data has recently been reorganized (with respect to overflow areas, etc.). In cases where the data is badly disorganized, a customized unload program which accessed the data at a lower level might run considerably faster, and for very large data bases might be the only feasible way to unload the data. It is not clear how common this case is, and one can usually make the argument that the "special" unload software could be interfaced to the generalized package. However, from a practical standpoint, the unloading phase on a very large, badly disorganized data base is a performance unknown, and more sophisticated unload utilities may have to be developed as part of the generalized packages.
Summary. Detailed performance and productivity figures for major conversions should be available in about one year. Expectations are that machine efficiency of the generalized packages based on a generation/compilation approach will be acceptable (no worse than a factor of 2) when compared with conventional conversion programs. Additional enhancements to improve usability can be expected, especially in the areas of data error detection and control and interfaces to data dictionary software. If the savings in conversion program analysis and coding times--often fifty percent or more--are confirmed, then the generalized conversion systems will be ready for extensive use.
6.3 OTHER FACTORS AFFECTING CONVERSION
In this section we look at the conversion problem from
two aspects. First, we address the question--What can we do
today to lessen the impact of a future conversion? Second,
we look to the future to see what effects future technology
and standards will have on the conversion process.
6.3.1 Lessening the Conversion Effort. In order to identify
guidelines for both reducing the need for conversion and for
simplifying conversions which are required, one must
consider the entire application software development cycle
because poor application design, poor logical data base
design, inadequate use of a DBMS, and inappropriate DBMS selection
could each lead to an environment which may prematurely
require an application upgrade or redesign. This redesign
could, in many cases, require a major data base conversion
effort.
The set of guidelines specified below is not intended
as a panacea. Instead, it is meant to make designers aware
of strategies which make intelligent use of current
technology. It is doubtful that all conversions could be
avoided if a project adhered strictly to these proposed
guidelines. However, adherence to the principles set forth
by these guidelines could certainly reduce the probability
of conversion and, more importantly, simplify the
conversions that are required.
With respect to application design and implementation,
the more the application is shielded from system software
and hardware implementation details, the easier it becomes
for a conversion to take place. For example, a good
sequential access method hides the difference between tapes,
disks, and drums from the application programs which use the
access method.
The logical data base design should be specified with a
clear understanding of the information environment. A good
logical data base design reduces the need to restructure
because it actually models the environment it is meant to
serve. Introduction of data dependencies in the data
structure should, if possible, be kept to a minimum. An
analysis of the tradeoffs between system performance and
likelihood of conversion should definitely be made.
Selecting the wrong or non-optimal data base management system, given the application requirements, is also a key problem which can lead to unnecessary and large conversion efforts. The prospective user of a DBMS should, for example, carefully evaluate the data independence characteristics of a proposed DBMS.
The underlying principle of the guidelines which follow
is that decisions can be made at the system design and
implementation stages which are crucial to the stability of
the applications.
Application Design Guidelines.

Requirements Analysis. Many of the decisions made during the requirements analysis stage of system development affect the long-term effectiveness of the application system (data base design as well as application programs). Questions such as what functions does the application require, who will the data base serve and how will they use the data base, what are the possible future uses of the data, and what are the performance constraints of the application are answered at this stage. It is essential that the designer understand the information environment as much as possible at the outset in order to lessen the probability that frequent conversions will be necessary.
Requirements analysis should focus on information needs and should minimize constraints being imposed by the physical environment since it can distort the designer's view of the application system's true objectives. The influence of the physical environment should be considered secondarily, in order that the designer be fully aware of the resulting compromises to the logical requirements. This is not intended to imply that consideration of the physical environment is unimportant. Indeed, if the physical environment is ignored, the effect could be development of a set of requirements that are impossible to meet within existing physical and cost constraints.
Program Design Guidelines. Three underlying principles motivate this discussion of application program design. They are:

design for maintainability

design for the application

data independence
Keeping sight of all of these during the design of the
application program will lessen conversion effects by
rendering the application as free as possible from physical
considerations.
Designing for maintainability implies that the application should be written in a high-level language with a syntax that permits good program structure. Structured programming techniques such as top-down program design and implementation should be used throughout. The system should be modular with relatively small, functionally oriented programs. The programs should all be well commented and organized for readability. Design reviews and program walkthroughs also help to expose errors in the overall design and "holes" in the application logic at an early stage. It has been well documented that these steps help ease making program modifications.
One error which is often made in designing programs in
a DBMS environment is to let the capabilities of the DBMS
drive the design rather than the application. This design
error can yield programs which are unnecessarily dependent
upon the features of a specific DBMS. For example, in
System 2000 one can use a tree to represent a many-to-many
relationship instead of using the LINK feature. The
parent/child dichotomy that results is an efficient but
arbitrary contrivance that cannot easily be undone later on.
The key principle here is to concentrate on what results are
desired rather than on the implementation details of
achieving these results. Simplicity and generalization of
the design will provide a very high level of interface to
the application programmer which will, in turn, minimize the
total amount of software, provide the greatest degree of
portability, maintainability, device independence, and data
independence.
Of extreme importance in program design is the notion of data independence; i.e., insulating the application program from the way the data is physically stored.
Layered Design. That is, designing the application as a series of layers, each of which communicates with the system at a different level of abstraction. One can visualize this as an "onion," with hardware as its core and layers of successively more sophisticated software at the outer layers. The user interacts with the outermost skin of the onion, at the highest level of abstraction.
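A minimal sketch of the onion, with three invented layers; the application layer never names the device, so swapping the core touches nothing above the access method.

```python
import json

class Device:
    """Innermost layer: raw block storage (could be tape, disk, or drum)."""
    def __init__(self):
        self.blocks = {}
    def write_block(self, key, block):
        self.blocks[key] = block
    def read_block(self, key):
        return self.blocks[key]

class AccessMethod:
    """Middle layer: hides device particulars behind put/get on records."""
    def __init__(self, device):
        self.dev = device
    def put(self, key, record):
        self.dev.write_block(key, json.dumps(record))
    def get(self, key):
        return json.loads(self.dev.read_block(key))

class Application:
    """Outermost skin: works only at the highest level of abstraction."""
    def __init__(self, access_method):
        self.am = access_method
    def hire(self, eno, name):
        self.am.put(eno, {"eno": eno, "name": name})
    def lookup(self, eno):
        return self.am.get(eno)["name"]
```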
If application programs are written at the outermost layers of the onion, then these programs are smaller, easier to understand and, therefore, easier to modify or convert than programs written at lower layers. For example, introduction of a new mainframe will require conversion of the software which references the particulars of the mainframe. However, since the layers are constructed so that physical machine and device independence is realized above some level, only the software below that level is subject to modification. To the extent that application programs stay at the outermost layers (i.e., above the critical layer), reduced conversion effects can be achieved.
We can thus summarize the goals of program design as follows:

to provide the highest possible application interface to the program

to maximize program independence from the characteristics of the mainframe, peripherals, and data base organization
to maximize portability of the application program
through the use of high-level languages
to maintain a clean program/data interface
Programming Techniques
The previous sections of this chapter have focused on
the design decisions which should be made to alleviate the
conversion problem. However, regardless of how noble these
goals are, poor implementation decisions can go a long way
towards diminishing the returns of a good design. Equally
important to intelligent design is a set of programming
techniques and standards which prohibit programmers from
introducing dependencies in code. For example, a "clever"
programmer may introduce a word size dependency in a program
by using right and left shifts to effect multiplication and
division. Of course, there are no hard and fast safeguards
against using tricky coding techniques; an effort must be
made to make the programmer conscious of the consequences of
this kind of coding. In particular, a programmer should not
be allowed to jump across layers of the onion, such as using
an access method to read or write data bases directly.
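The word-size dependency mentioned above is easy to demonstrate. The 16-bit word below is an assumption for illustration; the shift "multiply" silently truncates as soon as the product overflows the word.

```python
WORD_BITS = 16                                 # assumed machine word size

def times_four_by_shift(x, bits=WORD_BITS):
    """'Clever' multiplication by 4 using a left shift, with the result
    truncated to the word exactly as the hardware would truncate it."""
    return (x << 2) & ((1 << bits) - 1)
```

The same source moved to a machine with a wider word gives different answers for large operands, which is precisely the kind of hidden dependency a programming standard should prohibit.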
Data Base Design . Perhaps the most costly mistake a
designer can make is an error in the data base design
because it has a direct effect on the information that is
derivable and the application programs that are created.
Incorrect or unanticipated requirements can lead either to
information deficient data bases or overly complex and
general design. An inadequate logical design has the
potential for complex user interfaces or extremely long
access time. A poor physical design can lead to high
maintenance and performance costs. Unfortunately, data base
design is still an art at the present time. Two surveys
report the results in the area to date. Novak and Fry
[DL26] survey the current logical data base design
methodology and Chen and Yao [DL34] review data base design
in general. The work of Bubenko [DL31] in the development
of the CADIS system and the abstraction and generalization
techniques of Smith and Smith [DL29,30] show promise.
An accurate logical design can still be unnecessarily
data dependent. Dependencies are inadvertently or
deliberately introduced in the interest of improving system
performance. In essence, "purity" is compromised to gain
processing efficiencies. Since optimization is a worthwhile
goal, insisting on absolute purity may be unreasonable.
However, the data base designer should at least be aware of
contrivances and, therefore, be in a position to evaluate
the relative effects a design decision may have. Designers
should become sensitive to their decisions by asking: "How
will the data model be affected by a future change in
performance requirements? Have I done a reasonable job in
insulating applications from data structure elements that
are motivated strictly by performance considerations?"
Some examples of induced data dependencies in logical
data base design which may impact upon conversion are:
The use of owner-coupled sets in DBTG to implement
performance-oriented index structures or orderings
on records.
Storing physical pointers (or data base keys) in an
information field of a record.
. Combining segment types (in DL/1) to reduce the
amount of I/O required to traverse a data base.
DBMS Utilization and Selection. Selection of a DBMS can have a major impact on conversion requirements. Of importance in evaluating a DBMS is to consider products exhibiting the highest level user interface.
A high level DBMS is characterized by both a powerful set of functions and a high degree of data independence from the point of view of the application. With respect to functions, that is, the DML, the distinction between "high level" and "low level" has traditionally centered on whether the DBMS provides user operations on sets of records (select, retrieve, update, or summarize all the records or tuples which satisfy some conditions) or whether one is restricted to record-at-a-time processing ("navigation"). The DBMS with the "high-level" set operation approach is significantly more desirable than the navigational record-by-record approach.
DBMS prospects should evaluate the data independence
characteristics of a proposed product. Systems are
preferred which support an "external schema" or "subschema"
feature which permits the record image in the application
program (the user work area) to differ significantly from
the data base format. However, the subschema concept is
only one aspect of data independence. In general, it is
necessary to determine in what ways and to what extent the
application interface is insulated from performance or
internal format options. For instance, will programs have
to be modified if:
a decision is made to add or delete an index?
the amount of space allocated to an item is
increased or decreased?
chains are replaced by pointer arrays?
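The insulating role of a subschema can be sketched as a mapping table between stored items and the record image seen by the program. The stored format, item names, and internal pointer field are invented for the example.

```python
# Invented stored format, including an internal field the program never sees.
STORED = {"EMPNO": "00042", "DEPT": "D1", "SAL": "0031500", "_chain": 7}

SUBSCHEMA = {            # application item -> (stored item, converter)
    "emp_no": ("EMPNO", int),
    "salary": ("SAL", int),
}

def user_work_area(stored, subschema=SUBSCHEMA):
    """Build the application's record image from the data base format."""
    return {item: conv(stored[name])
            for item, (name, conv) in subschema.items()}
```

Adding an index, widening an item, or replacing chains with pointer arrays changes only the stored side of the mapping table, not the programs, which is the insulation the questions above are probing for.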
Other conversion related questions about DBMS products
include the following:
Are there adequate performance and formatting
alternatives? Are there too many (i.e.,
unproductive or incomprehensible) tuning options?
Are there adequate performance measurement
techniques and tools to guide the exercise of these
choices?
Does the system automatically convert a populated
data base when a new format option is selected?
Aside from tuning, does the DBMS gracefully
accommodate at least simple external changes such as
adding or deleting a record or item type?
Are there other useful high level facilities
associated with or based on the DBMS, such as a
report writer, query processor, data dictionary,
transaction monitor, accounting system, payroll
system, etc.?
Is there a utility for translating the data base
into an "interchange form;" i.e., a machine
independent, serial stream of characters?
Is the vendor committed to maintaining the product
across new operating system and hardware
releases/upgrades? Conversely, is the vendor
prepared to support the product on older releases of the operating system, so the user will not be forced to upgrade?
What hardware environments are currently supported
and what is the vendor's policy regarding conversion
to another manufacturer's mainframe?
What programming language interfaces are available?
Can the same DBMS features be used if there is a
migration, say, from COBOL to PL/1?
How intelligent is the system's technique for organizing data on the media? Specifically, will performance deteriorate at an inordinate rate as updating proceeds? How often will reorganization (cleanup) be required? Does the DBMS have a built-in reorganization utility? How does the user determine the optimal time to reorganize?
Are the language facilities and data modeling
facilities of the DBMS adequate for the anticipated long
term requirements of the enterprise? What is the
risk of having to convert to a new DBMS?
Likewise, are the performance characteristics and
internal storage structure limitations adequate to
meet the long term requirements (response times, data base sizes) of the enterprise?
Are there facilities to assist the user in converting data from a non-DBMS environment or from another DBMS? For instance, can a data base be loaded from one or more user defined files?
6.3.2 Future Technologies/Standards Impact.
In this section we discuss trends in computer hardware
technologies, DBMS software directions, and standards
development, and consider their impact on data and program
conversion. We intend to make the reader aware of what to
expect in terms of conversion problems rather than give a
complete assessment of future technologies. Therefore we
discuss only technologies and standards that will impact
conversion problems.
The first three parts discuss the areas of hardware, software, and standards and their impact on conversion in some detail. The last part summarizes the major points of our assessment without going into detailed reasoning.
Hardware and Architectural Technologies. The cost and performance of processor logic and memory continue to improve at a fast rate. As a result, overhead costs are more acceptable, especially when such costs save people's time and work, and provide user oriented functions that do not require a computer expert. In particular, one can now think about using generalized conversion tools not only when conversion is required as a result of hardware or software changes but also as a result of a changing application that requires a new, more efficient data base organization. What could have been a prohibitive cost for a data base conversion in the past may not be a major factor in the future.
-145-
At the same time, the cost/performance improvement
contributes to the proliferation of data bases and therefore
accentuates the need for generalized conversion tools. The
more cost effective is the process of accessing and
maintaining data, the more data is collected on computers.
Improvements
in hardware (as well as software) technologies
create more need for data and program conversion. In
addition, the emergence of new technologies, such as
communication networks, add another level of sophistication
to the way that data can be organized and used. Distributed
data bases, where multiple data bases (or subsets of data
bases) may reside on different machines, require tools for
the integration and the correlation of data. Invariably,
data will need to move from system to system dynamically,
possibly moving between different hardware/software systems.
In this environment, generalized tools for dynamic
conversion will become a necessity.
In recent years, two promising approaches to data
management hardware technologies have been pursued. One is
the specialized data management machine and the other is the
backend data management machine. As will be explained next,
both approaches can help simplify the conversion problem.
The specialized data management hardware is based on
the idea of using some kind of an associative memory device,
a device that can perform a parallel access to the data
based on its content. Such a device eliminates the
necessity for organizing the internal structure of a data
base using indexes, hash tables, pointer structures, etc.,
which are primarily used for fast access. As a result, the
data can be essentially stored in its external logical form,
and the data management system can use a high level language
based on the logical data structure only. The conversion
process is simplified since data is readily available in its
logical organization. Referring to the terminology used in
previous sections, the functions of unloading and loading of
the data base can be greatly simplified. Also, no
restructuring will be required because of a change in data
base use, since the physical data base organization can be
to a large degree independent of its intended use. In
addition, the program conversion problem is simplified as a
result of the program interfacing to the DBMS using a high
level logical language.
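The content-based retrieval described above can be sketched in modern terms. The following Python fragment is purely illustrative: the record layout and names are invented, and a sequential scan stands in for the parallel associative hardware the text describes.

```python
# Sketch: content-based (associative) selection against the logical record
# structure only. A real associative memory qualifies all records in
# parallel; here the scan is sequential, but the point is that no index,
# hash table, or pointer structure exists to be rebuilt when the data base
# is converted. All names are illustrative, not from the report.

employees = [  # records stored in their external logical form
    {"name": "Adams", "dept": "payroll", "salary": 18500},
    {"name": "Baker", "dept": "audit",   "salary": 21000},
    {"name": "Clark", "dept": "payroll", "salary": 19750},
]

def select(records, predicate):
    """Content-based retrieval: qualify each record by its content alone."""
    return [r for r in records if predicate(r)]

payroll = select(employees, lambda r: r["dept"] == "payroll")
print([r["name"] for r in payroll])  # ['Adams', 'Clark']
```

Because nothing but the logical record form is stored, unloading such a data base is simply reading it; there is no internal structure to untangle.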
Similar benefits can be achieved if backend machines
are used. A backend machine is a special processor
dedicated to managing storage and data base on behalf of a
host computer. The primary motive for the backend machine
is to off-load the data management function from the host to
a specialized machine that can execute this function at much
lower cost. From a conversion standpoint, the separation of
data management functions from the host promotes the need
for a high level logical interface that provides the
advantages discussed above. Another advantage is that it is
possible to migrate from one host machine to another without
affecting the data bases and their management, alleviating
the need for data conversion if the same backend machine is
used with the new host.
Mass storage devices, such as video disks, make
storing very large data bases, on the order of 10 to the
10th power characters, cost effective. Converting large
data bases of this size compounds the cost considerations
merely by the processing of this large amount of data. As a
result, such data bases will tend to stay in the same
environment for longer periods of time. The use of
specialized data management machines or dedicated backend
machines in conjunction with these mass storage devices can
help postpone the need for data base conversion.
Finally, we should mention the growing use of
minicomputers supporting data management functions. DBMSs
now exist on many minicomputers, with more forthcoming. The
proliferation of minicomputers which support data bases can
only increase the need for generalized conversion tools.
Software Development Trends. Much of the work over the
last years in the data management area has concentrated on
techniques that clearly separate the logical structure of
the data base from its physical organization. This concept,
called "data independence," was introduced to emphasize that
users need not be exposed to the details of the physical
organizations of the data base, but only to its logical
relationships. This led to the development of data access
and manipulation languages that depend on the logical data
model only. The effect of this trend is similar to that of
using specialized data management machines and backend
machines discussed previously; namely, the simplification of
the unload and load functions, since the interface to the
DBMS is provided at the logical level only, and the
simplification in program conversion for similar reasons.
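The data independence idea can be illustrated with a small sketch. The class and record names below are invented, and the two "physical organizations" are deliberately trivial; the point is only that the application is written against the logical interface and survives a physical reorganization unchanged.

```python
# Sketch of "data independence": application code sees only a logical
# interface; the physical organization behind it can change (a sequential
# scan vs. a hashed lookup) without touching the application program.
# Illustrative names only, not from the report.

class LogicalView:
    def get_part(self, part_no):
        raise NotImplementedError

class SequentialStore(LogicalView):          # one physical organization
    def __init__(self, rows):
        self.rows = rows
    def get_part(self, part_no):
        return next(r for r in self.rows if r["part_no"] == part_no)

class HashedStore(LogicalView):              # a different physical organization
    def __init__(self, rows):
        self.index = {r["part_no"]: r for r in rows}
    def get_part(self, part_no):
        return self.index[part_no]

rows = [{"part_no": 10, "desc": "bolt"}, {"part_no": 20, "desc": "washer"}]

def application(db):                         # written against the logical model only
    return db.get_part(20)["desc"]

print(application(SequentialStore(rows)))    # washer
print(application(HashedStore(rows)))        # washer -- no program conversion needed
```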
At the user end of the spectrum, it seems reasonable to
assume that the diversity of data models (network,
relational, hierarchical, and other views that may be
developed in the future) will be required for many more
decades. This is especially true since there are problem
areas that seem to map more naturally into a certain model.
Furthermore, it is often the case that users do not agree on
the same model for a given problem area. Obviously, this
state of affairs only accentuates the need for generalized
conversion tools that can restructure data bases from one
model to another. Even with the development of large scale
associative memories, data structures will likely provide
economic rationales for their continued use. Another
possibility is the use of a common underlying data model
that can accommodate any of the user views. However, this
approach will still require some type of a dynamic
conversion process between the common view and each of the
possible user views.
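As a rough illustration of a common underlying model serving several user views, consider the sketch below. All names are invented; each user view (a flat relational view and a hierarchical view) is derived on demand from one common representation, so "conversion" becomes a dynamic mapping rather than a stored-data translation.

```python
# Sketch of a common underlying data model: one common representation,
# from which each user view is produced dynamically. Names and the
# (department, employee) fact base are invented for illustration.

common = [  # common representation: (department, employee) facts
    ("payroll", "Adams"), ("payroll", "Clark"), ("audit", "Baker"),
]

def relational_view(facts):
    """Flat tuples, as a relational user would see them."""
    return [{"dept": d, "emp": e} for d, e in facts]

def hierarchical_view(facts):
    """Parent records owning lists of children, as a hierarchical user sees them."""
    tree = {}
    for dept, emp in facts:
        tree.setdefault(dept, []).append(emp)
    return tree

print(relational_view(common)[0])            # {'dept': 'payroll', 'emp': 'Adams'}
print(hierarchical_view(common)["payroll"])  # ['Adams', 'Clark']
```

Even in this toy form, the cost the text anticipates is visible: every supported user view needs its own mapping to and from the common model.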
Standards Development. There is much work and
controversy in developing standards for DBMS. Standards
that aim to determine the nature of the DBMS are hard to
bring about, even in a highly controlled environment,
because of previous investment in application software and
data base development, and because of disagreement. For
example, there is still much controversy over whether the
network model proposed by the CODASYL committee is a proper one. It
seems reasonable to assume that there will always be non-
standard DBMSs. Further, even if such a standard can be
adopted, different DBMS implementations will still exist,
resulting in different physical data bases for the same
logical data base. In addition, one can safely assume that
restructuring because of application needs will still be
necessary, and that changes in the standard itself may
require conversion.
A standard that is more likely to be accepted is one
that affects only the way of interfacing to a DBMS. In
particular, from a conversion standpoint, a standard
interchange data form (SIDF) will be most useful. A SIDF is
a format not unlike a load format for DBMSs. Any advanced
DBMS has a load utility that requires a sequential data stream
in a pre-specified format. If a standard for this format
can be agreed upon, and if all DBMSs can load and unload
from and to this format, then the need for reformatting (as
described earlier) is eliminated. The conversion process
can be reduced to essentially restructuring only, given that
unload and load are part of the DBMS function. A
preliminary proposal for such a standard was developed by
the ERDA Inter-Working Group on Data Exchange (IWGDE) [GG2].
However, it is only designed to accommodate hierarchical
structures. Consideration is now being given to the
extension of the standard to accommodate more general
structures (i.e., networks and relations). We believe that
there are no technical barriers to the development of a
SIDF, and that putting such a standard to use would
alleviate a major part of the data conversion process.
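The role of a SIDF can be sketched as follows. The one-record-per-line field format below is invented purely for illustration and is far simpler than the IWGDE proposal; the point is that each DBMS need only unload to, and load from, one agreed sequential format, leaving only restructuring.

```python
# Sketch of conversion through a standard interchange data form (SIDF):
# if every DBMS can unload to and load from one agreed sequential format,
# the system-specific reformatting step disappears. The line format here
# ("key=value" fields joined by "|") is invented for illustration.

def unload(records):
    """Source DBMS: emit a sequential stream in the interchange format."""
    return ["|".join(f"{k}={v}" for k, v in r.items()) for r in records]

def load(stream):
    """Target DBMS: rebuild records from the same interchange format."""
    return [dict(field.split("=", 1) for field in line.split("|"))
            for line in stream]

source = [{"id": "1", "name": "Adams"}, {"id": "2", "name": "Baker"}]
stream = unload(source)          # e.g. 'id=1|name=Adams'
target = load(stream)
print(target == source)          # True: no reformatting step was needed
```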
Summary. The rationale for the points summarized below
appears in the previous parts of this section. We will only
state here our assessment of the impact on conversion
problems.
Hardware development will increase the need for
generalized conversion tools (in particular, the
proliferation of minicomputers, computer networks,
and mass storage devices).

Improved hardware cost/performance will make conversion
costs more acceptable.

Special hardware DBMS machines will simplify the
conversion process (in particular, load and unload
functions, and program conversion) because they
promote interfacing at the logical level.

Software advances will not eliminate the need for
conversion but can simplify the conversion process
in a way similar to DBMS machines.

Multiplicity of logical models is likely to exist,
leading to the need for conversion tools between
models.

Standards will not eliminate the conversion problem.
Even with a standard, the implementations would be
different and non-standard DBMSs will likely exist.

A standard interchange data form, with load and unload
functions provided by each DBMS, would eliminate a major
part of the data conversion process.
BIBLIOGRAPHY

(DL) LOGICAL DATA BASE DESIGN

DL26 NOVAK, D., and FRY, J., "The State of the Art of
     Logical Data Base Design," Proceedings of the Fifth
     Texas Conference on Computing Systems, IEEE, Long
     Beach, 1976.

DL29 SMITH, J. M., and SMITH, D. C. P., "Data Base
     Abstractions: Aggregation and Generalization," ACM
     Transactions on Data Base Systems 2,2 (1977): 105-133.
DL30 SMITH, J. M., and SMITH, D. C. P., "Data Base
     Abstractions: Aggregation," Communications of the
     ACM 20,6 (1977): 405-413.

DL31 BUBENKO, J. A., "IAM: An Inferential Abstract
     Modeling Approach to Design of Conceptual Schema,"
     Proceedings of the ACM-SIGMOD International
     Conference on Management of Data, ACM, N.Y., 1977,
     pp. 62-74.

DL34 CHEN, P. P., and YAO, S. B., "Design and Performance
     Tools for Data Base Systems," Proceedings of the
     Third International Conference on Very Large Data
     Bases, ACM, N.Y., 1977.
(DT) DATA TRANSLATION

DT1  FRY, J. P., FRANK, R. L., and HERSHEY, E. A., III, "A
     Developmental Model for Translation," Proc. 1972 ACM
     SIGFIDET Workshop on Data Description, Access and
     Control, A. L. Dean (ed.), ACM, N.Y., pp. 77-106.

DT2  SMITH, D. C. P., "A Method for Data Translation Using
     the Stored Data and Definition Task Group
     Languages," Proc. of the 1972 ACM SIGFIDET Workshop
     on Data Description, Access and Control, ACM, N.Y.

DT3  RAMIREZ, J. A., "Automatic Generation of Data
     Conversion Programs Using a Data Description
     Language (DDL)," Ph.D. dissertation, University of
     Pennsylvania, 1973.

DT4  MERTEN, A. G., and FRY, J. P., "A Data Description
     Approach to File Translation," Proc. 1974 ACM SIGMOD
     Workshop on Data Description, Access and Control,
     ACM, N.Y., pp. 191-205.
DT5  HOUSEL, B., LUM, V., and SHU, N., "Architecture of an
     Interactive Migration System (AIMS)," Proc. 1974
     ACM SIGFIDET Workshop on Data Description, Access
     and Control, ACM, N.Y., pp. 157-169.

DT6  RAMIREZ, J. A., RIN, N. A., and PRYWES, N. S.,
     "Automatic Conversion of Data Conversion Programs
     Using a Data Description Language," Proc. 1974 ACM
     SIGFIDET Workshop on Data Description, Access and
     Control, ACM, N.Y.

DT7  FRANK, R. L., and YAMAGUCHI, K., "A Model for a
     Generalized Data Access Method," Proc. of the 1974
     National Computer Conference, AFIPS Press, Montvale,
     N.J., pp. 437-444.

DT8  TAYLOR, R. W., "Generalized Data Structures for Data
     Translation," Proc. Third Texas Conference on
     Computing Systems, Austin, Texas, 1974.
DT9  UNIVAC, UNIVAC 1100 Series Data File Converter
     Programmer Reference UP-8070, Sperry Rand
     Corporation, March, 1974.

DT10 YAMAGUCHI, K., "An Approach to Data Compatibility: A
     Generalized Access Method," Ph.D. dissertation, The
     University of Michigan, 1975.

DT11 BAKKOM, D. E., and BEHYMER, J. A., "Implementation of
     a Prototype Generalized File Translator," Proc. 1975
     ACM SIGMOD International Conf. on Management of
     Data, W. F. King (ed.), ACM, N.Y., pp. 99-110.

DT12 NAVATHE, S. B., and MERTEN, A. B., "Investigations
     into the Application of the Relational Model of Data
     to Data Translation," Proc. 1975 ACM SIGMOD
     International Conf. on Management of Data, W. F.
     King (ed.), ACM, N.Y., pp. 123-138.

DT13 BIRSS, E. W., and FRY, J. P., "Generalized Software
     for Translating Data," Proc. of the 1976 National
     Computer Conference, Vol. 45, AFIPS Press, Montvale,
     N.J., pp. 889-899.

DT14 LUM, V. Y., SHU, N. C., and HOUSEL, B. C., "A General
     Methodology for Data Conversion and Restructuring,"
     IBM R & D Journal, Vol. 20, No. 5, 1976, pp.
     483-497.

DT15 HOUSEL, B. C., et al., "Express: A Data Extraction,
     Processing, and Restructuring System," ACM
     Transactions on Data Base Systems 2,2, 1977, pp.
     134-174.
DT16 SWARTWOUT, D. E., DEPPE, M. E., and FRY, J. P.,
     "Operational Software for Restructuring Network Data
     Bases," Proc. of the 1977 National Computer
     Conference, Vol. 46, AFIPS Press, Montvale, N.J.,
     pp. 499-508.

DT17 GOGUEN, N. H., and KAPLAN, M. M., "An Approach to
     Generalized Data Translation: The ADAPT System,"
     Bell Telephone Laboratories Internal Report, October
     5, 1977.
(GG) GENERAL

GG1  INGALLS, D., "The Execution Time Profile as a
     Programming Tool," in Design and Optimization of
     Compilers, ed. by R. Rustin, Prentice Hall, 1972,
     pp. 108-128.

GG2  ERDA Interlaboratory Working Group for Data Exchange
     (IWGDE) Annual Report for Fiscal Year 1976, NTIS
     LBL-5329.

GG3  DATE, C. J., An Introduction to Database Systems,
     Addison-Wesley, 1975.

GG4  CODD, E. F., "Relational Completeness of Data Base
     Sublanguages," in Data Base Systems, Courant
     Computer Science Symposia Series, Vol. 6, Prentice
     Hall, 1972.

(M) MODELS-THEORY

M11  CHEN, P. P. S., "The Entity-Relationship Model -
     Toward a Unified View of Data," ACM Transactions on
     Data Base Systems 1,1 (1976): 9-36.
(PT) PROGRAM TRANSLATION

PT1  SHARE AD-HOC COMMITTEE ON UNIVERSAL LANGUAGES, "The
     Problem of Programming Communication with Changing
     Machines: A Proposed Solution," Comm. ACM, Aug.,
     1958, pp. 12-18.

PT2  SHARE AD-HOC COMMITTEE ON UNIVERSAL LANGUAGES, "The
     Problem of Programming Communication with Changing
     Machines: A Proposed Solution, Part 2," Comm. ACM,
     Sept., 1958, pp. 9-16.

PT3  SIBLEY, E. H., and MERTEN, A. G., "Transferability and
     Translation of Programs and Data," Information
     Systems COINS IV, Plenum Press, N.Y., 1972.
PT4  YAMAGUCHI, K., and MERTEN, A. G., "Methodology for
     Transferring Programs and Data," Proc. 1974
     ACM-SIGFIDET Workshop on Data Description, Access
     and Control, ACM, N.Y.

PT5  HOUSEL, B. C., LUM, V. Y., and SHU, N. C., "Architecture
     of an Interactive Migration System (AIMS)," Proc.
     1974 ACM-SIGFIDET Workshop on Data Description,
     Access and Control, ACM, N.Y., pp. 157-170.

PT6  MEHL, J. W., and WANG, C. P., "A Study of Order
     Transformation of Hierarchical Structures in IMS
     Data Bases," Proc. 1974 ACM-SIGFIDET Workshop on
     Data Description, Access and Control, ACM, N.Y.

PT7  HOUSEL, B. C., and HALSTEAD, M. H., "A Methodology for
     Machine Language Decompilation," Proc. of the 1974
     ACM Annual Conference, ACM, N.Y., pp. 254-260.

PT8  HONEYWELL INFORMATION SYSTEMS, "Functional
     Specification Task 609 Data Base Interface Package,"
     Defense Communications Agency contract.

PT9  [author illegible], ". . . Data Base Procedures," Proc.
     1975 ACM National Conference, ACM, N.Y., pp. 359-367.

PT10 SCHINDLER, S., "An Approach to Data Base Application
     Restructuring," Working Paper 76 DT 2.3, Data Base
     Systems Research Group, The University of Michigan,
     Ann Arbor, Mich., 1976.

PT11 DALE, A. G., and DALE, N. B., "Schema and Occurrence
     Structure Transformation in Hierarchical Systems,"
     Proc. ACM SIGMOD International Conference on
     Management of Data, 1976.

PT12 SU, S. Y. W., "Application Program Conversion Due
     to Data Base Changes," Proc. of the 2nd
     International Conference on VLDB, Brussels, Sept.
     8-10, 1976, pp. 143-157.

PT13 SU, S. Y. W., and LIU, B. J., "A Methodology of
     Application Program Analysis and Conversion Based on
     Data Base Semantics," Proceedings of the
     International Conference on Management of Data,
     1977, pp. 75-87.
PT14 HOUSEL, B. C., "A Unified Approach to Program and Data
     Conversion," Proceedings of the Third International
     Conference on Very Large Data Bases, ACM, N.Y.,
     1977.

PT15 SU, S. Y. W., and REYNOLDS, M. J., "Conversion of
     High-Level Sublanguage Queries to Account for Data
     Base Changes," Proc. of NCC, 1978, pp. 857-875.
(R) RESTRUCTURING

R1   FRY, J. P., and JERIS, D., "Towards a Formulation of
     Data Reorganization," Proc. 1974 ACM/SIGMOD Workshop
     on Data Description, Access and Control, ed. by R.
     Rustin, ACM, N.Y., pp. 83-100.

R2   SHU, N. C., HOUSEL, B. C., and LUM, V. Y., "CONVERT: A
     High-Level Translation Definition Language for Data
     Conversion," Comm. ACM 18,10, 1975, pp. 557-567.

R3   SHOSHANI, A., "A Logical-Level Approach to Data Base
     Conversion," Proc. 1975 ACM/SIGMOD International
     Conf. on Management of Data, ACM, N.Y., pp. 112-122.

R4   SHOSHANI, A., and BRANDON, K., "On the Implementation
     of a Logical Data Base Converter," Proc.
     International Conference on Very Large Data Bases,
     ACM, N.Y., 1975, pp. 529-531.

R5   HOUSEL, B. C., and SHU, N. C., "A High-Level Data
     Manipulation Language for Hierarchical Data
     Structures," Proc. of the 1976 Conference on Data
     Abstraction, Definition and Structure, Salt Lake
     City, Utah, pp. 155-169.

R6   NAVATHE, S. B., and FRY, J. P., "Restructuring for
     Large Data Bases: Three Levels of Abstraction," ACM
     Transactions on Data Base Systems 1,2, ACM, N.Y.,
     1976, pp. 138-158.

R7   NAVATHE, S. B., "A Methodology for Generalized Data
     Base Restructuring," Ph.D. dissertation, The
     University of Michigan, 1976.

R8   GERRITSEN, ROB, and MORGAN, HOWARD, "Dynamic
     Restructuring of Data Bases With Generation Data
     Structures," Proc. of the 1976 ACM Conference, ACM,
     N.Y., pp. 281-286.
R9   SWARTWOUT, D., "An Access Path Specification Language
     for Restructuring Network Data Bases," Proc. of the
     1977 SIGMOD Conference, ACM, N.Y.

R10  NAVATHE, S. B., "Schema Analysis for Data Base
     Restructuring," Proc. 3rd International Conference
     on Very Large Data Bases, 1977; to appear in TODS.

R11  EDELMAN, J. A., JONES, E. E., LIAW, Y. S., NAZIF, Z.
     A., and SCHEIDT, D. L., "REORG - A Data Base
     Reorganizer," Bell Laboratories Internal Technical
     Report, April, 1976.
(SL) STORED-DATA DEFINITION
SL1  STORAGE STRUCTURE DEFINITION LANGUAGE TASK GROUP
     (SSDLTG) OF CODASYL SYSTEMS COMMITTEE, "Introduction
     to Storage Structure Definition" (by J. P. Fry);
     "Informal Definitions for the Development of a
     Storage Structure Definition Language" (by W. C.
     McGee); "A Procedural Approach to File Translation"
     (by J. W. Young, Jr.); "Preliminary Discussion of a
     General Data to Storage Structure Mapping Language"
     (by E. H. Sibley and R. W. Taylor), Proc. 1970
     ACM-SIGFIDET Workshop on Data Description, Access
     and Control, ed. by E. F. Codd, Houston, Tex., Nov.
     1970, pp. 368-80.
SL2 SMITH, D. C. P., "An Approach to Data Description and
Conversion," Ph.D. dissertation, Moore School Report
72-20, University of Pennsylvania, Philadelphia,
Pa., 1972.
SL3  TAYLOR, R. W., "Generalized Data Base Management
     System Data Structures and Their Mapping to Physical
     Storage," Ph.D. dissertation, The University of
     Michigan, Ann Arbor, Mich., 1971.

SL4  FRY, J. P., SMITH, D. C. P., and TAYLOR, R. W., "An
     Approach to Stored-Data Definition and Translation,"
     Proc. 1972 ACM-SIGFIDET Workshop on Data
     Description, Access and Control, ed. by A. L. Dean,
     Denver, Colo., Nov. 1972.

SL5  BACHMAN, C. W., "The Evolution of Storage Structures,"
     Comm. ACM 15,7 (July 1972), pp. 628-34.

SL6  SIBLEY, E. H., and TAYLOR, R. W., "A Data Definition
     and Mapping Language," Comm. ACM 16,12 (Dec. 1973),
     pp. 750-59.
SL7  HOUSEL, B., SMITH, D., SHU, N., and LUM, V., "DEFINE:
     A Non-Procedural Data Description Language for
     Defining Information Easily," Proc. of 1975 ACM
     Pacific Conference, San Francisco, CA, April 1975.

SL8  THE STORED-DATA DEFINITION AND TRANSLATION TASK GROUP,
     "Stored-Data Description and Data Translation: A
     Model and Language," Information Systems 2,3
     (1977): 95-148.
(SM) DATA SEMANTICS

SM1  SCHMID, H. A., and SWENSON, J. R., "On the Semantics
     of the Relational Model," Proc. ACM-SIGMOD 1975
     Conference, May 1975, pp. 211-233.

SM2  ROUSSOPOULOS, N., and MYLOPOULOS, J., "Using Semantic
     Networks for Data Base Management," Proc. Very Large
     Data Base Conference, Framingham, Mass., Sept. 1975,
     pp. 144-172.

SM3  SU, STANLEY Y. W., and LO, D. H., "A Multi-level
     Semantic Data Model," CAASM Project, Technical
     Report No. 9, Electrical Engineering Dept.,
     University of Florida, June 1976, pp. 1-29.
(TA) TRANSLATION APPLICATIONS

TA1  WINTERS, E. W., and DICKEY, A. F., "A Business
     Application of Data Translation," Proceedings of the
     1976 SIGMOD International Conference on Management
     of Data, ed. by J. B. Rothnie, Washington, D.C.,
     June 1976, pp. 189-196.
(UR) UM RESTRUCTURING
UR1 LEWIS, K., DRIVER, B., and DEPPE, M., "A Translation
Definition Language for the Version II Translator,"
Working Paper 809, Data Translation Project, The
University of Michigan, Ann Arbor, Michigan, 1975.
UR2 LEWIS, K., and FRY, J., "A Comparison of Three
Translation Definition Languages," Working Paper DT
5.1, Data Translation Project, The University of
Michigan, Ann Arbor, Michigan, 1975.
UR3 DEPPE, M. E., "A Relational Interface Model for Data
Base Restructuring," Technical Report 76 DT 3, Data
Translation Project, The University of Michigan, Ann
Arbor, Michigan, 1976.
UR4 DEPPE, M. E., LEWIS, K. H., and SWARTWOUT, D. E.,
"Restructuring Network Data Bases: An Overview,"
Data Translation Project, Technical Report 76 DT 5,
The University of Michigan, Ann Arbor, Michigan,
1976.
UR5 DEPPE, M. E., and LEWIS, K. H., "Data Translation
Definition Language Reference Manual for Version IIA
Release 1," Data Translation Project, Working Paper
76 DT 5.2, The University of Michigan, Ann Arbor,
Michigan, 1976.
UR6 SWARTWOUT, D. E., MARINE, A. M., and BAKKOM, D. E.,
"Partial Restructuring Approach to Data
Translation," Data Translation Project, Working
Paper 76 DT 8.1, The University of Michigan, Ann
Arbor, Michigan, 1976.
UR7 SWARTWOUT, D. E., WOLFE, G. J., and BURPEE, C. E.,
"Translation Definition Language Reference Manual
for Version IIA Translator, Release 3," Data
Translation Project, Working Paper 77 DT 5.3, The
University of Michigan, Ann Arbor, Michigan, 1977.
(US) UM STORED-DATA DEFINITION
US1 DATA TRANSLATION PROJECT, "Stored-Data Definition
Language Reference Manual," The University of
Michigan, Ann Arbor, Michigan, 1972.
US2 DATA TRANSLATION PROJECT, "Revised Stored-Data
Definition Language Reference Manual," The
University of Michigan, Ann Arbor, Michigan, 1974.

US3 DATA TRANSLATION PROJECT, "University of Michigan
Stored-Data Definition Language Reference Manual for
Version II Translator," The University of Michigan,
Ann Arbor, Michigan, 1975.

US4 BIRSS, E. W., and FRY, J. P., "A Comparison of Two
Languages for Describing Stored Data," Data
Translation Project, Technical Report 76 DT 1, The
University of Michigan, Ann Arbor, Michigan, 1976.

(UT) UM TRANSLATION
UT1 "Functional Design Requirements for a Prototype Data
Translator," Data Translation Project, The
University of Michigan, Ann Arbor, Michigan, 1972.
UT2 "Design Specifications of a Prototype Data
Translator," Data Translation Project, The
University of Michigan, Ann Arbor, Michigan, 1972.
UT3 "Program Logic Manual for the University of Michigan
Prototype Data Translator," Data Translation
Project, The University of Michigan, Ann Arbor,
Michigan, 1973.
UT4 "Users Manuals for the University of Michigan
Prototype Data Translator," Data Translation
Project, The University of Michigan, Ann Arbor,
Michigan, 1973.
UT5 "Functional Design Requirements of the Version I
Translator," Data Translation Project, The
University of Michigan, Ann Arbor, Michigan, 1973.
UT6 "Program Logic Manual for the University of Michigan
Version I Data Translator," Working Paper 306, Data
Translation Project, The University of Michigan, Ann
Arbor, Michigan, 1974.
UT7 "Design Specifications: Version II Data Translator,"
Working Paper 307, Data Translation Project, The
University of Michigan, Ann Arbor, Michigan, 1975.

UT8 BIRSS, E., DEPPE, M., and FRY, J., "Research and Data
Reorganization Capabilities for the Version IIA Data
Translator," Data Translation Project, The
University of Michigan, Ann Arbor, Michigan, 1975.

UT9 BIRSS, E., et al., "Program Logic Manual for the
Version IIA Data Translator," Working Paper 76 DT
3.1, Data Translation Project, The University of
Michigan, Ann Arbor, Michigan, 1976.
UT10 BODWIN, J., et al., "Data Translator Version IIA
Release 1 User Manual," Working Paper 76 DT 3.2,
Data Translation Project, The University of
Michigan, Ann Arbor, Michigan, 1976.
UT11 BODWIN, J., et al., "Data Translator Version IIA
Release 2 User Manual," Working Paper 76 DT 3.4,
Data Base Systems Research Group, The University of
Michigan, Ann Arbor, Michigan, 1976.
UT12 KINTZER, E., et al., "Michigan Data Translator Version
IIB Release 1 User Manual," Technical Paper 77 DT 8,
Data Base Systems Research Group, The University of
Michigan, Ann Arbor, Michigan, 1977.
UT13 BURPEE, C. E., et al . , "Michigan Translator Program
Logic Manual Version IIB Release 1," Working Paper
77 DT 3.7, Data Base Systems Research Group, The
University of Michigan, Ann Arbor, Michigan, 1977.
UT14 BAKKOM, D., et al . , "Specifications for a Generalized
Reader and Interchange Form," Working Paper 77 DT
6.2, Data Base Systems Research Group, The
University of Michigan, Ann Arbor, Michigan, 1977.
UT15 DeSMITH, D., and HUTCHINS, L., "Michigan Data
Translator Design Specifications Version IIB,"
Working Paper 77 DT 3.8, Data Base Systems Research
Group, The University of Michigan, Ann Arbor,
Michigan, 1977.
UT16 KINTZER, E., et al., "Michigan Data Translator Version
IIB Release 1.1 User Manual," Technical Paper 77 DT
8.1, Data Base Systems Research Group, The
University of Michigan, Ann Arbor, Michigan, 1977.
UT17 BAKKOM, D. E., and Schindler, S. J., "Operational
Capabilities for Data Base Conversion and
Restructuring," Technical Report 77 DT 6, Data Base
Systems Research Group, The University of Michigan,
Ann Arbor, Mich., 1977.
(Z) RELATIONAL SYSTEM

Z5 CHAMBERLIN, D. D., and BOYCE, R. F., "SEQUEL: A
Structured English Query Language," Proceedings of
the ACM-SIGMOD Workshop on Data Description, Access
and Control, ACM, N.Y., 1974.
7. PARTICIPANTS
The following is a list of attendees, participants
and contributors to the workshop.
Edward Arvel
Conversion Experiences
Data Sciences Group
890 National Press Building
Washington, D.C. 20045
Marty Aronoff
Management Objectives
National Bureau of Standards
Tech B258
Washington, D.C. 20234
Robert Bemer
Standards
Honeywell Information Systems
P.O. Box 6000
Phoenix, AZ 85005
John Berg
Proceedings Editor
National Bureau of Standards
Tech A259
Washington, D.C. 20234
Edward Birss
Conversion Technology
Hewlett Packard
General Systems Division
5303 Stevens Creek Blvd.
Bldg. 498-3
Santa Clara, CA 95050
Don Branch
Standards
Advisory Bureau for Computing
Room 828, Lord Elgin Plaza
66 Slater Street
Ottawa, Ontario K1A 0T5
CANADA
Jean Bryce
Standards
M. Bryce & Associates, Inc
1248 Springfield Pike
Cincinnati, OH 45215
Milt Bryce
Chairman, Standards
M. Bryce & Associates, Inc.
1248 Springfield Pike
Cincinnati, OH 45215

Jim Burrows
Chairman, Conversion Experiences
Director, Institute for Computer
Sciences and Technology
National Bureau of Standards
Administration Bldg., Room A200
Washington, D.C. 20234

Richard G. Canning
Management Objectives
Canning Publications, Inc.
925 Anza Avenue
Vista, CA 92083

Lt. Michael Carter
Conversion Experiences
Air Force Data Systems Design Ctr.
AFDSDC/SDDA, Building 857
Gunter AFB, AL 36114

Joseph Collica
Conversion Experiences
National Bureau of Standards
Tech. A254
Washington, D.C. 20234

Elizabeth Courte
Conversion Experiences
Bell Laboratories
3B210 Six Corporate Plaza
Piscataway, NJ 08854

Ahron Davidi
Conversion Experiences
Blue Cross of Massachusetts
100 Summer Street, 12th Floor
Boston, Massachusetts 02106

Peter Dressen
Conversion Technology
Honeywell Information Systems
P.O. Box 6000
Phoenix, Arizona 85005

Ruth F. Dyke
Conversion Experiences
U.S. Civil Service Commission
1900 E Street, N.W., Room 6410
Washington, D.C. 20415

Larry Espe
Management Objectives
Nolan, Norton and Company
One Forbes Road
Lexington, Massachusetts 02173
Gordon Everest
Management Objectives
University of Minnesota
271 19 Avenue South
Minneapolis, MN 55455
Elizabeth Fong
Standards
National Bureau of Standards
Tech B212
Washington, D.C. 20234
James P. Fry
Chairman, Conversion
Technology
276 Business Administration
University of Michigan
Ann Arbor, MI 48109
Al Gaboriault
Standards
Sperry Univac
P.O. Box 500
Mail Station C1NW-12
Blue Bell, PA 19424
Mr. Rob Gerritsen
Management Objectives
Wharton School
University of Pennsylvania
Philadelphia, Pennsylvania 19174
Richard Godlove
Management Objectives
Monsanto Company
800 North Lindbergh Boulevard
St. Louis, Missouri 63166
Nancy Goguen
Conversion Technology
Bell Laboratories
6 Corporate Place
Piscataway, NJ 08854
Seymour Jeffery
Host
Director, Center for Programming
Science and Technology
National Bureau of Standards
Tech A247
Washington, D.C. 20234
Samuel C. Kahn
Management Objectives
Information System Dep. - Planning
E. I. duPont de Nemours & Co.
Wilmington, Delaware 19899
Mike Kaplan
Conversion Technology
Bell Laboratories
8 Corporate Place
Piscataway, NJ 08854
Anthony Klug
Standards
Computer Sciences Department
University of Wisconsin
Madison, Wisconsin 53706
Henry Lefkovits
Standards
H. C. Lefkovits & Associates, Inc
P.O. Box 297
Harvard, MA 01451
H. Eugene Lockhart
Management Objectives
Nolan, Norton & Company
One Forbes Road
Lexington, Massachusetts 02173
Thomas Lowe
Host
Chief
Operations Engineering Division
Center for Programming
Science and Technology
National Bureau of Standards
Tech A265
Washington, D.C. 20234
Gene Lowenthal
Conversion Technology
MRI Systems Corporation
P.O. Box 9968
Austin, Texas 78766
Vincent Lum
Conversion Technology
IBM Research Corp., K55/282
5600 Cottle Road
San Jose, CA 95103
Mr. John Lyon
Management Objectives
Colonial Penn Group Data Corporation
5 Penn Center Plaza
Philadelphia, PA 19103
Halaine Maccabee
Conversion Experiences
Northeastern Illinois Plan. Comm. (NIPC)
400 West Madison St.
Chicago, Illinois 60606
William Madison
Contributor
Universal Systems, Inc.
2341 Jefferson Davis Highway
Arlington, Virginia 22202
Daniel B. Magraw
General Chairman
State of Minnesota
Department of Administration
208 State Administration Building
St. Paul, MN 55155
Robert Marion
Conversion Technology
Defense Communications Agency
11440 Isaac Newton Square, North
Reston, VA 22090
Steven Merritt
Conversion Experiences
GAO, FGMS-ADP, Room 6011
411 G Street, N.W.
Washington, D.C. 20548
Jack Minker
ACM Liaison
Chairman
Department of Computer Science
University of Maryland
College Park, Maryland 20742
Thomas E. Murray
Management Objectives
Del Monte Corp.
P.O. Box 3575
San Francisco, CA 94119
Shamkant Navathe
Conversion Technology
New York University
Grad. School of Business
600 Tisch Hall
40 West 4th Street
New York, New York 10003
Mr. Jack Newcomb
Management Objectives
Department of Finance & Admin.
326 Andrew Jackson State Off. Bldg.
Nashville, TN 37219
Richard Nolan
Chairman, Management
Objectives
Nolan, Norton and Company
One Forbes Road
Lexington, Massachusetts 02173
T. William Olle
Management Objectives
Consultant
27 Blackwood Close
West Byfleet
Surrey, KT14 6PP ENGLAND
Mayford Roark
Keynoter
Ford Motor Company
The American Road
Dearborn, MI 48121
C. H. Rutledge
Standards
Marathon Oil Company
539 South Main Street
Findlay, OH 45840
Mr. Michael J. Samek
Management Objectives
Celanese Corporation
1211 Avenue of the Americas
New York, NY 10036
Steve Schindler
Management Objectives and Conversion Technology
Standard Oil Company (Indiana)
P.O. Box 591
Tulsa, OK 74102
Mr. Richard D. Secrest
Management Objectives
University of Michigan
276 Business Administration
Ann Arbor, MI 48109
Philip Shaw
Standards
IBM Corporation
555 Bailey Avenue
San Jose, CA 95150
Arie Shoshani
Conversion Technology
Lawrence Berkeley Laboratory
University of California
Berkeley, CA 94720
Edgar Sibley
Management Objectives
Dept. of Info. Systems Mgmt.
University of Maryland
Colleqe Park, MD 20742
Prof. Diane Smith
Conversion Technology
University of Utah
Merrill Engineering Bldg.
Salt Lake City, UT 84112
Alfred Sorkowitz
Conversion Experiences
Department of Housing & Urban Devel.
451 7th Street, S.W., Room 4152
Washington, D.C. 20410
Prof. Stanley Su
Conversion Technology
Electrical Engineering
University of Florida
Gainesville, FL 32611
Donald Swartwout
Conversion Technology
276 Business Administration
University of Michigan
Ann Arbor, MI 48109
Robert W. Taylor
Conversion Technology
IBM Corporation
IBM Research, K55/282
5600 Cottle Road
San Jose, CA 95193
Jay Thomas
Standards
Allied Van Lines
2501 West Roosevelt Road
Broadview, IL 60153
Ewart Willey
Standards
Prudential Assurance Co
142 Holborn Bars
London EC1N 2NH
ENGLAND
Major Jerry Winkler
Standards
U.S.A.F.-AFDSC/GKD
Room 3A153, Pentagon
Washington, D.C. 20030
Beatrice Yormark
Conversion Technology
Interactive Systems Corporation
1526 Cloverfield Blvd.
Santa Monica, CA 90404
A Contributor is one who could not attend but either
submitted a paper or added to the Workshop in a
manner that deserves our acknowledgement and thanks.
*U.S. GOVERNMENT PRINTING OFFICE: 1980 328-117/6522
NBS-114A (REV. 9-76)
U.S. DEPT. OF COMM.
BIBLIOGRAPHIC DATA
SHEET
1. PUBLICATION OR REPORT NO.
NBS SP 500-64
2. Gov't Accession No.
4. TITLE AND SUBTITLE
DATA BASE DIRECTIONS - THE CONVERSION PROBLEM
Proceedings of a Workshop held in Ft. Lauderdale, FL,
Nov. 1-3, 1977
3. Recipient's Accession No.
5. Publication Date
September 1980
6. Performing Organization Code
7. AUTHOR(S)
John L. Berg, Editor
8. Performing Organ. Report No.
9. PERFORMING ORGANIZATION NAME AND ADDRESS
NATIONAL BUREAU OF STANDARDS
DEPARTMENT OF COMMERCE
WASHINGTON, DC 20234
10. Project/Task/Work Unit No.
11. Contract/Grant No.
12. SPONSORING ORGANIZATION NAME AND COMPLETE ADDRESS (Street, City, state, ZIP)
National Bureau of Standards
Department of Commerce
Washington, DC 20234
Association for Computing Machinery
1133 Ave. of Americas
New York, NY 10036
13. Type of Report & Period Covered
Final
14. Sponsoring Agency Code
15. SUPPLEMENTARY NOTES
[ ] Document describes a computer program; SF-185, FIPS Software Summary, is attached.
16. ABSTRACT (A 200-word or less factual summary of most significant information. If document includes a significant bibliography or
literature survey, mention it here.)
What information can help a manager assess the impact a conversion will have
on a data base system, and of what aid will a data base system be during a conver-
sion? At a workshop on the data base conversion problem held in November 1977
under the sponsorship of the National Bureau of Standards and the Association for
Computing Machinery, approximately seventy-five participants provided the decision
makers with useful data.
Patterned after the earlier Data Base Directions Workshop, this workshop, Data
Base Directions - the Conversion Problem, explores data base conversion from four
perspectives: management, previous experience, standards, and system technology.
Each perspective was covered by a workshop panel that produced a report included here.
The management panel gave specific direction on such topics as planning for data
base conversions, impacts on the EDP organization and applications, and minimizing
the impact of present and future conversions. The conversion experience panel
drew upon ten conversion experiences to compile their report and prepared specific
checklists of "do's and don'ts" for managers. The standards panel provided comments
on standards needed to support or facilitate conversions and the system technology
panel reports comprehensively on the systems and tools needed with strong recommend-
ations on future research.
17. KEY WORDS (six to twelve entries; alphabetical order; capitalize only the first letter of the first key word unless a proper name;
separated by semicolons)
Conversion; data base; data-description; data-dictionary; data-directory;
data-manipulation; DBMS; languages; query
18. AVAILABILITY [X] Unlimited
[ ] For Official Distribution. Do Not Release to NTIS
[X] Order From Sup. of Doc., U.S. Government Printing Office, Washington, DC
20402
[ ] Order From National Technical Information Service (NTIS), Springfield,
VA. 22161
19. SECURITY CLASS
(THIS REPORT)
UNCLASSIFIED
20. SECURITY CLASS
(THIS PAGE)
UNCLASSIFIED
21. NO. OF
PRINTED PAGES
178
22. Price
$5.50
USCOMM-DC
CHECK
THEM
OUT
How do those automated checkout counters, gas pumps
credit offices, and banking facilities work? What every
consumer should know about the modern electronic sys-
tems now used in everyday transactions is explained in
a 12-page booklet published by the Commerce Depart-
ment's National Bureau of Standards.
Automation in the Marketplace (NBS Consumer Informa-
tion Series No. 10) is for sale by the Superintendent of
Documents, U.S. Government Printing Office, Washington
D.C. 20402. Price: 90 cents. Stock No. 003-003-01969-1
"AUTOMATION IN THE MARKETPLACE"
A Consumer's Guide
[Illustration: a sample supermarket checkout receipt annotated with the topics covered in the booklet: Universal Product Code, laser scanners, electronic cash registers, handling of uncoded items, electronic scales, weights and measures enforcement, special features of computer checkout systems, bank teller machines, computer terminals, and consumer issues.]
(please detach along dotted line)
ORDER FORM
PLEASE SEND ME COPIES OF
Automation in the Marketplace
at $.90 per copy.
S.D. Stock No. 003-003-01969-1
(please type or print)
NAME
I enclose $ (check or money order) or charge my
Deposit Account No.
Total amount $
Make check or money order payable to Superintendent of Documents.
MAIL ORDER FORM
WITH PAYMENT TO
Superintendent of Documents
U.S. Government Printing Office
Washington, D.C. 20402
or any U.S. Department of Commerce district office
ADDRESS.
CITY
STATE
ZIP CODE
FOR USE OF SUPT. DOCS.
Enclosed
To be mailed later
Refund
Coupon refund
Postage
NBS TECHNICAL PUBLICATIONS
PERIODICALS
JOURNAL OF RESEARCH— The Journal of Research of the
National Bureau of Standards reports NBS research and develop-
ment in those disciplines of the physical and engineering sciences in
which the Bureau is active. These include physics, chemistry,
engineering, mathematics, and computer sciences. Papers cover a
broad range of subjects, with major emphasis on measurement
methodology and the basic technology underlying standardization.
Also included from time to time are survey articles on topics
closely related to the Bureau's technical and scientific programs.
As a special service to subscribers each issue contains complete
citations to all recent Bureau publications in both NBS and non-
NBS media. Issued six times a year. Annual subscription: domestic
$13; foreign $16.25. Single copy, $3 domestic; $3.75 foreign.
NOTE: The Journal was formerly published in two sections: Sec-
tion A "Physics and Chemistry" and Section B "Mathematical
Sciences."
DIMENSIONS/NBS— This monthly magazine is published to in-
form scientists, engineers, business and industry leaders, teachers,
students, and consumers of the latest advances in science and
technology, with primary emphasis on work at NBS. The magazine
highlights and reviews such issues as energy research, fire protec-
tion, building technology, metric conversion, pollution abatement,
health and safety, and consumer product performance. In addi-
tion, it reports the results of Bureau programs in measurement
standards and techniques, properties of matter and materials,
engineering standards and services, instrumentation, and
automatic data processing. Annual subscription: domestic $11;
foreign $13.75.
NONPERIODICALS
Monographs— Major contributions to the technical literature on
various subjects related to the Bureau's scientific and technical ac-
tivities.
Handbooks— Recommended codes of engineering and industrial
practice (including safety codes) developed in cooperation with in-
terested industries, professional organizations, and regulatory
bodies.
Special Publications— Include proceedings of conferences spon-
sored by NBS, NBS annual reports, and other special publications
appropriate to this grouping such as wall charts, pocket cards, and
bibliographies.
Applied Mathematics Series— Mathematical tables, manuals, and
studies of special interest to physicists, engineers, chemists,
biologists, mathematicians, computer programmers, and others
engaged in scientific and technical work.
National Standard Reference Data Series— Provides quantitative
data on the physical and chemical properties of materials, com-
piled from the world's literature and critically evaluated.
Developed under a worldwide program coordinated by NBS under
the authority of the National Standard Data Act (Public Law
90-396).
NOTE: The principal publication outlet for the foregoing data is
the Journal of Physical and Chemical Reference Data (JPCRD)
published quarterly for NBS by the American Chemical Society
(ACS) and the American Institute of Physics (AIP). Subscriptions,
reprints, and supplements available from ACS, 1155 Sixteenth St.,
NW, Washington, DC 20056.
Building Science Series — Disseminates technical information
developed at the Bureau on building materials, components,
systems, and whole structures. The series presents research results,
test methods, and performance criteria related to the structural and
environmental functions and the durability and safety charac-
teristics of building elements and systems.
Technical Notes— Studies or reports which are complete in them-
selves but restrictive in their treatment of a subject. Analogous to
monographs but not so comprehensive in scope or definitive in
treatment of the subject area. Often serve as a vehicle for final
reports of work performed at NBS under the sponsorship of other
government agencies.
Voluntary Product Standards— Developed under procedures
published by the Department of Commerce in Part 10, Title 15, of
the Code of Federal Regulations. The standards establish
nationally recognized requirements for products, and provide all
concerned interests with a basis for common understanding of the
characteristics of the products. NBS administers this program as a
supplement to the activities of the private sector standardizing
organizations.
Consumer Information Series— Practical information, based on
NBS research and experience, covering areas of interest to the con-
sumer. Easily understandable language and illustrations provide
useful background knowledge for shopping in today's tech-
nological marketplace.
Order the above NBS publications from: Superintendent of Docu-
ments, Government Printing Office, Washington, DC 20402.
Order the following NBS publications—FIPS and NBSIR's—from
the National Technical Information Services, Springfield, VA 22161.
Federal Information Processing Standards Publications (FIPS
PUB)— Publications in this series collectively constitute the
Federal Information Processing Standards Register. The Register
serves as the official source of information in the Federal Govern-
ment regarding standards issued by NBS pursuant to the Federal
Property and Administrative Services Act of 1949 as amended,
Public Law 89-306 (79 Stat. 1127), and as implemented by Ex-
ecutive Order 11717 (38 FR 12315, dated May 11, 1973) and Part 6
of Title 15 CFR (Code of Federal Regulations).
NBS Interagency Reports (NBSIR)— A special series of interim or
final reports on work performed by NBS for outside sponsors
(both government and non-government). In general, initial dis-
tribution is handled by the sponsor; public distribution is by the
National Technical Information Services, Springfield, VA 22161,
in paper copy or microfiche form.
BIBLIOGRAPHIC SUBSCRIPTION SERVICES
The following current-awareness and literature-survey bibliographies
are issued periodically by the Bureau:
Cryogenic Data Center Current Awareness Service. A literature sur-
vey issued biweekly. Annual subscription: domestic $35; foreign
$45.
Liquefied Natural Gas. A literature survey issued quarterly. Annual
subscription: $30.
Superconducting Devices and Materials. A literature survey issued
quarterly. Annual subscription: $45. Please send subscription or-
ders and remittances for the preceding bibliographic services to the
National Bureau of Standards, Cryogenic Data Center (736),
Boulder, CO 80303.
U.S. DEPARTMENT OF COMMERCE
National Bureau of Standards
Washington, DC 20234
OFFICIAL BUSINESS
Penalty for Private Use, $300
SPECIAL FOURTH-CLASS RATE
BOOK