Alex Sellink
and Chris Verhoef
University of Amsterdam,
Programming Research Group
Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
alex@wins.uva.nl, x@wins.uva.nl
We developed an assembly line to implement certain specific changes in a stockbroking system written in COBOL with embedded SQL. The changes were proposed by the maintenance team of the system. Using our architecture, it took a few hours to implement the conditional transformations from the code examples we obtained from the maintenance team. We could then carry out the tasks fully automatically. We report on the transformations, their implementation, and the architecture we used. The company that owns the COBOL/SQL code intends to use our architecture for similar tasks. This study was carried out to give that company an indication of the effort involved, of the development process for the components that carry out such tasks, and of the process of changing software using our architecture.
Categories and Subject Descriptors:
D.2.6 [Software Engineering]: Programming Environments--Interactive;
D.2.7 [Software Engineering]: Distribution and Maintenance--Restructuring;
Additional Key Words and Phrases: Reengineering, System renovation, Software renovation factory, Automated software maintenance, Automated redesign, COBOL, embedded SQL.
During software maintenance in large companies, many and diverse changes to entire systems have to be made on a daily basis. Those tasks may seem relatively simple, but they are not so simple that they can be automated using ad-hoc tools that have no knowledge of the grammar of the code that has to be changed. As a result, this type of maintenance has to be done by hand. Consequently, such maintenance uses a lot of precious capacity: first, because the task has to be done by hand and, second, because it is easy to overlook a case or to make mistakes. Let us give a few examples of typical tasks. The addition of explicit scope terminators in COBOL applications is a task that improves the readability of code. We learned from reliable sources that someone at a Dutch bank worked for a year on adding explicit scope terminators manually. By comparison, in a mortgage system we added approximately 27 END-IFs per minute using a component that took less than an hour to implement using our architecture. More importantly, the automated process does not miss cases, makes no typographical errors, has uniform layout, etc. Similar advantages have been reported in [31]. Another example is the adaptation to new standards. It is our experience that large companies have evolving internal coding standards. As a consequence, they have more than one standard in use simultaneously: which standard is used depends on how long the programmers have been working for the company. So automated adaptation to the latest standards is a useful maintenance process. It can also be the case that decision structures need an update, since more cases have been added during the evolution of the system. At the company that we looked at, the tenth bank of the world, such changes are made to their systems on a daily basis.
The teams working on these issues recognize that these tasks are very time consuming and not very challenging, yet hard to carry out correctly without tool support other than compile-link-edit cycles.
Maintainers have to look for the problematic code throughout the entire system. This is time consuming and error-prone. Once they have located this code, they have a good local comprehension of what to do. We stress that for many of these daily tasks it is not necessary to completely comprehend such systems. Local comprehension suffices to specify tools that search for the typical patterns the maintainers are looking for and, once these are found, automatically change them according to the maintainers' wishes. The search and replace process is 100% automated. It will be clear that tool support for carrying out such tasks saves money and makes the maintenance process less error-prone [1]. Since many such tasks are very company specific, it is not realistic to assume the existence of tools that carry out these tasks. To give the reader an idea, a typical task that we carried out was to restructure code so that it invokes a company specific routine. We discuss this tool in more detail in Section 3. Obviously, such tools do not exist, hence the proposed architecture.
Using the technology that we developed, maintenance teams can easily implement such specific tasks, although we are convinced that a short course to learn the tools is necessary (we come back to this issue in Section 5). In order to give an idea of the process that can be carried out by maintenance teams in the future, we carried out such tasks ourselves. We report on them in this paper. We obtained from the bank a typical system: a 100 KLOC system written in OS/VS COBOL, a COBOL 74 dialect, that contains embedded SQL. Some parts are 20 years old, and some parts are brand new. Moreover, we obtained some typical tasks that had to be carried out on this system. The maintenance team selected representative tasks that were too difficult to carry out with an ad-hoc disposable tool. We implemented the tasks in a few hours using an architecture that we call a software renovation factory, and we were able to carry out the tasks fully automatically on the system. The specific tools we implemented are together called an assembly line.
The advantages of our approach are that the changes are made in a controlled way: the tools that we made are in fact the requirements specification of the tasks. This means that it is now possible to follow the history of changes in detail, that the process can be reproduced, and that the process can be applied to other systems as well. In this way it is possible to set up such maintenance in an automated, controlled, and documented manner. Moreover, the evolution of the system and the many changes are now not only known to certain maintenance teams, but are documented in an executable way that is accessible to other parties, who can reuse the tools as well. This approach has advantages over an approach where the change history is only in the mind of the programmer.
In this section we give an idea of the architecture that we call a software renovation factory. In fact this is a development environment that is suited to implementing maintenance and reengineering tasks that are to be carried out in an automated fashion. It consists roughly of two parts. There is one part that has to be set up by developers of such factories, and there is the part that can be used by maintenance teams so that they can develop assembly lines for automated maintenance. In Figure 1, we depict the architecture of the factory. We discuss Figure 1 from left to right. CALE stands for computer aided language engineering, and CALE-Tools help in constructing a grammar or even generating one from some electronic source, like a standard or an on-line language manual. CALE-Tools can assess the quality of existing language descriptions; with CALE-Tools we can also develop language descriptions and reengineer them. In fact, CALE-Tools form a factory of their own. For a more thorough discussion of CALE-Tools we refer to [42,45]. Depending on the existence of such documentation, we benefit from CALE-Tools in the construction of grammars. In some cases we can generate a grammar from an electronically available language description manual. As soon as we have a grammar, we generate from this grammar its native pattern language. For our example system written in OS/VS COBOL and embedded SQL this means that a real program is also a pattern: the pattern that matches only this program. In the generated pattern language we have three kinds of variables available for every sort name that is present in the grammar. For instance, Paragraph1 matches exactly one arbitrary paragraph in OS/VS COBOL possibly containing embedded SQL. Paragraph1+ matches one or more such paragraphs, and Paragraph1* matches zero or more such paragraphs. The * stands for zero or more occurrences.
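To make the three kinds of variables concrete, the following Python sketch mimics how a list pattern containing them could be matched against a list of items. It is only an illustration and not how the generated native pattern language is implemented; the names ONE, PLUS, STAR, and matches are hypothetical, and the sketch only answers whether a pattern matches, without recording what each variable is bound to.

```python
# Hypothetical markers standing in for the three kinds of variables,
# for example Paragraph1, Paragraph1+ and Paragraph1*.
ONE, PLUS, STAR = "<one>", "<plus>", "<star>"


def matches(pattern, items):
    """Return True if the list pattern matches the complete list of items."""
    if not pattern:
        return not items
    head, rest = pattern[0], pattern[1:]
    if head == STAR:   # zero or more arbitrary items
        return any(matches(rest, items[i:]) for i in range(len(items) + 1))
    if head == PLUS:   # one or more arbitrary items
        return any(matches(rest, items[i:]) for i in range(1, len(items) + 1))
    if not items:
        return False
    if head == ONE:    # exactly one arbitrary item
        return matches(rest, items[1:])
    # Anything else is a concrete item that has to be present literally.
    return head == items[0] and matches(rest, items[1:])
```

For instance, `matches([STAR, "-818", STAR], ["-904", "-818", "-911"])` holds, because the literal -818 may be preceded and followed by zero or more arbitrary items.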
The generation process from a grammar to a native pattern language is implemented as a CALE-Tool, hence the connecting arrow. For more information on the generation process, a formal definition of native pattern languages, and an example of their use we refer to [44]. In this paper we will use native patterns as well to implement the requested changes of the maintenance programmers. We believe that native patterns are very helpful when the technology that we developed will be ported to companies that use it for maintenance and/or reengineering. Since those programmers know the language quite well, the learning curve for the native pattern language is minimal. In order to have access to the pattern language it is possible to generate the documentation from the executable specification of this pattern language. This is depicted in the arrow to Docs. For OS/VS COBOL with SQL this amounts to a 25 page document, with a table of contents, etc. We use existing technology to do this [12]. From the grammar we generate generic transformations and generic analyzers. Those parts form the basis for which the user, in this case a maintenance programmer, can plug in custom-made tools. In fact, as soon as the user described the requirements specification of a tool, it is immediately executable since the generic components take care of bookkeeping issues. This architecture enables easy, reusable, and robust development of tools. In particular, it ensures maximal reuse of tools in case of different dialects. This technology has been discussed in [8].
The dashed rectangle represents the part that the maintenance team is working with. On top of the so-called Tool-Basis, the tools can be implemented with the aid of documentation of the native pattern language. The Tool-Basis contains the infrastructure that can be shared in custom tools, and takes care of the fact that the grammar, the generic transformations, and the generic analyzers are known in the custom made tools. In this paper we treat a case study that delivers a number of typical tools.
The ASF+SDF Meta-Environment is an interactive programming environment generator that takes a language definition as input (including a definition of the syntax of this language and optionally other operations on programs in the language such as, for instance, interpretation, compilation or transformation) and generates corresponding tools as output. From the syntax definition of a language in SDF various components are generated: a lexical scanner [23], a Generalized LR parser [22], a syntax-directed editor [28], a pretty printer [12], and optionally traversal functions and program analysis functions [8]. For the operations defined on programs, efficient term-rewrite engines are generated. We mention that Generalized LR parsing is particularly helpful for reengineering and maintenance, see [11] for details.
We can combine tools to obtain larger tools with a more complex behaviour. Using this component based engineering [27], many tools can easily be reused. We glue these components together with a coordination language called SEAL [29]. SEAL stands for Semantics-directed Environment Adaptation Language; it not only takes care of the coordination but also of a graphical user interface [30]. It is possible to change the coordination at run time, and to add functionality at run time, which is helpful in rapid development of custom tools.
In Figure 2 we give an idea of the look-and-feel of the ASF+SDF Meta-Environment in action. This screen dump shows the implementation of the maintenance tasks that we obtained from the maintenance team. The upper window is the ASF+SDF Meta-Environment. You can add and delete specifications. Specifications can be syntax descriptions of languages, or tools. They are called modules, and they can be edited via the edit-module window. We can also open editors that understand the syntax of a certain module. For instance the COBOL-plus window is an editor that can parse OS/VS COBOL plus CICS plus SQL. The buttons at the left-side of the COBOL-plus window are implemented using SEAL. Part of the coordination script can be seen in the upper-right window: it contains the functionality of some EvalSQLa button. This editor understands the SEAL language. We pressed the Typecheck button, that checks for type errors in the SEAL script. The SEAL window below pops up and tells us whether there are errors. As an example, we also pressed the McCabe button of the COBOL-plus window. This results in the small window with the 13 in it. We will discuss the functionality of the maintenance tasks in the next section.
We carried out the case study on part of a stockbroking system in order to show the usefulness of the architecture that we propose. The development of this system can be characterized as evolutionary. The earliest parts are from before 1980. Today, in 1998, new functionality is still being added to it. The maintenance tasks we discuss are all in the vein of this evolutionary process. They are concerned with restructuring the error handling after an SQL statement has accessed a DB2 table. The values of the exit status of an SQL statement are modified in the new release of the embedded SQL that is used in this system. In the current situation, some of the return codes are hard-coded in the stockbroking system. This information will now be stored in a separate program that should be called. So one transformation that we carried out implements the change from the coded constants to the call. Over time, the structure of the error handling procedures themselves has been extended, and at this moment the decision structure is no longer optimal. The other transformation that we carried out turns those decision structures into structures that are more natural.
The remainder of this section is devoted to a more thorough description of the assembly line in order to illustrate the process of automated software maintenance. As we see it, any automated reengineering or maintenance process consists of a number of very small steps that--together--perform a complex task. Identifying those steps requires a factory approach towards the tasks. What exactly this factory approach is, is hard to say in general. One of the fundamental issues is, in our opinion, that the process consists of three major phases. We call them the preprocessing phase, the main phase and the postprocessing phase. It is our experience that whatever problem we had to solve, it always boiled down to these three phases. The phases themselves are combinations of small steps. It is also our experience that the small steps can be reused over and over again. For instance, a postprocessing step in a control flow normalization assembly line [10], has now been used as a preprocessing step in this maintenance process. The common situation is, however, that pre- and postprocessing steps can be reused for the reason they were constructed for in the beginning.
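The idea of combining small, reusable steps into phases can be sketched in a few lines of Python. This is a toy model only: in the factory the steps are ASF+SDF components coordinated by SEAL, whereas here they are hypothetical string functions (collapse_spaces, drop_double_not, add_period) invented purely for the illustration.

```python
from functools import reduce


def assembly_line(*steps):
    """Compose small source-to-source steps into one transformation."""
    return lambda program: reduce(lambda text, step: step(text), steps, program)


def collapse_spaces(text):
    """Toy preprocessing step: bring the layout into a uniform form."""
    return " ".join(text.split())


def drop_double_not(text):
    """Toy main step: remove a double negation."""
    return text.replace("NOT NOT ", "")


def add_period(text):
    """Toy postprocessing step: make sure the sentence ends with a period."""
    return text if text.endswith(".") else text + "."


run = assembly_line(collapse_spaces, drop_double_not, add_period)
print(run("IF  NOT NOT   SQL-OK"))
```

Because each step is an ordinary function from program text to program text, a step written as postprocessing for one assembly line can be passed as a preprocessing step to another one without modification.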
During development of the assembly line we use the ASF+SDF Meta-Environment in an interactive way. In Figure 2 this is expressed. Once the individual steps have been identified and implemented, we use the SEAL coordination architecture to combine all the small steps into one large usually complex task. A glimpse of this can also be seen in Figure 2. In order to modify the complete system, the implemented task can be carried out batch oriented [13]. Next, we discuss the steps that are present in this particular assembly line.
There are many ways to implement the above Boolean condition, so we normalize all of them to a particular format. Let us give an example, just to give the reader an idea. The tool converts NOT ( NOT ( SQLCODE = -818 OR (-904 OR -911 ) OR -922 ) ) to SQLCODE = -818 OR -904 OR -911 OR -922. We display the two equations that take care of the above normalizations. There are more equations, but they are concerned with other normalizations like NOT X >= 1 turning into X < 1, which are not relevant for the SQL issues that we discuss at the moment.
[5] Norm-cond_L-exp(NOT(NOT(L-exp1))) = L-exp1
[9] Norm-cond_L-exp((Id1)) = Id1
We called this tool Norm-cond; applied to logical expressions, abbreviated L-exp, it is called Norm-cond_L-exp. We mention that the underscore notation is generated from the grammar as well. Details on the generation process can be found in [8]. Equation [5] removes double negations; L-exp1 is a variable matching exactly one arbitrary logical expression. Equation [9] removes superfluous parentheses around identifiers that are logical expressions. We note that in COBOL we can use predicates that are logical expressions that can be given a name. The value of SQLCODE is such a predicate. It is not necessary to have a fixed ordering of the return codes. We take care of that in the next analysis function.
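The effect of these normalizations can be imitated with a small Python sketch on expression trees. It is not the ASF+SDF specification: the tuple encoding, the names normalize and render, and the rule that flattens a parenthesized OR inside an OR are assumptions made for this illustration (the paper only shows equations [5] and [9]).

```python
def _is(node, tag):
    return isinstance(node, tuple) and node[0] == tag


def _unparen(node):
    """Drop one pair of parentheses, if present."""
    return node[1] if _is(node, "paren") else node


def normalize(expr):
    """Rewrite an expression tree bottom-up."""
    if isinstance(expr, str):                  # identifier or literal
        return expr
    tag, args = expr[0], [normalize(a) for a in expr[1:]]
    if tag == "paren" and isinstance(args[0], str):
        return args[0]                         # in the spirit of [9]: (Id1) -> Id1
    if tag == "not" and _is(_unparen(args[0]), "not"):
        return _unparen(_unparen(args[0])[1])  # in the spirit of [5]
    if tag == "or":
        flat = []
        for arg in args:                       # assumed rule: a OR (b OR c)
            nested = _is(arg, "paren") and _is(arg[1], "or")
            flat.extend(arg[1][1:] if nested else [arg])
        return ("or", *flat)
    return (tag, *args)


def render(expr):
    """Turn a tree back into COBOL-like text."""
    if isinstance(expr, str):
        return expr
    tag, args = expr[0], [render(a) for a in expr[1:]]
    if tag == "paren":
        return f"( {args[0]} )"
    if tag == "not":
        return f"NOT {args[0]}"
    if tag == "eq":
        return f"{args[0]} = {args[1]}"
    return " OR ".join(args)                   # tag == "or"


# NOT ( NOT ( SQLCODE = -818 OR (-904 OR -911 ) OR -922 ) )
example = ("not", ("paren", ("not", ("paren",
           ("or", ("eq", "SQLCODE", "-818"),
                  ("paren", ("or", "-904", "-911")), "-922")))))
print(render(normalize(example)))
```

Under these assumptions the example from the text is rendered as SQLCODE = -818 OR -904 OR -911 OR -922.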
[1] crc(Lit1 Lit2 Lit3 Lit4) = true
===============================
ErrHandDB2_Sentence(+,0,
Statement1*
IF SQLCODE = Lit1 OR Lit2 OR Lit3 OR Lit4
Sentence1
) = 1
[2] crc(Lit1 Lit2 Lit3 Lit4) =
contains-818(Lit1 Lit2 Lit3 Lit4)
& contains-904(Lit1 Lit2 Lit3 Lit4)
& contains-911(Lit1 Lit2 Lit3 Lit4)
& contains-922(Lit1 Lit2 Lit3 Lit4)
[3] contains-818(Lit1* -818 Lit2*) = true
Equation [1] is a conditional one. Above the line, the condition
is stated, and if it succeeds the equation below the line is executed.
ErrHandDB2 is an analysis tool. In fact, this means that the
output sort is fixed. In this case the output is an integer. We have
three arguments to an analysis tool. The first one is the operation that
should be used, the second is the default value, and the third is the
pattern. Since we wish to count the number of occurrences we use +
as operator. If we do not find the pattern we wish to return 0
as a default value. If we find the pattern the tool will return 1.
This means that for an entire program every occurrence in any context will
be counted and added to the default value. Now we discuss the pattern.
It looks for a COBOL sentence that starts with zero or more statements
(expressed with the variable Statement1*), then an arbitrary
IF phrase containing four arbitrary literals Lit1, Lit2,
Lit3, and Lit4 in the Boolean condition. So SQLCODE = -305 OR
-502 OR -803 OR -811 matches. In order to prevent that incorrect
SQLCODE return codes are counted, we have a condition on this equation,
which is above the double line. We check whether the return codes are
the four we are looking for. The fact that the variables Lit1
- Lit4 are the same above and below the double line means that
they are bound to the same value if the rule matches. So if in the code
-305 is matched by the first variable Lit1, the crc
tool will use -305. The crc tool only returns true if
all four literals are correct. This is expressed in equation [2],
where we use & as the Boolean connector. In equation [3]
we check whether one of the four equals -818. The variable
Lit1* matches zero or more literals, then the literal -818
and then again zero or more arbitrary literals matched by Lit2*.
The other return codes are checked analogously. It is simple to see that
crc returns true only if the four literals are the return
values that we are looking for and that their ordering does not matter.
We use the ErrHandDB2 and crc analysis tools as conditions
for the actual transformations. We discuss them next.
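The logic of crc and of the counting analysis can be paraphrased in Python as follows. This is a sketch under assumptions, not the generated analyzer: literals are modelled as strings, each IF condition is given as a list of its literals, and the function name err_hand_db2 and its parameters combine and default are hypothetical stand-ins for the operator and default-value arguments described above.

```python
import operator
from functools import reduce

REQUIRED_CODES = ("-818", "-904", "-911", "-922")


def crc(literals):
    """True only if the four literals are exactly the four return codes,
    in any order (compare equations [2] and [3])."""
    return len(literals) == 4 and all(code in literals for code in REQUIRED_CODES)


def err_hand_db2(conditions, combine=operator.add, default=0):
    """Combine a 1 for every qualifying condition, starting from the default."""
    hits = (1 for literals in conditions if crc(literals))
    return reduce(combine, hits, default)


conditions = [
    ["-818", "-904", "-922", "-911"],   # the codes we look for, reordered
    ["-305", "-502", "-803", "-811"],   # matches the pattern, rejected by crc
]
print(err_hand_db2(conditions))
```

With four literals, requiring that each of the four codes is present already excludes duplicates, which is why no explicit ordering or uniqueness test is needed.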
[1] ErrHandDB2(Program1) = 0
======================================
Add-EM948_Program(Program1) = Program1
[2] ErrHandDB2(Program1) != 0,
Program1 = COMMENT1* Ident-div1 Env-div1
Data-div1 Proc-div1
========================================
Add-EM948_Program(Program1) =
COMMENT1* Ident-div1 Env-div1
Add-EM948_Data-div(Data-div1) Proc-div1
[3] Add-EM948_Data-div( ) =
DATA DIVISION. Add-EM948_Ws-sec( )
[4] Add-EM948_Ws-sec(
WORKING-STORAGE SECTION. Data-desc1*) =
WORKING-STORAGE SECTION. Data-desc1*
01 L-EM948 PIC X(09) VALUE 'EM948 L'.
* FOR TESTING EM948 SQL-CODES
COPY A0046075.
[5] Add-EM948_Ws-sec( ) =
WORKING-STORAGE SECTION.
01 L-EM948 PIC X(09) VALUE 'EM948 L'.
* FOR TESTING EM948 SQL-CODES
COPY A0046075.
Equation [1] is a conditional one. Above the line, the condition
is stated, and if it succeeds the transformation below the line will
be performed. The condition checks whether there are zero occurrences
of the error handling code in a program. The expression Program1
is a variable from the native pattern language that represents a complete
OS/VS COBOL program that may contain embedded SQL. So it matches any
program of the system that we obtained from the bank. If there are
zero occurrences, the function Add-EM948 applied to a program,
denoted Add-EM948_Program, returns the original program unchanged.
Equation [2] is treating the case that there are occurrences of
the error handling somewhere in the program. The first equation in the
conditions contains !=, standing for not equal to. So it means
that the variable Program1 does contain occurrences. The second
condition unfolds the Program1 into optional comments
COMMENT1* and four divisions. Those divisions can be empty. If that is
the case, they match the empty word. We denote this by a space.
Add-EM948_Program states
that it is only relevant to act on the DATA DIVISION: the completely
unfolded program is reiterated. Only at the variable matching the
DATA DIVISION, called Data-div1, a modification should be made.
What that is, is explained in the next two equations. Equation [3]
states that if there is a DATA DIVISION it is only necessary to
act on the WORKING-STORAGE SECTION. If there is no
DATA DIVISION, equation [3] creates it: Add-EM948_Data-div( )
returns the words DATA DIVISION including a period, followed by
Add-EM948 working on the sort Ws-sec. Then we have the
same situation: either the WORKING-STORAGE SECTION exists or not.
Equation [4] treats the case that it exists. It deals with
Add-EM948 applied to the WORKING-STORAGE SECTION. This is denoted
as Add-EM948_Ws-sec. The input pattern matches the terminal
WORKING-STORAGE SECTION including the separator period. The variable
Data-desc1* matches zero or more Data descriptors. The replacement
pattern is the right-hand side of the equation. It reiterates the
WORKING-STORAGE SECTION and the possibly occurring Data descriptors.
Then the special variable is added, then some comment, and the
COPY member. Equation [5] is similar to [4] except that there
was no WORKING-STORAGE SECTION so it is also created. The rest is
just a copy from the other equation. Now that we have put the right
information in the WORKING-STORAGE SECTION, we can modify the PROCEDURE
DIVISION in the next tool.
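The case distinction made by these equations (no occurrences, missing DATA DIVISION, missing or existing WORKING-STORAGE SECTION) can be sketched in Python. The dictionary model of a program and the key names data_division and working_storage are assumptions made for this illustration; the three added lines are copied from the equations above.

```python
EM948_ITEMS = [
    "01 L-EM948 PIC X(09) VALUE 'EM948 L'.",
    "* FOR TESTING EM948 SQL-CODES",
    "COPY A0046075.",
]


def add_em948(program, occurrences):
    """Return the program with the EM948 items added when they are needed."""
    if occurrences == 0:                                    # compare equation [1]
        return program
    # Create the DATA DIVISION if it is absent (compare equation [3]).
    data_div = dict(program.get("data_division") or {})
    # Create the WORKING-STORAGE SECTION if it is absent (compare [5]),
    # otherwise keep its data descriptions (compare [4]).
    working_storage = list(data_div.get("working_storage") or [])
    data_div["working_storage"] = working_storage + EM948_ITEMS
    return {**program, "data_division": data_div}


program = {"procedure_division": ["1010-DETERMINE-MAX-SERIALNR."]}
changed = add_em948(program, occurrences=2)
print(changed["data_division"]["working_storage"])
```

The function returns a new dictionary and leaves its argument untouched, which mirrors the way an equation maps an input term to an output term.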
[1] crc(Lit1 Lit2 Lit3 Lit4) = true
===============================
Use-EM948_Sentence(
Statement1*
IF SQLCODE = Lit1 OR Lit2 OR Lit3 OR Lit4
Sentence1) =
Statement1*
MOVE SQLCODE TO SQL-CODE IN LINKAREA-EM948
CALL 'UT100' USING L-EM948
LINKAREA-EM948
IF RETURNCODE = '9'
Sentence1
[2] crc(Lit1 Lit2 Lit3 Lit4) = true
===============================
Use-EM948_Sentence(
Statement1*
IF SQLCODE = Lit1 OR Lit2 OR Lit3 OR Lit4
Cond-body1
ELSE
Sentence1) =
Statement1*
MOVE SQLCODE TO SQL-CODE IN LINKAREA-EM948
CALL 'UT100' USING L-EM948
LINKAREA-EM948
IF RETURNCODE = '9'
Cond-body1
ELSE
Sentence1
Both conditions reuse the auxiliary crc tool that checks whether we are dealing with the correct return codes. If those values are correct, the condition is satisfied and we can make the change. The changes that we have to make in the PROCEDURE DIVISION are very local: the largest structures that we modify are COBOL sentences. Therefore, our tool has only two equations on this level (of course all the other equations on other levels are generated, as explained in [8]). The tool Use-EM948 applied to a COBOL sentence is denoted Use-EM948_Sentence. The first sentence that we are interested in consists of zero or more arbitrary statements, matched by Statement1*, then the special phrase starting with IF SQLCODE = Lit1 OR Lit2 OR Lit3 OR Lit4, followed by exactly one Sentence1. Since OS/VS COBOL is a COBOL 74 dialect, the scope of the IF has to be ended with a separator period (there is no END-IF). We recall that a sentence is one or more statements ended with a period. We now explain the replacement pattern. We leave the context intact: the Statement1* and Sentence1 are reiterated. Before we change the original IF phrase, we insert some new code above it. First, the exit status variable is copied to an auxiliary data item SQL-CODE that is a subrecord of LINKAREA-EM948. Then a company specific CALL follows, which checks the exit status and modifies the data item RETURNCODE accordingly. Now we change the condition of the original IF phrase: it is changed into RETURNCODE = '9'. The second equation is a variation of the first: the Boolean condition can also be contained in an IF ELSE phrase. So the only difference is the Cond-body1 that represents anything that can occur between an IF and an ELSE. The rest of this case is the same as the first equation.
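As a rough paraphrase of the two equations, the following Python sketch rewrites one sentence that has already been split into its parts. The function name use_em948 and the way a sentence is passed in (lists of lines for the leading statements, the IF body, and an optional ELSE body) are assumptions for the illustration; the inserted COBOL lines are taken from the replacement patterns above.

```python
REQUIRED_CODES = ["-818", "-904", "-911", "-922"]


def use_em948(statements, literals, if_body, else_body=None):
    """Rewrite one sentence whose IF compares SQLCODE with the four codes.

    Returns the new lines, or None when the condition does not qualify,
    which plays the role of the crc condition on both equations.
    """
    if sorted(literals) != sorted(REQUIRED_CODES):
        return None
    lines = list(statements) + [
        "MOVE SQLCODE TO SQL-CODE IN LINKAREA-EM948",
        "CALL 'UT100' USING L-EM948 LINKAREA-EM948",
        "IF RETURNCODE = '9'",
    ]
    if else_body is None:                                   # compare equation [1]
        return lines + list(if_body)
    return lines + list(if_body) + ["ELSE"] + list(else_body)   # compare [2]


new_lines = use_em948(
    statements=[],
    literals=["-818", "-904", "-922", "-911"],
    if_body=["PERFORM 9999-EM904-FILL"],
    else_body=["MOVE 'XXXXXXXXMESSAGE' TO EM900-TABLENAME."],
)
print("\n".join(new_lines))
```

Only the condition and the lines in front of the IF change; the bodies are passed through untouched, just as Statement1*, Cond-body1, and Sentence1 are reiterated in the equations.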
[1] Eval-SQL-a_Sentence*(
Sentence1*
Statement1*
IF NOT SQL-OK
Sentence1
IF SQL-OK
Sentence2
Sentence2* ) =
Sentence1*
Statement1*
IF SQL-OK
rsp(Sentence2)
ELSE
Sentence1
Sentence2*
Here we turn two IF phrases into one. Two IF phrases cannot occur in one sentence, since the separator period is also an implicit scope terminator. So Eval-SQL-a works on zero or more sentences, which is denoted by Eval-SQL-a_Sentence*. The pattern consists of a list of arbitrary sentences containing the two IF phrases that we are looking for. This is expressed with the (context) variables Sentence1* and Sentence2*. In between we have the two sentences containing the special IF phrases. The first one starts with zero or more arbitrary statements (Statement1*), then the conditional IF NOT SQL-OK Sentence1 (the latter variable expresses that the body of the IF does not matter). Then follows another arbitrary conditional statement with fixed part IF SQL-OK. From the code examples we understood that the maintenance team wants this in an IF ELSE construct. The replacement pattern is conceptually simple. There is, however, a small complication with the scope termination of the IF. In the replacement pattern we use an auxiliary function rsp, which stands for remove separator period. The variable Sentence2 represents code that ends with a period. But since this period is also an implicit scope terminator, we have to remove it from Sentence2. Of course the separator period of Sentence1 ends the scope of the new IF ELSE.
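The role of rsp in this replacement can be illustrated with a string-based Python sketch. The function merge_not_ok_ok is hypothetical, and sentences are modelled as plain strings whose last character is the separator period. The sketch deliberately ignores one complication: if the SQL-OK body itself ends in an IF without ELSE, an extra ELSE NEXT SENTENCE is needed, as can be seen in PAR-1 of the transformed example program.

```python
def rsp(sentence):
    """Remove the separator period that ends a sentence."""
    stripped = sentence.rstrip()
    if not stripped.endswith("."):
        raise ValueError("a sentence must end with a separator period")
    return stripped[:-1].rstrip()


def merge_not_ok_ok(error_body, ok_body):
    """Merge 'IF NOT SQL-OK <error_body>' and 'IF SQL-OK <ok_body>'.

    Both bodies end with a period. The period of ok_body is removed because
    it would end the scope of the new IF too early; the period of error_body
    is kept and ends the scope of the new IF ELSE.
    """
    return f"IF SQL-OK {rsp(ok_body)} ELSE {error_body.strip()}"


merged = merge_not_ok_ok(
    "MOVE 'OPEN' TO EM900-FUNCTION PERFORM 9999-SQL-ERROR.",
    "PERFORM 2000-FETCH-TRANSACTION UNTIL STOP-FETCH.",
)
print(merged)
```

The printed sentence corresponds to PAR-2 of the example program, written on a single line.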
Next, we discuss the second transformation. Here, three consecutive IF phrases are turned into one nested IF phrase. In this transformation we make use of design knowledge: SQL-OK and SQL-NOT-FOUND cannot be true at the same time. This information can also be found in the file that defines the SQL auxiliary data items that are used in the code examples that we obtained from the maintenance team.
[2] Eval-SQL-a_Sentence*(
Sentence1*
Statement1*
IF NOT ( SQL-OK OR SQL-NOT-FOUND )
Sentence1
IF SQL-OK
Sentence2
IF SQL-NOT-FOUND
Sentence3
Sentence2* ) =
Sentence1*
Statement1*
IF SQL-OK
rsp(Sentence2)
ELSE
IF SQL-NOT-FOUND
rsp(Sentence3)
ELSE
Sentence1
Sentence2*
This second equation also works on lists of sentences, for the same reason as the first one. Again we have the context variables Sentence1* and Sentence2*. The sentences in between start again with zero or more arbitrary statements (matched by Statement1*). Then we have three consecutive phrases containing the control structure that has to be changed. In the replacement pattern we use the rsp tool that removes separator periods in two cases, for the same reason as in the previous equation. So it is another pattern, but the approach to deal with it is completely analogous to the first one.
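Why the design knowledge matters can be checked with a small truth table in Python. The two functions below model which sentence bodies are executed before and after the transformation; the model assumes that the bodies neither change SQL-OK or SQL-NOT-FOUND nor transfer control elsewhere, which is an assumption of this sketch and not something stated by the maintenance team.

```python
from itertools import product


def original(sql_ok, sql_not_found):
    """Bodies executed by the three consecutive IF phrases."""
    executed = []
    if not (sql_ok or sql_not_found):
        executed.append("Sentence1")
    if sql_ok:
        executed.append("Sentence2")
    if sql_not_found:
        executed.append("Sentence3")
    return executed


def transformed(sql_ok, sql_not_found):
    """Bodies executed by the single nested IF phrase."""
    if sql_ok:
        return ["Sentence2"]
    if sql_not_found:
        return ["Sentence3"]
    return ["Sentence1"]


def differing_cases():
    """All combinations of the two predicates where the behaviour differs."""
    return [case for case in product([False, True], repeat=2)
            if original(*case) != transformed(*case)]


print(differing_cases())
```

The only combination on which the two versions disagree is the one where both predicates are true, which is exactly the case excluded by the design knowledge.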
We discuss the third equation. Although this one is very simple, it interferes with the other patterns if not applied in the right order. In order to demonstrate this to the maintenance team of the system we used two buttons in the ASF+SDF Meta-Environment. We therefore gave this tool a name different from the one used above: Eval-SQL-b. Apart from updating the control structures, there was a requirement to start with positive conditions in all IF phrases. As we can see in the above cases, all replacement patterns satisfy this criterion already. So this next tool takes care of the remaining cases. This can be obtained by the following simple transformation:
[3] Eval-SQL-b_Sentence(
Statement1*
IF NOT SQL-OK
Sentence1 ) =
Statement1*
IF SQL-OK
NEXT SENTENCE
ELSE
Sentence1
This transformation turns the negative condition of the IF phrase into a positive one. Eval-SQL-b works on single COBOL sentences, which is denoted by Eval-SQL-b_Sentence. We see that the input pattern is exactly the same as part of the input pattern of the first transformation. Therefore, we apply Eval-SQL-b after Eval-SQL-a, since otherwise the input pattern of the first transformation would no longer match. The NEXT SENTENCE is an empty statement in COBOL comparable to the empty statements CONTINUE or EXIT.
We discuss the next equation, which is a variation of the third:
[4] Eval-SQL-b_Sentence(
Statement1*
IF NOT SQL-OK
Cond-body1
ELSE
Sentence1 ) =
Statement1*
IF SQL-OK
rsp(Sentence1)
ELSE
asp(Cond-body1)
Again it works on a COBOL sentence. The special condition NOT SQL-OK is now part of an IF ELSE phrase. Since we swap both branches, we have to remove the separator period from Sentence1 in the ELSE branch, using rsp, and we have to add a separator period to the body of the IF branch. We have a variable Cond-body1 that matches anything that can be part of an IF branch. We have a special sort for that due to the difficult scope termination rules of COBOL; see [9] for more details on these issues. We use an auxiliary tool asp, add separator period, in order to end the scope of the swapped IF ELSE. We note that sloppiness towards separator periods in COBOL 74 dialects can be disastrous; see [41] for details on a Y2K tool that is erroneous due to sloppiness with separator periods.
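Equations [3] and [4] of Eval-SQL-b can be paraphrased on strings as in the Python sketch below. It is a toy model: the function eval_sql_b is hypothetical, leading statements (Statement1*) are left out, and the branches are assumed to contain no nested IF, so that the first ELSE found belongs to the IF NOT SQL-OK. The real tool works on parsed sentences and does not need these restrictions.

```python
NEGATIVE_IF = "IF NOT SQL-OK "


def rsp(text):
    """Remove the trailing separator period, if there is one."""
    text = text.rstrip()
    return text[:-1].rstrip() if text.endswith(".") else text


def asp(text):
    """Add a separator period."""
    return text.rstrip() + "."


def eval_sql_b(sentence):
    """Give a sentence starting with IF NOT SQL-OK a positive condition."""
    if not sentence.startswith(NEGATIVE_IF):
        return sentence                                    # not our pattern
    body = sentence[len(NEGATIVE_IF):]
    if " ELSE " in body:                                   # compare equation [4]
        cond_body, else_part = body.split(" ELSE ", 1)
        return f"IF SQL-OK {rsp(else_part)} ELSE {asp(cond_body)}"
    return f"IF SQL-OK NEXT SENTENCE ELSE {body}"          # compare equation [3]


print(eval_sql_b("IF NOT SQL-OK MOVE 'XXXXXXXXMESSAGE' TO EM900-TABLENAME."))
print(eval_sql_b("IF NOT SQL-OK PERFORM A ELSE PERFORM B."))
```

In the second call the period moves with the scope: it is removed from the former ELSE branch and added to the former IF branch, so that the swapped sentence still ends with exactly one separator period.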
Now that we have constructed the entire assembly line that carries out the various tasks that were communicated to us, we treat an example program and its output so that the reader gets an idea of the effect of the many small changes together. We modified a program of the system for this purpose. In order to fit the example in this paper, we removed parts of it that are not relevant for explanatory purposes, and we made certain issues anonymous. Here is the modified input program:
IDENTIFICATION DIVISION.
PROGRAM-ID. XXXXX.
AUTHOR. N.N.
DATE-WRITTEN. APRIL 22, 1998.
PROCEDURE DIVISION.
1010-DETERMINE-MAX-SERIALNR.
MOVE SQLCODE TO W-SQL-CODE.
IF SQLCODE = -818 OR (-904) OR -922 OR -911
PERFORM 9999-EM904-FILL
ELSE
IF IND-MAX-SERIALNR = -1
MOVE 1 TO HH-MAX-SERIALNR-MESSAGE
ELSE
IF SQL-OK
ADD 1 TO HH-MAX-SERIALNR-MESSAGE
ELSE
MOVE 'XXXXXXXXMESSAGE' TO EM900-TABLENAME.
1015-ADD-TCTMESSAGE-CT590.
EXEC SQL
INSERT INTO XXXXXXXXMESSAGE
VALUE (:DCLXXXXXXXXMESSAGE.TRANSCODE-ORIGIN
:DCLXXXXXXXXMESSAGE.EVENTTYPE
:DCLXXXXXXXXMESSAGE.SERIALNR-MESSAGE
:DCLXXXXXXXXMESSAGE.BUF-MESSAGE)
END-EXEC.
MOVE SQLCODE TO W-SQL-CODE.
IF NOT (NOT (SQLCODE = -818 OR (-904 OR -911) OR -922))
PERFORM 9999-EM904-FILL
ELSE
IF NOT SQL-OK
MOVE 'XXXXXXXXMESSAGE' TO EM900-TABLENAME.
PAR-1.
IF NOT (SQL-OK OR (SQL-NOT-FOUND))
MOVE 'FETCH' TO H-SQL-CODE
PERFORM 9999-SQL-ERROR.
IF SQL-OK
IF WS-EMP-IND-01 = C-SQL-NULL
MOVE SPACES TO DATE-INC.
IF SQL-NOT-FOUND
MOVE 'Y' TO SW-EOF-ORDERS.
PAR-2.
IF NOT (SQL-OK)
MOVE 'OPEN' TO EM900-FUNCTION
PERFORM 9999-SQL-ERROR.
IF SQL-OK
PERFORM 2000-FETCH-TRANSACTION UNTIL STOP-FETCH.
In our assembly line, we also have a button that carries out the complete maintenance task. It is the ApplyAll button in Figure 2. If we process the above program with our assembly line by pressing the ApplyAll button, we obtain the completely transformed program below.
IDENTIFICATION DIVISION.
PROGRAM-ID. XXXXX.
AUTHOR. N.N.
DATE-WRITTEN. APRIL 22, 1998.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 L-EM948 PIC X(09) VALUE 'EM948 L'.
* FOR TESTING EM948 SQL-CODES
COPY A0046075.
PROCEDURE DIVISION.
1010-DETERMINE-MAX-SERIALNR.
    MOVE SQLCODE TO W-SQL-CODE.
    MOVE SQLCODE TO SQL-CODE IN LINKAREA-EM948
    CALL 'UT100' USING L-EM948 LINKAREA-EM948
    IF RETURNCODE = '9'
       PERFORM 9999-EM904-FILL
    ELSE
       IF IND-MAX-SERIALNR = -1
          MOVE 1 TO HH-MAX-SERIALNR-MESSAGE
       ELSE
          IF SQL-OK
             ADD 1 TO HH-MAX-SERIALNR-MESSAGE
          ELSE
             MOVE 'XXXXXXXXMESSAGE' TO EM900-TABLENAME.
1015-ADD-TCTMESSAGE-CT590.
    EXEC SQL
       INSERT INTO XXXXXXXXMESSAGE
       VALUE (:DCLXXXXXXXXMESSAGE.TRANSCODE-ORIGIN
              :DCLXXXXXXXXMESSAGE.EVENTTYPE
              :DCLXXXXXXXXMESSAGE.SERIALNR-MESSAGE
              :DCLXXXXXXXXMESSAGE.BUF-MESSAGE)
    END-EXEC.
    MOVE SQLCODE TO W-SQL-CODE.
    MOVE SQLCODE TO SQL-CODE IN LINKAREA-EM948
    CALL 'UT100' USING L-EM948 LINKAREA-EM948
    IF RETURNCODE = '9'
       PERFORM 9999-EM904-FILL
    ELSE
       IF SQL-OK
          NEXT SENTENCE
       ELSE
          MOVE 'XXXXXXXXMESSAGE' TO EM900-TABLENAME.
PAR-1.
    IF SQL-OK
       IF WS-EMP-IND-01 = C-SQL-NULL
          MOVE SPACES TO DATE-INC
       ELSE
          NEXT SENTENCE
    ELSE
       IF SQL-NOT-FOUND
          MOVE 'Y' TO SW-EOF-ORDERS
       ELSE
          MOVE 'FETCH' TO H-SQL-CODE
          PERFORM 9999-SQL-ERROR.
PAR-2.
    IF SQL-OK
       PERFORM 2000-FETCH-TRANSACTION UNTIL STOP-FETCH
    ELSE
       MOVE 'OPEN' TO EM900-FUNCTION
       PERFORM 9999-SQL-ERROR.
As we can see, a DATA DIVISION with a WORKING-STORAGE SECTION is created, and the necessary data item and COPY statement are added. The IF SQLCODE parts are recognized regardless of extra parentheses, extra negations, and the ordering of the SQL return codes. The code is changed to the new CALL, and the IF tests the RETURNCODE instead. The control structure redesign is likewise insensitive to extra parentheses, etc. In the replacement program we can see the effect of equations [1] and [2] of Eval-SQL-a. So all the changes are performed in the right order.
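To make the insensitivity to parentheses, negations and ordering concrete, the following Python sketch normalizes a condition before comparing it. It is an illustration under stated assumptions, not the ASF+SDF specification used in the assembly line: the grammar is a tiny hypothetical subset of COBOL conditions, the set TARGET is read off the example program, and a condition is accepted only when its negations cancel out.

```python
import re

# Return codes taken from the example program (an assumption of this sketch).
TARGET = frozenset({-818, -904, -911, -922})


def tokenize(text):
    """Split into integers, words, parentheses and '='; other characters are ignored."""
    return re.findall(r"-?\d+|[A-Za-z][A-Za-z0-9-]*|[()=]", text)


def expect(toks, i, tok):
    if i >= len(toks) or toks[i] != tok:
        raise ValueError(f"expected {tok!r} at token {i}")


def parse_cond(toks, i=0):
    """cond := NOT cond | ( cond ) | SQLCODE = codes.  Returns (positive, codes, next)."""
    if toks[i] == "NOT":
        positive, codes, i = parse_cond(toks, i + 1)
        return (not positive), codes, i
    if toks[i] == "(":
        positive, codes, i = parse_cond(toks, i + 1)
        expect(toks, i, ")")
        return positive, codes, i + 1
    expect(toks, i, "SQLCODE")
    expect(toks, i + 1, "=")
    codes, i = parse_codes(toks, i + 2)
    return True, codes, i


def parse_codes(toks, i):
    """codes := code (OR code)*, collected into a set so that order is irrelevant."""
    codes, i = parse_code(toks, i)
    while i < len(toks) and toks[i] == "OR":
        more, i = parse_code(toks, i + 1)
        codes |= more
    return codes, i


def parse_code(toks, i):
    """code := integer | ( codes )."""
    if toks[i] == "(":
        codes, i = parse_codes(toks, i + 1)
        expect(toks, i, ")")
        return codes, i + 1
    return {int(toks[i])}, i + 1


def is_sqlcode_test(text):
    """True if text tests SQLCODE against exactly the TARGET codes, un-negated."""
    toks = tokenize(text)
    try:
        positive, codes, i = parse_cond(toks)
    except (IndexError, ValueError):
        return False
    return i == len(toks) and positive and codes == TARGET
```

Both conditions of the input program, `SQLCODE = -818 OR (-904) OR -922 OR -911` and `NOT (NOT (SQLCODE = -818 OR (-904 OR -911) OR -922))`, are accepted by `is_sqlcode_test`, whereas a singly negated test or one that lists only some of the codes is rejected. Collecting the codes in a set is what removes the dependence on ordering, and flipping a Boolean at each NOT is what cancels double negations.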
Finally, to carry out such tasks on complete systems, it is not desirable to work interactively, not even using the ApplyAll facility. We see the interactive part of the ASF+SDF Meta-Environment as the development environment of the assembly line. Once the assembly line is designed and implemented, an entire system should be transformed. We use the batch version of the ASF+SDF Meta-Environment to do this. An elaborate discussion of how batch-oriented system-wide maintenance and renovation tasks are carried out using the ASF+SDF Meta-Environment can be found in [13].
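The difference between the interactive and the batch mode can be pictured with a small Python sketch. It does not show the batch version of the ASF+SDF Meta-Environment; the functions apply_all and transform_system, the representation of a component as a function from source text to source text, and the file pattern "*.cbl" are all assumptions made for illustration.

```python
from pathlib import Path


def apply_all(source, components):
    """Run every component of the assembly line over one program, in order."""
    for component in components:
        source = component(source)
    return source


def transform_system(root, components, pattern="*.cbl"):
    """Apply the assembly line to every matching file below root.

    Files are rewritten in place only when the result differs from the
    original; the list of changed files is returned for inspection.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        original = path.read_text()
        result = apply_all(original, components)
        if result != original:
            path.write_text(result)
            changed.append(path)
    return changed
```

The order of the list passed as `components` matters, since each component sees the output of its predecessor. Returning the changed files gives the maintenance team a natural place to review what a system-wide run has touched.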
It is hard to come up with an objective measurement for maintainability. In fact, what appeals to one programmer may feel cumbersome to another. This is perhaps best illustrated by the religious wars between programmers about the use of GO TOs, programming language, indentation style, choice of text editor, commenting style, variable naming conventions, etc. [35]. So one could argue that there is no such measure at all. Still, many companies wish to improve maintainability; after all, a major part of the total cost of a system is due to maintenance, as is confirmed in many studies [32,38]. A recent summary of these findings is given in [36].
It is also known that programmers who have worked on a system for some time, typically maintenance programmers, consider the code to be their own [47]. Because of this, special care should be taken with code inspections in general [20], and with temporarily outsourcing maintenance and renovation. When management decides to (temporarily) outsource maintenance or renovation of a system, there is a serious danger that the maintenance team will reject the code that is returned to them. After all, their code has been taken away, which to them is a sign that they are not doing their work properly. Then someone else fiddles around with it, and when it comes back it is broken. Harry Sneed mentioned during his keynote for the fourth Working Conference on Reverse Engineering that he experienced this phenomenon in an off-shore outsourced Year 2000 conversion. On the other hand, outsourcing can be quite effective: for instance, a recent study [25] reports that outsourcing averages about twice the productivity of in-house development in New England banking applications. In [15], intensive communication appeared to be a key to success for off-shore outsourced reengineering projects. Note that our approach also included intensive communication.
The approach that we propose tries to avoid the above-mentioned risks. First of all, the maintenance team provided us with individual changes that we carry out system-wide. Second, it is planned that the maintenance teams of our partners are going to make those changes themselves, thus avoiding some of the above-mentioned problems and risks. This paper is the first step in this process: we do a study in order to show how such projects are carried out in general. After our presentation of the assembly line we obtained some new transformations, which we carried out. A next project that has been carried out for this bank is the automation of a temporary maintenance task where COBOL 85 is transformed back to COBOL 74 (see [13] for details). Ideally, in-house teams can make massive changes in a cost-effective way, using tools that make the task a challenge instead of a tedious, error-prone, time-consuming job. We have done our very best to gear the tools as much as possible towards the normal situation of the programmers working at this bank, for instance by using native patterns [44] and a grammar that is geared towards their dialect [9]. Moreover, the names of the sorts in the grammar are chosen as much as possible from their manuals. At the moment we are constructing a similar maintenance/renovation architecture for Ericsson for a number of their proprietary languages [45].
In this paper we proposed an architecture to carry out software maintenance in an automated way. We applied our approach to a real-world COBOL/SQL stockbroking system and a number of real-world maintenance tasks to show its effectiveness. We were able to implement the tasks in a few hours, after which we could make the changes to the entire system in a completely automated fashion. We believe that this approach towards automated maintenance is promising in the sense that it is easy to apply, fast, and less error-prone than traditional hand-crafted maintenance.