Software Re-engineering

Modern software engineering is largely concerned with evolutionary systems. Re-engineering offers a way to obtain evolvable systems instead of legacy systems. It applies engineering principles to an existing system so that the system meets new requirements. This article discusses software re-engineering in its different forms: reverse engineering, data restructuring, program restructuring and source code translation. The purpose of re-engineering is to increase efficiency and decrease overhead.

Since the middle of the 20th century, the software industry has grown at a tremendous rate. Nowadays software is used in almost every organizational activity. These systems must be maintained and evolved to meet new requirements and to keep up with new hardware. For some systems it has been estimated that eighty percent of total expenditure is consumed by evolution and maintenance alone. There is a huge backlog of maintenance requests, so some organizations avoid improving their systems.

Old systems that must still be maintained are called “legacy systems”. The amount of code in legacy systems is immense. Most of these systems were developed before standard software engineering techniques came into use. Their structure and documentation may be out of date or non-existent, and there may be no one left who understands them. The risk in rewriting these systems is very high. Organizations do not want their legacy systems to become obsolete, yet they may be able to afford only their evolution.

Software re-engineering is concerned with taking these legacy systems and re-implementing them to make them maintainable. The system may be re-documented. It may be translated to a modern language, implemented on a distributed system rather than on a mainframe, or implemented for a different database management system. Here we define software re-engineering as follows: “Re-engineering is the systematic transformation of an existing system into a new form to realize quality improvements in operation, system capability, functionality, performance or evolvability at a lower cost, schedule or risk to the customer.” The technical difference between re-engineering and new software development is the starting point: re-engineering starts from an existing system rather than from a specification.

Re-engineering should be considered when an organization depends on a system and the system is regularly maintained. It improves the system structure, creates new system documentation and makes the system easier to understand. The cost of re-engineering depends on the extent of the work. The main factors are given below:

(1) Quality of Previous System: The lower the quality of the software and its documentation, the higher the re-engineering cost.

(2) Data Conversion: If a large amount of data has to be converted, the cost rises significantly.

(3) Staff: If the staff responsible for maintaining the system cannot be involved in the re-engineering process, the cost increases.

Re-engineering takes the following four forms: source code translation, program restructuring, data re-engineering and reverse engineering.

Source code translation is the simplest form of re-engineering. Source code in one language is converted into source code in another language. The new language may be totally different from the old one, for example when Pascal code is converted into Java code, or it may be a more advanced relative of the old language, for example C++ instead of C. Translation may be necessary for the following reasons:

(1) Hardware Change: The organization using the system may change its hardware. Sometimes the older system is not compatible with the new hardware.

(2) Difficult to Understand and Use: When a language becomes out of date, there may be no staff who understand it, so the system cannot be maintained properly and has to be translated.

(3) Organizational Policy Changes: An organization may wish to standardize all the systems it uses to minimize cost, so systems that do not fit the standard have to be translated.

Figure B illustrates the process of source code translation. There may be no need to understand the operation of the software in detail or to modify the system architecture. The work can focus on programming language considerations, such as the equivalence of program control constructs.

Source code translation is only economical if an automated translator is available to do most of the translation. The translator may be a special program written to convert one language to another. In many cases complete automatic translation is impossible, and the remaining constructs have to be translated manually.
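The idea of a translator that handles most constructs and leaves the rest for manual work can be sketched as follows. This is only a toy illustration, not a real translation tool: the rule table, the Pascal-like input subset and the Java-like output are all assumptions made for the example, and a real translator would work on a parse tree rather than on regular expressions. Operator differences (for example `=` versus `==`) are not handled.

```python
import re

# Hypothetical rewrite rules for a tiny Pascal-like subset (illustration only).
RULES = [
    (re.compile(r"^(\w+)\s*:=\s*(.+);$"), r"\1 = \2;"),                    # assignment
    (re.compile(r"^if\s+(.+)\s+then$", re.IGNORECASE), r"if (\1) {"),      # conditional header
    (re.compile(r"^while\s+(.+)\s+do$", re.IGNORECASE), r"while (\1) {"),  # loop header
    (re.compile(r"^begin$", re.IGNORECASE), ""),                           # absorbed by '{'
    (re.compile(r"^end;?$", re.IGNORECASE), "}"),                          # block closer
]


def translate_line(line):
    """Return (translated_text, True), or (original_text, False) if no rule applies."""
    stripped = line.strip()
    for pattern, replacement in RULES:
        if pattern.match(stripped):
            return pattern.sub(replacement, stripped), True
    return stripped, False


def translate(source):
    """Translate line by line and collect the lines that need manual work."""
    output, manual = [], []
    for line in source.splitlines():
        if not line.strip():
            continue
        text, ok = translate_line(line)
        if ok:
            if text:  # skip lines that translate to nothing, such as 'begin'
                output.append(text)
        else:
            output.append("// TODO translate manually: " + text)
            manual.append(text)
    return output, manual
```

Calling `translate` on a short loop that ends with a `writeln` call translates the control constructs and flags the library call for manual attention, which mirrors the point that automation covers most, but not all, of the work.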

Program restructuring is the transformation of one representation form into another. Restructuring is also one of the techniques used in reshaping data models, design plans and requirements structures. Figure A shows how complex control logic makes a simple program difficult to understand; it is the algorithm of an air conditioner controller. Figure B shows the same algorithm in structured form, which can be read sequentially from top to bottom.
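The contrast between tangled and structured control logic can be sketched in code. The controller below is a hypothetical stand-in, not the algorithm from the figures: the inputs (`temp`, `setpoint`, `switch_on`) and the rule "run the compressor only when switched on and too warm" are assumptions. Since Python has no goto, the unstructured version emulates jumps with a label variable.

```python
def controller_unstructured(temp, setpoint, switch_on):
    """Goto-style logic emulated with a label variable (hypothetical example)."""
    label = "start"
    compressor = None
    while label != "exit":
        if label == "start":
            label = "check_temp" if switch_on else "turn_off"
        elif label == "check_temp":
            label = "turn_on" if temp > setpoint else "turn_off"
        elif label == "turn_on":
            compressor = "on"
            label = "exit"
        elif label == "turn_off":
            compressor = "off"
            label = "exit"
    return compressor


def controller_structured(temp, setpoint, switch_on):
    """The same behavior, readable from top to bottom."""
    if switch_on and temp > setpoint:
        return "on"
    return "off"
```

Both functions return the same result for every input; only the structured one can be read sequentially.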

Program restructuring makes a program more readable and easier to understand. However, the program may still lack modularity. Program modularization is usually carried out manually by inspecting and editing the code, although experimental systems have been produced to provide some computer-aided assistance for modularization. Program restructuring has some drawbacks:

(1) Loss of Comments: Comments are not carried over into the restructured program, so they are lost.

(2) Loss of Documentation: Like the comments, the documentation is also lost.

(3) Computational Demand: Restructuring relies on complex algorithms, so computational demand increases.

Drawbacks 1 and 2 are not major factors, because the old comments and documentation are usually out of date.

Data re-engineering is the process of analyzing and reorganizing the data structures in a system to make them more understandable. A system may consist of several different programs that use different file formats, and these may all have to be modified as part of the data restructuring process.

The objective of data re-engineering is often to convert a chaotic data management situation into a managed data environment. Data problems also arise because programs are now required to process much more data than their developers originally envisaged. For example, a funds management system of a finance company was originally designed to handle up to 99 funds. The company was managing more than 200 funds and had to run 32 separate copies of the system. This was becoming increasingly expensive in terms of both human and computer resources, so the company decided to re-engineer the system and its associated data. Some of the data problems that can arise in a legacy system made up of several cooperating programs are:

1) Data Naming Problems: Names may be cryptic and difficult to understand. Different names may be given to the same logical entity in different programs in the system. The same name may be used in different programs to mean different things.

2) Field Length Problem: This is a problem when field lengths in records are explicitly assigned in the program. The same item may be assigned different lengths in different programs. To work around this problem, other fields may be reused in some cases so that usage of a named data field across the programs in a system is not inconsistent.

3) Record Organization Problem: Records representing the same entity may be organized differently in different programs. This is a problem in languages like COBOL, where the physical organization of records is set by the programmer and is reflected in files. It is not a problem in languages like C++, where the physical organization of a record is the compiler’s responsibility.

4) Hard-coded Literals: Literal values such as tax rates are included directly in the programs rather than referenced through symbolic names.

5) No Data Dictionary: There may be no data dictionary defining the names used, their representation and use.

Detailed analysis of the programs that use the data is essential before data re-engineering. This analysis should aim at discovering the function of each identifier in the program and finding the literal values that should be replaced with named constants. The following figure explains the process of data re-engineering, assuming that the data definitions are modified, literal values are named, data formats are reorganized and the data values are converted.
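Part of this analysis, finding literal values that are candidates for named constants, can be automated. The following is a minimal sketch under a strong assumption: the program being analyzed is written in Python, so the standard `ast` module can parse it. A legacy system would need a parser for its own language. The function name `find_numeric_literals` and the sample program are hypothetical.

```python
import ast
from collections import Counter


def find_numeric_literals(source, ignore=(0, 1)):
    """Count numeric literals in Python source, skipping trivial values."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Constant):
            continue
        value = node.value
        # bool is a subclass of int, so exclude True/False explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if value not in ignore:
            counts[value] += 1
    return counts


# Hypothetical program with the same magic numbers repeated in several places.
SAMPLE = """
def net_pay(gross):
    tax = gross * 0.17
    if gross > 5000:
        tax = tax + (gross - 5000) * 0.17
    return gross - tax
"""

literal_counts = find_numeric_literals(SAMPLE)
```

A literal that occurs more than once, such as the rate and the threshold in the sample, is a likely candidate for a named constant.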

In stage 1 of this process, the data definitions in the program are modified to improve understandability. The data itself is not affected by these modifications. The data re-engineering process may stop at this stage if the intention is simply to complete some program restructuring process. If there are data value problems as discussed above, stage 2 of the process may be entered.

If an organization decides to continue to stage 2 of the process, it is then committed to stage 3, data conversion. This is usually a very expensive process. A program has to be written with embedded knowledge of the old and the new data organization. It processes the old data and outputs the converted information.
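A conversion program of this kind can be sketched as below. The record layouts are assumptions made up for the example, loosely inspired by the funds system mentioned earlier: the old format has a two-digit fund identifier, the new one widens it to four digits. The field names, widths and the `convert` function are all hypothetical.

```python
# Hypothetical fixed-width layouts: (field name, width, kind).
OLD_LAYOUT = (("fund_id", 2, "num"), ("name", 10, "text"), ("balance", 8, "num"))
NEW_LAYOUT = (("fund_id", 4, "num"), ("name", 10, "text"), ("balance", 8, "num"))


def parse_record(line, layout):
    """Split a fixed-width line into a dict of stripped field values."""
    fields, pos = {}, 0
    for name, width, _kind in layout:
        fields[name] = line[pos:pos + width].strip()
        pos += width
    return fields


def format_record(fields, layout):
    """Write fields back out, zero-padding numbers and space-padding text."""
    parts = []
    for name, width, kind in layout:
        value = fields[name]
        if len(value) > width:
            raise ValueError(f"{name!r} value {value!r} exceeds width {width}")
        parts.append(value.zfill(width) if kind == "num" else value.ljust(width))
    return "".join(parts)


def convert(old_lines):
    """Read records in the old organization and emit them in the new one."""
    return [format_record(parse_record(line, OLD_LAYOUT), NEW_LAYOUT)
            for line in old_lines]
```

The knowledge of both organizations lives in the two layout tables, so the conversion logic itself stays generic.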

Reverse engineering is the process of analyzing software with the objective of recovering its design and specification. The software source code will usually be available as the input to the reverse engineering process. Sometimes the source code is not available, and the executable code is then the input to the process.

Reverse engineering is an emerging area of interest within software engineering. Software engineering itself is concerned with improving the productivity of the software development process and the quality of the resulting systems. As currently practiced, the majority of software development effort is spent on maintaining existing systems rather than developing new ones. The greatest part of the software maintenance process is devoted to understanding the system. This involves reading the documentation, scanning the source code and understanding the changes to be made. The implication is that if we want to improve maintenance, we should facilitate the process of comprehending the existing system. Reverse engineering provides a direct attack on the program comprehension problem.

The process of reverse engineering can be defined as “the process of analyzing a subject system to identify the system’s components and their interrelationships and create representations of the system in another form or at a higher level of abstraction.” The purpose of reverse engineering is to understand a software system in order to facilitate enhancement, correction, documentation, redesign or reprogramming in a different language.

The following figure shows the process of reverse engineering. The process starts with an analysis phase. During this phase, the system is analyzed using automated tools to discover its structure. Engineers then work with the system source code and its structural model, adding information that they have gathered by understanding the system. All of this information is maintained in an information store, usually in the form of a directed graph.

Tools may be available to browse the information store and compare the graph structure with the code. They may also be used to add further information that has been inferred about the design. Documents of various types may be generated from this information. These might include program and data structure diagrams and traceability matrices. Traceability matrices show where entities in the system are defined.
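A very small version of such an information store can be sketched as a directed graph of function calls plus an index of where each function is defined. This is an illustration under assumptions: the subject system is taken to be Python source so that the standard `ast` module can analyze it, only top-level functions and direct calls by name are considered, and the function `build_information_store` is hypothetical.

```python
import ast


def build_information_store(source):
    """Return (call_graph, defined_at) for the top-level functions in source.

    call_graph maps each function name to the set of other top-level
    functions it calls; defined_at maps each name to its line number.
    """
    tree = ast.parse(source)
    functions = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
    defined_at = {f.name: f.lineno for f in functions}

    call_graph = {}
    for func in functions:
        called = set()
        for node in ast.walk(func):
            # Only direct calls by simple name, e.g. total(...), are recorded.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                called.add(node.func.id)
        # Keep only edges that point at components of the system itself.
        call_graph[func.name] = called & set(defined_at)
    return call_graph, defined_at


# Hypothetical subject system.
SUBJECT = (
    "def load():\n"
    "    return [1, 2]\n"
    "\n"
    "def total(xs):\n"
    "    return sum(xs)\n"
    "\n"
    "def report():\n"
    "    print(total(load()))\n"
)

graph, index = build_information_store(SUBJECT)
```

The `index` dictionary plays the role of a simple traceability table, and `graph` holds the edges from which a structure diagram could be drawn.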

Reverse engineering generally involves extracting design information and building abstractions that are less implementation dependent. Although reverse engineering often involves an existing functional system as its subject, this is not a requirement. You can perform reverse engineering starting from any level of abstraction or at any stage of the life cycle. Reverse engineering in and of itself does not involve changing the subject system or creating a new system based on the reverse-engineered subject system. It is a process of examination, not a process of change.

Reverse engineering is a difficult process because it connects different domains. The following differences are of particular importance:

1. The difference between a problem from some application domain and a solution in some programming language.

2. The gap between the concrete world of physical machines and computer programs and the abstract world of high-level descriptions.

3. The difference between the desired coherent, highly structured description of the system and the actual system, whose structure may have disintegrated over time.

4. The difference between the hierarchical world of programs and the associative nature of human cognition.

5. The difference between the bottom-up analysis of the source code and the top-down synthesis of the application.

There are many sub-areas of reverse engineering. Two important ones are re-documentation and design recovery.

Re-documentation is the creation of a semantically equivalent representation within the same relative abstraction level. The resulting forms of representation are usually considered alternate views intended for a human audience.

Re-documentation is the simplest and oldest form of reverse engineering, and many consider it to be an unintrusive, weak form of restructuring. Some common tools used to perform re-documentation are pretty printers (which display a code listing in an improved form) and diagram generators (which create diagrams directly from code). A key goal of these tools is to provide easier ways to visualize relationships among program components so that flow paths can be recognized clearly.
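Both kinds of tool can be illustrated in a few lines. The sketch below assumes the code being re-documented is Python and relies on `ast.unparse`, which requires Python 3.9 or later; the function names `pretty_print` and `outline` and the sample class are hypothetical. The outline is only a crude textual stand-in for a generated diagram.

```python
import ast


def pretty_print(source):
    """Re-emit source in a normalized layout (needs Python 3.9+ for ast.unparse)."""
    return ast.unparse(ast.parse(source))


def outline(source):
    """List classes and functions as an indented outline of the program."""
    lines = []

    def visit(body, depth):
        for node in body:
            if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
                kind = "class" if isinstance(node, ast.ClassDef) else "def"
                lines.append("  " * depth + kind + " " + node.name)
                visit(node.body, depth + 1)

    visit(ast.parse(source).body, 0)
    return lines


# Hypothetical, densely written input.
CRAMPED = (
    "class Account:\n"
    "  def deposit(self,n):self.balance+=n\n"
    "  def report(self):print(self.balance)\n"
)

listing = pretty_print(CRAMPED)
structure = outline(CRAMPED)
```

The pretty-printed listing is semantically equivalent to the input, which is the defining property of re-documentation. Note that `ast.unparse` discards comments, because they are not part of the syntax tree.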

Design recovery is a subset of reverse engineering in which domain knowledge, external information and deduction are added to observation of the subject system to identify meaningful higher level abstractions beyond those obtained directly by examining the system itself.

Design recovery is distinguished by the sources and span of information it should handle. It recreates design abstractions from a combination of code, existing design documentation, personal experience and general knowledge about the problem and application domains. It must produce all of the information required for a person to fully understand what a program does, how it does it, and so forth. Thus it deals with a far wider range of information than is found in conventional software engineering representations.

The primary purpose of reverse engineering is to increase the comprehensibility of the system for both maintenance and new development. There are six key objectives that will guide its direction as the technology matures:

1. Cope with complexity. We must develop methods to better deal with the sheer volume and complexity of systems. A key to controlling these attributes is automated support. Reverse engineering methods and tools, combined with CASE environments, will provide a way to extract relevant information so that decision makers can control the process and the product in systems evolution.

2. Generate alternate views. Graphical representations have long been accepted as comprehension aids. Reverse engineering tools facilitate the generation of graphical representations from other forms. Reverse engineering can also generate additional views from other perspectives (such as control flow diagrams and E/R diagrams) to aid the review and verification process.

3. Recover lost information. The continuing evolution of large systems leads to lost information about the system design. Modifications are frequently not reflected in the documentation, particularly at a higher level than the code itself.

4. Detect side effects. The initial design and successive modifications can lead to side effects that affect the system’s performance in subtle ways.

5. Synthesize higher abstractions. Reverse engineering requires methods and techniques for creating alternate views that transcend to higher abstraction levels.

6. Facilitate reuse. A significant issue in the movement toward software reusability is the large body of existing software assets. Reverse engineering can help detect candidates for reusable software components in present systems.

The cost of understanding software, while rarely seen as a direct cost, is nonetheless very real. It is manifested in the time required to comprehend software, which includes the time lost to misunderstanding. By reducing the time required to grasp the essence of software artifacts in each life cycle phase, reverse engineering may greatly reduce the overall cost of software.

We have tried to provide a framework for examining re-engineering technologies by synthesizing the basic definitions of related terms and identifying objectives. Re-engineering is rapidly becoming a recognized and important component of future CASE environments. Re-engineering tools can provide a major link in the overall process of development and maintenance. Software re-engineering, used with evolving software development technologies, will provide significant incremental enhancements to our productivity.


1. Elliot J. Chikofsky and James H. Cross, “Reverse engineering and design recovery”, IEEE Software, volume 7, Jan. 1990.

2. Spencer Rugaber, “Program comprehension for reverse engineering”, College of Computing, Georgia Institute of Technology, Atlanta.

3. “Perspective on legacy system re-engineering”, Re-engineering Center, Software Engineering Institute, Carnegie Mellon University, Pittsburgh.

4. Stephen B. Ornburn and Spencer Rugaber, “Reverse engineering: Resolving conflicts between expected and actual software design”, College of Computing and Software Research Center, Georgia Institute of Technology, Atlanta.

5. Ian Sommerville, “Software Engineering”, 5th edition (pages 699-711).