Reverse Software Engineering Large Object Oriented Software Systems using the UML Notation

Virginia Tech

A common problem traditionally experienced by the software engineering community is that of understanding legacy code. A decade ago, the term legacy code referred to programs written in COBOL, typically for large mainframe systems. Today, however, software developers predominantly use object-oriented languages such as C++ and Java. The belief, prevalent among software developers and object philosophers, that object-oriented software would be relatively easier to comprehend has turned out to be a myth. Tomorrow's legacy code is being written today, since object-oriented programs are even more complex and difficult to comprehend unless rigorously documented. Reverse engineering is a methodology that greatly reduces the time, effort, and complexity involved in solving the program comprehension problem.

This thesis deals with reverse engineering complex object-oriented software and reports the experience gained from a case study. An extensive survey of the literature and contemporary research on reverse engineering and program comprehension was undertaken as part of this work. An Energy Information System (EIS) application, created by a leading energy service provider and used extensively in the real world, was chosen as the case study. Reverse engineering this industry-strength Java application necessitated the definition of a formal process. An intuitive Reverse Engineering Process (REP) was defined and used for the reverse engineering effort. The lessons learned from this case study are discussed in this thesis.

Software Engineering, Design Recovery, Program Comprehension, Reverse Engineering, Unified Modeling Language, Re-engineering