Xeditor: Inferring and Applying XML Consistency Rules
Developers frequently use XML files when building Web applications or Java EE applications. However, maintaining XML files is challenging and time-consuming because the correct usage of XML entities is always domain-specific and rarely well documented. Moreover, existing compilers and program analysis tools seldom examine XML files. In this thesis, we developed Xeditor, a novel approach to XML file debugging that extracts XML consistency rules from open-source projects and uses these rules to detect XML bugs. Xeditor has two phases: rule inference and rule application. To infer rules, Xeditor mines XML-based deployment descriptors in open-source projects, extracting XML entity pairs that frequently co-exist in the same files and refer to the same string literals. Xeditor then applies association rule mining to the extracted pairs. For rule application, given a program commit, Xeditor checks whether any updated XML file violates the inferred rules; if so, Xeditor reports the violation and suggests an edit to correct it. Our evaluation shows that Xeditor inferred rules with high precision (83%). For injected XML bugs, Xeditor detected rule violations and suggested changes with 74.6% precision and 50% recall. More importantly, Xeditor identified 31 genuinely erroneous XML updates in version history, 17 of which were fixed by developers in later program commits. This observation implies that, by using Xeditor, developers could have avoided introducing these errors when writing XML files. Finally, we compared Xeditor with a baseline approach that suggests changes based on frequently co-changed entities, and found that Xeditor outperformed the baseline in both rule inference and rule application.
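As a rough illustration of the two phases described above, the following Python sketch infers pairwise rules from a tiny corpus and then checks a file against them. It is not Xeditor's implementation. Several simplifications are assumptions made only for this example: an "entity" is an element path or an attribute on that path; "refer to the same string literals" is approximated by set inclusion of literals within one file; and the function names (`entity_values`, `infer_rules`, `check_violations`) and the thresholds `min_support` and `min_confidence` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import permutations


def entity_values(xml_text):
    """Map each entity (element path, or path@attribute) to its string literals."""
    values = {}

    def walk(node, path):
        here = f"{path}/{node.tag}"
        text = (node.text or "").strip()
        if text:
            values.setdefault(here, set()).add(text)
        for name, val in node.attrib.items():
            values.setdefault(f"{here}@{name}", set()).add(val)
        for child in node:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return values


def infer_rules(xml_texts, min_support=2, min_confidence=0.8):
    """Infer rules (A, B): every literal used by A should also be used by B.

    Assumed simplification of association rule mining:
    support    = number of files in which the rule holds,
    confidence = support / number of files that contain A.
    """
    occurs = Counter()  # files containing entity A
    holds = Counter()   # files where literals of A are a subset of literals of B
    for text in xml_texts:
        vals = entity_values(text)
        for a in vals:
            occurs[a] += 1
        for a, b in permutations(vals, 2):
            if vals[a] <= vals[b]:
                holds[(a, b)] += 1
    rules = {}
    for (a, b), support in holds.items():
        confidence = support / occurs[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def check_violations(xml_text, rules):
    """Return (A, B, missing) triples; `missing` lists literals B should also contain."""
    vals = entity_values(xml_text)
    violations = []
    for a, b in rules:
        if a in vals:
            missing = vals[a] - vals.get(b, set())
            if missing:
                violations.append((a, b, sorted(missing)))
    return violations


# Hypothetical deployment descriptors, reduced to the two entities of interest.
TEMPLATE = (
    "<web-app>"
    "<servlet><servlet-name>{declared}</servlet-name></servlet>"
    "<servlet-mapping><servlet-name>{mapped}</servlet-name></servlet-mapping>"
    "</web-app>"
)
CORPUS = [
    TEMPLATE.format(declared="Login", mapped="Login"),
    TEMPLATE.format(declared="Search", mapped="Search"),
]
UPDATED = TEMPLATE.format(declared="Checkout", mapped="Chekout")  # typo in mapping

RULES = infer_rules(CORPUS)
VIOLATIONS = check_violations(UPDATED, RULES)
for a, b, missing in VIOLATIONS:
    print(f"{a} uses {missing}, which {b} does not; consider adding or renaming it")
```

In this sketch, the `missing` literals double as a crude edit suggestion: they are the strings the paired entity would need to contain for the rule to hold again. A real tool would also need to handle namespaces, repeated elements in different contexts, and much larger corpora.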