Abstract:
Apps’ pervasive role in our society motivates researchers to develop automated techniques ensuring dependability through testing. However, although App updates are frequent and software engineers would like to prioritize the testing of updated features, automated testing techniques verify entire Apps and thus waste resources. Further, most testing techniques can detect only crashing failures, necessitating visual inspection of outputs to detect functional failures, which is a costly task. Despite efforts to automatically derive oracles for functional failures, the effectiveness of existing approaches is limited. Therefore, instead of automating human tasks, it seems preferable to minimize what should be visually inspected by engineers.
To address the problems above, in this dissertation, we propose approaches to maximize testing effectiveness while containing test execution time and human effort.
First, we present ATUA (Automated Testing of Updates for Apps), a model-based approach that synthesizes App models with static analysis, integrates a dynamically refined state abstraction function, and combines complementary testing strategies, thus enabling ATUA to generate a small set of inputs that exercise only the code affected by updates. A large empirical evaluation conducted with 72 App versions belonging to nine popular Android Apps has shown that ATUA is more effective and less effort-intensive than state-of-the-art approaches when testing App updates.
Second, we present CALM (Continuous Adaptation of Learned Models), an automated App testing approach that efficiently tests App updates by adapting App models learned when automatically testing previous App versions. CALM minimizes the number of App screens to be visualized by software testers while maximizing the percentage of updated methods and instructions exercised. Our empirical evaluation shows that CALM exercises a significantly higher proportion of updated methods and instructions than baselines for the same maximum number of App screens to be visually inspected. Further, in common update scenarios, where only a small fraction of methods are updated, CALM outperforms all competing approaches even sooner and by a wider margin.
Finally, we minimize test oracle cost by defining strategies for selecting, for visual inspection, a subset of the App outputs. We assessed 26 strategies, relying on either code coverage or action effect, on Apps affected by functional faults confirmed by their developers. Our empirical evaluation has shown that our strategies have the potential to enable the identification of a large proportion of faults. By combining code coverage with action effect, it is possible to reduce oracle cost by about 41.2% while enabling engineers to detect all the faults exercised by test automation approaches.
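To make the idea of combining code coverage with action effect concrete, the following is a minimal sketch of one possible selection rule. It is not one of the 26 strategies assessed in the dissertation. The `Output` record, its fields, and the rule itself (keep an output only if its action visibly changed the screen and exercised a method that no previously selected output covered) are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    """One App output (screen) observed after an automated test action.

    All fields are hypothetical; they are not taken from the dissertation.
    """
    screen_id: str
    covered_methods: frozenset  # methods exercised by the triggering action
    changed_screen: bool        # assumed proxy for "action effect"


def select_for_inspection(outputs):
    """Return the subset of outputs an engineer would inspect visually.

    An output is kept only if its action had a visible effect AND it
    exercised at least one method not covered by an earlier selection.
    """
    seen = set()
    selected = []
    for out in outputs:
        new_methods = out.covered_methods - seen
        if out.changed_screen and new_methods:
            selected.append(out)
            seen |= new_methods
    return selected


def oracle_cost_reduction(total, selected):
    """Fraction of outputs that no longer need visual inspection."""
    return 1 - selected / total


# Illustrative data only; these are not results from the evaluation.
outputs = [
    Output("s1", frozenset({"a", "b"}), True),
    Output("s2", frozenset({"a"}), True),        # nothing new covered
    Output("s3", frozenset({"c"}), False),       # no visible effect
    Output("s4", frozenset({"c", "d"}), True),
]
selected = select_for_inspection(outputs)
reduction = oracle_cost_reduction(len(outputs), len(selected))
```

In this toy run, two of the four outputs are selected. Under such a rule, oracle cost is measured simply as the number of outputs shown to the engineer, so any fault visible only in a discarded output would be missed. That trade-off is why a strategy of this kind has to be checked against faults confirmed by developers.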