Jul 2022     Issue 19
Research
How to Make Modern Software Reliable: An Intelligent and Data-Driven Approach

Michael R. Lyu, Department of Computer Science and Engineering

Reliability is at the core of usability when it comes to modern software such as web services, cloud computing and mobile applications. As these software systems are widely utilised to provide a variety of services to billions of users today, one tiny reliability issue could lead to serious problems (e.g., network interruptions and service outages), which could in turn lead to revenue loss and user complaints. Reliability engineering is therefore vital to modern software operation and maintenance.

A culmination of over 10 years of research, ADORE (Automated Data-driven sOftware Reliability Engineering) aims to solve these problems from an intelligent, data-driven perspective by analysing typical software data across three main components: Quality of Service (QoS), software logs, and mobile app reviews. These components typically work together in a closely integrated manner to enable system reliability and service quality.

QoS Prediction: A core component of web service reliability management involves predicting the quality of web services. QoS prediction has long been an area of interest for many researchers. However, there were no comprehensive real-world web service QoS datasets or tools for validating various QoS-based approaches until ADORE came to be. Its collection of real-world datasets and benchmarks for 30+ QoS prediction approaches has greatly facilitated research while benefitting the web service community as a whole. Since our research was made available, over 370 research institutes, including large companies and top-tier universities such as Microsoft, IBM, Amazon, Carnegie Mellon University, and Imperial College London, have requested to use our open-source datasets and tools. Our research has also received more than 4500 citations and inspired many follow-up studies.
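To make the task concrete, the following is a minimal sketch of what QoS prediction can look like, assuming a neighbourhood-based (collaborative filtering) approach. It is not one of the benchmarked ADORE approaches; the data, the `qos` table, and the function names are all hypothetical and chosen for illustration only. The idea is to estimate a response time that a user has not yet observed from the values observed by similar users.

```python
from math import sqrt

# Hypothetical observed response times in seconds: qos[user][service].
# User "u2" has never invoked service "s3", so that value is unknown.
qos = {
    "u1": {"s1": 0.2, "s2": 1.0, "s3": 0.5},
    "u2": {"s1": 0.3, "s2": 1.1},
    "u3": {"s1": 2.0, "s2": 0.4, "s3": 1.8},
}


def similarity(a, b):
    """Cosine similarity over the services both users have invoked."""
    shared = set(qos[a]) & set(qos[b])
    if not shared:
        return 0.0
    dot = sum(qos[a][s] * qos[b][s] for s in shared)
    norm_a = sqrt(sum(qos[a][s] ** 2 for s in shared))
    norm_b = sqrt(sum(qos[b][s] ** 2 for s in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(user, service):
    """Similarity-weighted average of other users' values for the service.

    Returns None when no other user has observed the service.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, observed in qos.items():
        if other == user or service not in observed:
            continue
        w = similarity(user, other)
        weighted_sum += w * observed[service]
        weight_total += w
    return weighted_sum / weight_total if weight_total else None


estimate = predict("u2", "s3")
print(f"Estimated response time of s3 for u2: {estimate:.2f} s")
```

Because u2's observed values resemble u1's far more than u3's, the estimate falls closer to u1's value for s3. Published approaches are considerably more sophisticated; this sketch only shows the shape of the problem.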

Log Management: ADORE provides automatic detection and identification of reliability issues through log management. Its AI-powered log management tools are available to the research community and industry as open-source packages, namely LogPAI (https://github.com/logpai), LogAdvisor, Loglizer, Log3C, and others. More than 1200 individuals from research institutes and industry have liked these tools and recognised them as useful. ADORE also offers a comprehensive dataset which has been downloaded more than 4000 times by 200+ organisations around the world, including ETH Zurich, Microsoft, and HSBC, clearly demonstrating the impact of log-based reliability management on academia and industry. At Huawei, LogPAI has been integrated deeply into product lines to trace the root cause of network reliability issues, while at Microsoft, it is being used to enhance the effectiveness and efficiency of reliability issue predictions. LogAdvisor is also leveraged to provide highly accurate automatic suggestions to its developers on where to insert a logging statement, and our Log3C tool was successfully deployed on three real-world datasets, running 1000x faster than conventional methods on a huge amount of synthetic data while detecting and reporting service system problems as they occurred.
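As an illustration of the general idea behind log-based issue detection, the sketch below reduces raw log lines to templates by masking variable fields, then flags lines whose template is rare. This is a simplified stand-in written with the Python standard library only; it does not use or reproduce the LogPAI, Loglizer, or Log3C APIs, and the sample logs, the `to_template` helper, and the 0.3 rarity threshold are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical log lines; real logs would come from files or a collector.
logs = [
    "Connected to 10.0.0.1 in 12 ms",
    "Connected to 10.0.0.2 in 15 ms",
    "Connected to 10.0.0.3 in 11 ms",
    "Timeout contacting 10.0.0.9 after 3000 ms",
    "Connected to 10.0.0.4 in 14 ms",
]


def to_template(line):
    """Mask IP addresses and standalone numbers with a wildcard."""
    line = re.sub(r"\d+\.\d+\.\d+\.\d+", "<*>", line)
    return re.sub(r"\b\d+\b", "<*>", line)


templates = [to_template(line) for line in logs]
counts = Counter(templates)

# Flag lines whose template accounts for less than 30% of all lines
# (the threshold is an arbitrary choice for this toy example).
rare = [
    line
    for line, template in zip(logs, templates)
    if counts[template] / len(logs) < 0.3
]

for line in rare:
    print("Possible issue:", line)
```

Here the four connection messages collapse into a single template, so the lone timeout message stands out. Production tools rely on far more robust parsing and learning techniques than this frequency check.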

Review Mining: Review mining serves as a major component in software reliability management, through which actionable insights are extracted from user feedback on app stores such as Google Play and the App Store. The extracted insights can help companies better upgrade their mobile applications, e.g., Instagram. They may also provide informative business suggestions and guidance to companies. Under this component, ADORE provides a combination of several tools, such as Armor (https://remine-lab.github.io/), which utilise natural language processing techniques to automatically extract useful knowledge from user reviews. Armor can process thousands of reviews in seconds, as compared to the hours or days of human effort that it would typically cost at an accuracy rate of 70%. In our collaboration with Tencent, one of the largest IT companies in China, it is being used to extract insights efficiently and effectively, reducing their analysis costs by a factor of at least 500. The success and impact of our toolset continue to be validated in practical environments (e.g., deployed on 100+ industrial apps).
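A toy sketch of the review mining idea follows, assuming a simple word-frequency approach over low-rated reviews. It is not how Armor works; the sample reviews, the stopword list, and the `top_complaint_terms` function are hypothetical and serve only to show how recurring complaints might surface from raw feedback.

```python
import re
from collections import Counter

# Hypothetical (star rating, review text) pairs.
reviews = [
    (1, "App crashes when I upload a photo"),
    (2, "Upload fails and the app crashes again"),
    (5, "Love the new filters"),
    (1, "Login broken, crashes on start"),
]

# A tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "i", "the", "and", "when", "on", "again"}


def top_complaint_terms(reviews, max_rating=2, k=3):
    """Count how many low-rated reviews mention each non-stopword term."""
    counter = Counter()
    for rating, text in reviews:
        if rating <= max_rating:
            # Use a set so each review counts a term at most once.
            words = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
            counter.update(words)
    return counter.most_common(k)


for term, n in top_complaint_terms(reviews):
    print(f"{term}: mentioned in {n} low-rated reviews")
```

In this sample, "crashes" appears in all three low-rated reviews, pointing to a stability problem worth investigating first.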

Although the research behind ADORE can be traced back to 2010, it has continued to garner a favourable reputation amongst the research community while making an impact in the industry from 2012 to this day. The work cited in this framework can be mainly attributed to CUHK CSE alumni, including Dr. Zibin Zheng, Dr. Jieming Zhu, Dr. Pinjia He, Dr. Cuiyun Gao, and Dr. Shilin He, under the direction of Prof. Michael R. Lyu.

In summary, the ADORE project enables software reliability engineering through a set of intelligent tools and valuable datasets that demonstrate its significant impact on both academic development and industrial deployment.



Copyright © 2024.
All Rights Reserved. The Chinese University of Hong Kong.