According to industry estimates, monthly global mobile data traffic is expected to surpass 30.6 exabytes by 2020, and global mobile data traffic will increase nearly eightfold between 2015 and 2021. Most mobile data traffic is generated by smartphones, and the total number of smartphones is expected to keep growing through 2021, which results in rapid traffic growth. In addition, the upcoming 5G networks and Internet of Things based communication are estimated to involve a large amount of network traffic. The increase in mobile data traffic and in the number of connected devices poses a challenge to network operators, service providers, and data center operators. If the transmission capacity of the network does not match the amount of data traffic, congestion may occur and the quality of experience ultimately degrades. Mobile networks are also becoming more reliant on data centers that provide efficient computing power. However, the energy consumption of data centers has grown in recent years, which is a problem for data center operators.
A traditional strategy to overcome these problems is to scale up the resources or to provide more efficient hardware. Resource over-provisioning increases operating and capital expenditures without a guarantee of increased average revenue per user. In addition, the growing complexity and dynamics of communication systems are a challenge for efficient resource management. Intelligent and resilient methods that can efficiently use existing resources by making autonomous decisions, without intervention from human administrators, are thus needed.
Therefore, there are many problems with owning and maintaining a large-scale network. Beyond a certain point, the network becomes very costly and very difficult to operate. One solution is to make the network autonomous and allow it to detect faults in itself.
Large-scale computer networks are very complex systems. As the size of a network increases, the resource demand to manage the network and cope with changes grows significantly [1].
To make these large networks easier to deal with, owners usually deploy a number of monitors so that operators can supervise the network and discover problems. These monitors tend to aggregate data and do not perform any kind of fault analysis or detection, leaving it to the human operators to identify the problem in the network. Some of these monitors also fail to discover certain problems.