The primary objective of network monitoring is to identify service-impacting failures from monitored events before service is interrupted and customers are affected. Methods for identifying causal events, known as Root Cause Analysis (RCA), are vulnerable to scaling issues at high event rates. Eliminating noisy events that are not causal is critical to ensuring the scalability of RCA. Further, the structure of the managed network, which is usually highly dynamic, is fundamental to determining which events are most likely to lead to service impacts.
In recent work we have been studying how structural graph entropy measures, recast at the node level, can be used to filter out noisy events that originate at structurally unimportant nodes. This approach is well suited to dynamic networks because it relies only on the local topology of a node and avoids expensive global computations. Using commercial data sets taken from large-scale networks of more than 200,000 nodes, we demonstrate a strong correlation between high values of vertex entropy and the probability of network nodes producing events that escalate into service-impacting incidents. Our analysis also surfaced interesting departures from the expected scale-free degree distributions, indicating the presence of constraints. We developed a constrained-attachment extension to the Barabási-Albert model, which has better predictive power for degree distributions and is much simpler than the Bianconi-Barabási fitness extension. We present the details of this model and some early analytical results, including an intriguing link between the vertex entropy work and our constrained attachment model.
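As a rough illustration of the two ideas above, the following Python sketch grows a preferential-attachment graph under a hard degree cap and then scores a node using only its one-hop neighbourhood. Both definitions are assumptions made for illustration and are not taken from the work described here: the cap `max_degree` stands in for whatever constraint the model actually imposes, and `local_vertex_entropy` is just one plausible node-level entropy (the Shannon entropy of the neighbours' degree shares). All function and parameter names are hypothetical.

```python
import math
import random


def constrained_ba(n, m, max_degree, seed=None):
    """Grow a Barabasi-Albert style graph in which no node exceeds max_degree.

    Assumption: the 'constraint' is modelled as a hard degree cap.
    Returns an adjacency dict {node: set(neighbours)}.
    """
    if m < 1 or max_degree <= m or n < m + 1:
        raise ValueError("need m >= 1, max_degree > m and n >= m + 1")
    rng = random.Random(seed)
    # Seed graph: a complete graph on m + 1 nodes, each of degree m.
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    for new in range(m + 1, n):
        # Only nodes still below the cap may receive new links.
        candidates = [v for v in adj if len(adj[v]) < max_degree]
        weights = [len(adj[v]) for v in candidates]
        wanted = min(m, len(candidates))
        targets = set()
        while len(targets) < wanted:
            # Preferential attachment: choose proportionally to degree.
            targets.add(rng.choices(candidates, weights=weights)[0])
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj


def local_vertex_entropy(adj, node):
    """Illustrative node-level entropy computed from the one-hop neighbourhood.

    Assumption: p_j is neighbour j's share of the total degree held by the
    neighbours of `node`, and the score is -sum(p_j * log(p_j)).
    """
    degrees = [len(adj[j]) for j in adj[node]]
    total = sum(degrees)
    if total == 0:
        return 0.0
    return -sum((k / total) * math.log(k / total) for k in degrees)


if __name__ == "__main__":
    graph = constrained_ba(n=500, m=2, max_degree=12, seed=1)
    scores = {v: local_vertex_entropy(graph, v) for v in graph}
    # Nodes with the lowest scores would be the first candidates for filtering.
    quietest = sorted(scores, key=scores.get)[:5]
    print(quietest)
```

Because the score depends only on a node and its neighbours, it can be recomputed for the affected nodes alone when a link is added or removed, which is the property that makes a local measure attractive for a dynamic network.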
Contact: Keith Briggs or Richard G. Clegg (richard@richardclegg.org)