DTW2025 - Stability is essential for autonomous networks

At DTW2025, a panel featuring representatives from TM Forum, Huawei and Telkomsel discussed how intelligent core networks can enable high-stability autonomous networks.

Huawei’s Core Network VP Eric Luo kicked off the discussion by defining Autonomous Network Level 4 (AN L4) and identifying the key challenges in implementing it in the core network. Because the core network is both sensitive and critical, and operators are keen to prevent any incidents, achieving AN L4 in the core network requires both high stability and high efficiency.

“Stability is the foundation; without it, efficiency is meaningless”, said Luo, who noted that the industry appeared to be in consensus on this point. “Due to the complexity of core network, our top priority is helping CSPs to achieve zero outage through the implementation of ANL4, followed by improving accuracy and efficiency of demarcation. At the same time, we hope to improve O&M efficiency, achieving a real closed-loop automation in high-value scenarios.”

Over the past two years, AI technology has advanced rapidly, especially with the evolution of foundation models like DeepSeek and Manus, which are driving AN L4 forward.

However, significant challenges remain. Luo cited the gap between the precision and determinism required in O&M and the current capabilities of large models, which can result in accuracy deviations and model hallucination. Large models are also still limited in their ability to learn and to reason about complex problems, and addressing this insufficient level of self-learning and generalization will require significant model tuning.

Improving data quality is of course essential, but reducing the cost of doing so – not to mention adopting effective engineering methods that yield a better-quality data corpus – is a key issue. In addition, the industry lacks unified standards and a shared understanding of how to assess the capability of domain-specific models in O&M. These are common challenges across the O&M domain, and while Luo acknowledged that Huawei has made progress in this area, he stressed the need for industry partners to collaborate and resolve these issues by sharing best practices.

Trihan Marsudi, GM of Network Digitalisation and Analytic Platforms at Telkomsel, noted that in the Indonesian market, high stability is not just a performance metric – it’s a survival requirement. Operating across an archipelago of more than 17,000 islands which experience frequent natural disasters such as earthquakes, Telkomsel faces unique challenges in ensuring service continuity. As autonomous network behaviours become more prevalent, it is essential to ensure that this intelligence does not compromise the five-nines availability expected from the core.

Centralized fallback systems are often not viable across Indonesia’s fragmented geography, so recovery mechanisms must be both decentralized and intelligent. True autonomy can only be achieved by network architecture that is resilient by design, with AI used not just to optimize performance but to guarantee stability under worst-case scenarios.
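To put the five-nines target in concrete terms, the short calculation below converts an availability percentage into a yearly downtime budget. It is an illustrative sketch only: the function name and the use of an average 365.25-day year are assumptions, not figures quoted by the panel.

```python
# Average year length in minutes, including leap days (an assumption for this sketch).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_budget_minutes(availability: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the maximum downtime allowed in a period for an availability target."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_minutes


# "Five nines" means 99.999% availability.
five_nines = downtime_budget_minutes(0.99999)
print(f"{five_nines:.2f} minutes per year")  # prints "5.26 minutes per year"
```

In other words, a five-nines core can be unavailable for only a little over five minutes in an entire year, which leaves very little room for an automated action that goes wrong.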

Olta Vangjeli, Programme Director for Cloud Native IT & Networks at TM Forum, noted that in this year’s TMF Evaluation, more CSPs are participating in the Autonomous Network Level (ANL) assessments, including several from the high-demand core network domain.

Vangjeli observed that because core networks lie at the heart of service delivery – and are under constant pressure from new technologies, traffic surges, and SLA constraints – there is a growing industry consensus that, in this context, intelligence without stability is dangerous.

To address this, TM Forum works closely with CSPs and vendors to tie ANL evaluation more tightly to fault resilience, recovery speed, and service protection indicators. Operators like China Mobile, stc, and Telkomsel have shown that achieving L3+ automation is only sustainable when paired with a measurable foundation of high stability. More than 30 CSPs share their scores, but the ANL assessment is not just about this – it focuses on benchmarking these advancements. Industry-wide governance and methodological consistency are crucial for this, and evaluating core network stability is critical as this is the backbone for network continuity.

Marsudi noted that, in Telkomsel’s experience, key experience indicators (KEIs) – such as call setup success rate under disaster stress, or recovery time after signaling floods – are not just operational metrics, but essential anchors for judging whether automation translates into tangible stability and user benefit.

“The TMF matrix should evolve from ‘what is automated’ to ‘what is autonomously assured with experience impact’. This calls for a more granular scoring structure and scenario-based verification, especially in the core network domain.”
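The two indicators Marsudi cites can be computed from simple counters. The sketch below is a hypothetical illustration, not Telkomsel’s or TM Forum’s definition: the `Window` type, the function names, the 0.98 threshold and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Window:
    """Call attempts and successful setups counted in one measurement window."""
    attempts: int
    successes: int


def call_setup_success_rate(window: Window) -> Optional[float]:
    """Fraction of call attempts that were set up; None if there was no traffic."""
    if window.attempts == 0:
        return None
    return window.successes / window.attempts


def recovery_time_seconds(samples: Sequence[Tuple[float, float]],
                          threshold: float) -> Optional[float]:
    """Seconds from the first sample below `threshold` to the next sample at or above it.

    `samples` are (timestamp_seconds, success_rate) pairs in time order.
    Returns None if the rate never dropped or never recovered.
    """
    drop_start = None
    for timestamp, rate in samples:
        if drop_start is None:
            if rate < threshold:
                drop_start = timestamp
        elif rate >= threshold:
            return timestamp - drop_start
    return None


# Hypothetical samples taken once a minute around a signaling flood.
samples = [(0, 0.99), (60, 0.80), (120, 0.90), (180, 0.99)]
rate = call_setup_success_rate(Window(attempts=2000, successes=1900))
recovery = recovery_time_seconds(samples, threshold=0.98)
print(rate, recovery)
```

Measured per scenario, for example during a disaster drill, figures like these are the kind of evidence a scenario-based verification could score.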

Currently, scoring tends to be operator-led and therefore subjective; Telkomsel is advocating for a three-party co-evaluation mechanism involving vendor tech transparency, operator use-case proof, and third-party reviews of metrics. According to Marsudi, this approach would significantly reduce evaluation inflation and promote real L4 readiness.

Telkomsel is well-positioned to advise CSPs on advancing their AN levels, given the challenge of providing stable connectivity in a nation of thousands of islands that experiences frequent natural disruptions.

Signaling storms – triggered by events like mass reconnections or infrastructure outages – can rapidly degrade core network stability. In this environment, automation alone is not enough – predictive stability is required to detect early signs of degradation and act before incidents occur.

To achieve this, Telkomsel and Huawei are working together to introduce an intelligent management plane – known as MDAF – which performs signaling storm detection and geo-aware anomaly modeling. This allows the operator to localize and contain faults faster than ever before.

Using multi-dimensional traffic modeling and anomaly behavior learning, it is possible to develop a signaling storm prediction model which Telkomsel can then use to preemptively identify high-risk zones before the storm occurs. Once risk is detected, the model simulates pressure scenarios and recommends preemptive load rebalancing or flow control policies. This enables the network to stay resilient, even when facing unpredictable usage spikes, and represents a major step toward proactive fault prevention – not just faster recovery.
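The article does not describe how the prediction model works internally. As a much simpler stand-in, the sketch below flags regions whose signaling rate jumps well above a learned baseline, using an exponentially weighted moving average (EWMA). The class and function names, the parameter values and the sample data are all hypothetical; this illustrates baseline-deviation detection in general, not the MDAF implementation.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class EwmaDetector:
    """Tracks a running baseline of signaling rate and flags sharp upward jumps."""
    alpha: float = 0.2    # smoothing factor for the baseline
    k: float = 3.0        # how many standard deviations count as anomalous
    warmup: int = 5       # samples to observe before raising any alert
    min_std: float = 1.0  # floor so a perfectly flat baseline does not over-trigger
    mean: float = 0.0
    var: float = 0.0
    seen: int = 0

    def update(self, rate: float) -> bool:
        """Feed one sample; return True if it is anomalously high."""
        if self.seen == 0:
            self.mean = rate
            self.seen = 1
            return False
        deviation = rate - self.mean
        std = max(math.sqrt(self.var), self.min_std)
        is_anomaly = self.seen >= self.warmup and deviation > self.k * std
        if not is_anomaly:
            # Learn only from normal samples so a storm does not inflate the baseline.
            self.mean += self.alpha * deviation
            self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.seen += 1
        return is_anomaly


def high_risk_regions(rates_by_region: Dict[str, Sequence[float]]) -> List[str]:
    """Return regions whose most recent sample was flagged as anomalous."""
    risky = []
    for region, series in rates_by_region.items():
        detector = EwmaDetector()
        flags = [detector.update(rate) for rate in series]
        if flags and flags[-1]:
            risky.append(region)
    return risky


# Hypothetical signaling requests per second, sampled per region.
rates = {
    "region-a": [100, 104, 98, 101, 103, 99, 102, 101],
    "region-b": [100, 104, 98, 101, 103, 99, 102, 250],
}
print(high_risk_regions(rates))  # prints "['region-b']"
```

A flagged region would then be a candidate for the preemptive flow control or load rebalancing described above. A genuinely predictive model would go further and forecast the deviation before it appears in the traffic.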


