Song Pang is the senior vice president of customer engineering at NetBrain, a market leader for NetOps automation.
While modern hybrid infrastructures that connect on-premises and multicloud environments can deliver IT services at levels previously unimaginable, enterprises are finding it increasingly difficult to manage this new infrastructure effectively with their current processes.
As networks become increasingly expensive to operate, enterprises are incurring significant risks of outages and service degradations from misalignment of technology capabilities and operating plans. A 2022 survey found that just 46% of small-business tech professionals felt they had the required level of visibility into the majority of their organization’s applications and digital infrastructure.
Addressing The Problem With Current NetOps Processes
This struggle stems from a failure to transform network management processes as new technologies were introduced and as organizations scaled. Simply put, current network operations processes are outdated, ineffective, labor-intensive and not scalable, and they ignore the most valuable asset each organization already owns: the problem-solving knowledge of its own engineers.
Troubleshooting Service Delivery
This is a big deal. Truly addressing this operational problem isn’t just a matter of adding more engineers to the mix; it requires a fundamentally new set of workflows for network operations. Those new workflows start with the high-volume need for interactive troubleshooting of service delivery issues. This can be met by codifying and normalizing the proven procedures that network engineers already use every day. Once this knowledge has been codified, it can be shared with others as part of the new workflow for remediation.
Automatic Diagnostic Testing
A second type of workflow involves how external events begin a remediation process. Whereas most current NetOps plans are reactive, automation can respond to any external trigger and immediately begin diagnostics, long before engineers get involved. This has an additional virtue in that it captures diagnostic detail very close to the moment the anomaly is detected, while the problem still exists.
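As a minimal sketch of this event-triggered pattern, the Python below registers diagnostic steps against an event type and runs them as soon as the event arrives, stamping the report with the capture time. The event name, the device name and the two diagnostic functions are hypothetical placeholders and do not represent any vendor's API; a real diagnostic would query the device rather than return a fixed string.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List

# Hypothetical registry: event type -> diagnostics to run when it fires.
DIAGNOSTICS: Dict[str, List[Callable[[str], str]]] = {}


def on_event(event_type: str):
    """Decorator that registers a diagnostic for the given event type."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        DIAGNOSTICS.setdefault(event_type, []).append(func)
        return func
    return register


@on_event("interface_down")
def check_interface_status(device: str) -> str:
    # Placeholder: a real check would collect interface counters here.
    return f"{device}: interface status collected"


@on_event("interface_down")
def check_neighbor_state(device: str) -> str:
    # Placeholder: a real check would collect routing neighbor state here.
    return f"{device}: neighbor state collected"


def handle_event(event_type: str, device: str) -> dict:
    """Run every registered diagnostic immediately and timestamp the capture."""
    captured_at = datetime.now(timezone.utc).isoformat()
    results = [diag(device) for diag in DIAGNOSTICS.get(event_type, [])]
    return {
        "event": event_type,
        "device": device,
        "captured_at": captured_at,
        "results": results,
    }


report = handle_event("interface_down", "core-sw1")
for line in report["results"]:
    print(report["captured_at"], line)
```

Because the diagnostics run inside the event handler, the report reflects the network as it was when the anomaly was detected, before an engineer has picked up the ticket.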
Testing Ideal Versus Actual Network Behaviors
The final set of workflows is the continuous, automated detection of anomalous behaviors. By leveraging the problem-solving knowledge captured previously, real-time network conditions can be tested continuously to identify situations where the network is not behaving as expected. Once such a situation is identified, subject matter expertise can be brought to bear on it.
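One way to picture testing ideal against actual behavior is a comparison of observed measurements with expected ranges. The sketch below assumes hypothetical metric names and thresholds, and it assumes the observed values have already been collected by some other means; it illustrates the idea and is not a description of any particular product.

```python
def find_anomalies(expected: dict, observed: dict) -> list:
    """Return (metric, value) pairs that are missing or outside the expected range."""
    anomalies = []
    for metric, (low, high) in expected.items():
        value = observed.get(metric)
        if value is None:
            # No measurement at all is itself worth flagging.
            anomalies.append((metric, "missing"))
        elif not low <= value <= high:
            anomalies.append((metric, value))
    return anomalies


# Hypothetical expected ranges for one key application path.
expected = {"latency_ms": (0, 50), "packet_loss_pct": (0, 1)}
# Hypothetical measurements from the latest polling cycle.
observed = {"latency_ms": 72, "packet_loss_pct": 0.2}

for metric, value in find_anomalies(expected, observed):
    print(f"Anomaly: {metric} = {value}")
```

Running a check like this on a schedule is what turns a one-time test into continuous verification; any anomaly it returns is the point at which an expert would be brought in.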
Making Network Operations More Science Than Art
Effective workflows are scientific in nature, but historically, network engineers have viewed their jobs as more of an art. With years of experience in problem identification and resolution, each engineer has their own unique arsenal of practical techniques that they can employ to diagnose and fix problems.
But this “artisanal” approach to problem solving can create its own set of issues. Different engineers may approach the same problem in different ways, leading to inconsistent (and not always good) outcomes. That in turn results in more service tickets, longer resolution times and, ultimately, more pressure on already taxed service delivery budgets.
Every IT organization has within its ranks the knowledge to diagnose and address just about any network problem. Today, network engineers rely on the escalation process to reach the few experts who hold that knowledge when it is needed. It should be no surprise, then, that most IT executives are holding out for a “better plan” and identify automation as the likely approach, but simply don’t know how to get started. According to one analyst estimate, the market for network automation tools is expected to grow by nearly 23% annually from 2022 to 2030.
Traditional automation efforts have usually turned into long-running software development projects that involved huge teams of developers and delivered little tangible return on the investment. It is no wonder they failed: these attempts were developer-centric, required extreme detail in functional specification, were rigid in design and hard to maintain, and quickly outgrew cost projections. As a result, they were shelved in favor of the ad hoc, brute-force methods already in place.
But we are at an inflection point where business meets technology. For digital businesses to survive, it is important for network automation to be adopted wholesale. This is now possible due to the maturity of the same low-code/no-code approaches that have democratized the agile building and sharing of best practices and improved operations in other areas across the enterprise.
Planning No-Code Automation Projects: Consider Task Frequency And The Intent Of The Network
With traditional network automation, the amount of effort required to program desired tasks meant it was only used in the most severe situations with large, easily predefined problems and was limited by the availability of automation engineers. That limiting factor doesn’t exist with no-code, so the process for choosing and prioritizing tasks to automate is different. I suggest organizations look at two new criteria: task frequency and the intent of the network.
The more frequent the task, the more value you’ll get out of automating it. IT leaders should look for NetOps tasks that repeat often, perhaps weekly or daily. Even if automation only saves minutes on each one, the value will add up as the task repeats.
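To make the frequency argument concrete, the rough calculation below ranks candidate tasks by the hours they would save in a year. The task names and all of the numbers are hypothetical and chosen only for illustration; a real list would come from an organization's own ticket and change records.

```python
def annual_hours_saved(minutes_saved_per_run, runs_per_week, weeks_per_year=52):
    """Hours saved per year if each run of a task gets shorter by the given minutes."""
    return minutes_saved_per_run * runs_per_week * weeks_per_year / 60


def rank_by_savings(tasks):
    """Sort candidate tasks so the largest annual saving comes first."""
    return sorted(
        tasks,
        key=lambda t: annual_hours_saved(t["minutes_saved"], t["runs_per_week"]),
        reverse=True,
    )


# Hypothetical candidates: a small saving that repeats often can outweigh
# a large saving on a task that is rarely performed.
tasks = [
    {"name": "HA pair config comparison", "minutes_saved": 10, "runs_per_week": 5},
    {"name": "MAC address lookup", "minutes_saved": 3, "runs_per_week": 40},
    {"name": "Quarterly audit report", "minutes_saved": 120, "runs_per_week": 0.08},
]

for task in rank_by_savings(tasks):
    hours = annual_hours_saved(task["minutes_saved"], task["runs_per_week"])
    print(f"{task['name']}: about {hours:.0f} hours per year")
```

With these assumed figures, the three-minute lookup that repeats 40 times a week ranks first, ahead of the two-hour report that is produced only a few times a year.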
Second, IT leaders should look at the intents of their network and consider automating the tasks associated with those intents. In other words, what are the most important behaviors that the network supports? This might mean keeping latency or other KPIs for key applications within a certain range or following security policies. IT leaders should prioritize automation projects that support these high-value network intents. Here are some examples:
• Ensure that any pair of high-availability (HA) network devices are always configured identically for proper failover.
• Check for devices with open interfaces or ports that violate security policies.
• Look up MAC addresses, IP addresses or physical locations of devices as part of security investigations.
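The first two examples above can be sketched as simple checks. The Python below is an illustration under stated assumptions, not a production tool: it assumes device configurations and open-port lists have already been retrieved as plain text and sets, and the device names, configuration lines and allowed-port list are hypothetical. The comparison is line-based and ignores ordering and section context, which a real HA check would need to handle.

```python
def normalize(config: str, ignore_prefixes=("hostname ",)) -> set:
    """Reduce a config to its meaningful lines, skipping blanks, comments
    and lines that are expected to differ between HA peers."""
    lines = set()
    for raw in config.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        if line.startswith(tuple(ignore_prefixes)):
            continue
        lines.add(line)
    return lines


def ha_config_drift(primary: str, secondary: str) -> dict:
    """Report configuration lines present on one HA peer but not the other."""
    a, b = normalize(primary), normalize(secondary)
    return {"only_on_primary": sorted(a - b), "only_on_secondary": sorted(b - a)}


def find_port_violations(open_ports: dict, allowed: set) -> dict:
    """Map each device to the open ports that the policy does not allow."""
    violations = {}
    for device, ports in open_ports.items():
        extra = sorted(set(ports) - allowed)
        if extra:
            violations[device] = extra
    return violations


# Hypothetical configs for an HA firewall pair.
primary = "hostname fw-a\nntp server 10.0.0.1\nsnmp-server community ops\n"
secondary = "hostname fw-b\nntp server 10.0.0.1\n"
print(ha_config_drift(primary, secondary))

# Hypothetical scan results checked against an assumed allowed-port policy.
print(find_port_violations({"sw1": {22, 443}, "sw2": {22, 23}}, allowed={22, 443}))
```

An empty result from either function means the intent is currently met; anything else is a candidate for a ticket or an automated follow-up.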
Conclusion
I suggest IT leaders interested in exploring no-code automation start by building a master list of network tasks based on these two criteria. Offloading routine tasks from the to-do list of busy network engineers will free them up for more important work and ultimately improve the performance and reliability of enterprise networks.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.