In a recent survey from the AIOps Exchange, 91% of respondents said they are looking at machine learning-powered tools to help IT operations teams be more productive. Given all the disparate data sets and tools operators have to review, plus the constant alert noise, it’s no wonder organizations are seeking some AI-powered assistance to regain control of their environments.
Artificial intelligence for IT operations (AIOps) is a buzzword right now in IT circles, but the impact is real. Machine learning capabilities give IT operations teams contextual, actionable insights to make better decisions on the job. More importantly, AIOps is an approach that transforms how systems are automated, detecting important signals in vast amounts of data and relieving operators of the headaches of managing according to tired, outdated runbooks or policies. In the AIOps future, the environment is continually improving. Administrators can get out of the impossible business of refactoring rules and policies that are outdated almost as soon as they are written.
IT operators’ day in the life: battling the machines
IT ops teams live by a runbook of fixes or policies to address common issues, like the dusty car manual that sits in your glovebox. The problem is that these once-venerable runbooks aren’t always accurate, especially as demands on the modern enterprise and its cloud workloads continually shift: resources appear and disappear, and application requirements change according to the day’s DevOps scrum decisions. The accelerating pace and growing complexity of modern IT operations have made the standard policy-driven framework or runbook less relevant than ever. Fixed frameworks of automation don’t work when workloads are serverless, ephemeral, and only active moment-to-moment. What to monitor, and how, is an open question for IT operations.
This is the scene that’s been set for AIOps. It fundamentally changes how automation is built, managed and operated in the modern enterprise. And it throws the old runbook in the trash.
A new day in the life: machines fix themselves, operators gain new skills
Now that we have AI and machine learning technologies embedded into IT operations systems, the game changes drastically. AI and machine learning-enhanced automation will bridge the gap between DevOps and IT Ops teams, helping the latter solve issues faster and more accurately to keep pace with business goals and user needs. Let’s take a look at how:
- Operators give feedback to make the system smarter so it adapts to changing circumstances. Let’s say the server utilization threshold is set at 60%, at which point an alert is issued to rebalance workloads. IT can override that policy if it appears that running servers at up to 75% hasn’t been affecting response times. The threshold is raised, and the system has evolved with the business. Even better, an AI-enhanced monitoring system can proactively suggest an appropriate policy change.
- IT Ops teams can identify and resolve issues much faster and in a standard way. Daniel sees a database connection issue and fixes it quickly, since he knows the scenario well. The next week, Sarah, who’s never seen that error before, is befuddled, but only for a moment. The IT operations management (ITOM) system is aware of the earlier fix and suggests it to her so she can correct the issue immediately. This means the right fix is applied to each issue, regardless of who’s on deck at the moment. The runbook now lives in the system and is a fluid document.
- IT Ops teams are way more productive. A recent OpsRamp survey found that 77% of organizations said the number of open incident tickets went down after deploying an AI-powered operations system. Other benefits cited include elimination of repetitive tasks across the incident lifecycle (85%) and faster root cause analysis and problem resolution (80%).
- IT operations job descriptions will evolve, too. Certainly, AI will replace some job roles in operations, yet it will also open the door for career growth by eliminating tedious, repetitive tasks. IT operations people will have the opportunity to pursue data science and development skills so they can manage the automation of policies and actions in the ITOM ecosystem, improving the AI brain to support the business.
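To make the first two points more concrete, here is a minimal Python sketch of a threshold that adapts to evidence and a runbook that remembers earlier fixes. It is only an illustration under stated assumptions, not how any particular AIOps product works: the names `ThresholdPolicy`, `suggest_threshold` and `FixMemory`, the 200 ms response-time limit, the error signature and the fix text are all hypothetical. Only the 60% and 75% figures come from the example above.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional, Tuple


@dataclass
class ThresholdPolicy:
    """Hypothetical alert policy: alert once utilization reaches `alert_at`."""
    alert_at: float = 0.60  # the 60% threshold from the example

    def should_alert(self, utilization: float) -> bool:
        return utilization >= self.alert_at


def suggest_threshold(
    policy: ThresholdPolicy,
    samples: List[Tuple[float, float]],
    candidate: float = 0.75,
    max_response_ms: float = 200.0,  # assumed limit for "acceptable" response time
) -> float:
    """Suggest `candidate` if response times stayed acceptable while utilization
    sat between the current threshold and the candidate.

    `samples` holds (utilization, response_ms) pairs. With no supporting
    evidence, the current threshold is returned unchanged.
    """
    in_band = [ms for util, ms in samples if policy.alert_at <= util < candidate]
    if in_band and mean(in_band) <= max_response_ms:
        return candidate
    return policy.alert_at


@dataclass
class FixMemory:
    """Hypothetical store mapping an error signature to the fix that resolved it."""
    fixes: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def record(self, signature: str, fix: str, operator: str) -> None:
        self.fixes[signature] = (fix, operator)

    def suggest(self, signature: str) -> Optional[Tuple[str, str]]:
        # Returns (fix, operator who applied it), or None for an unseen error.
        return self.fixes.get(signature)


if __name__ == "__main__":
    policy = ThresholdPolicy()
    observed = [(0.62, 120.0), (0.70, 135.0), (0.74, 150.0), (0.80, 400.0)]
    policy.alert_at = suggest_threshold(policy, observed)
    print(policy.alert_at)

    memory = FixMemory()
    memory.record("db-connection-refused", "restart the connection pool", "Daniel")
    print(memory.suggest("db-connection-refused"))
```

In this sketch the operator stays in the loop: the function only proposes a new threshold, and someone still decides whether to apply it. A real system would presumably weigh far more signals than average response time.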
During the first phase of AIOps, capabilities focused on alert correlation. But the use cases are going to expand significantly in the coming years, changing how IT organizations support and improve business services. Artificial intelligence could also make the work more meaningful and fun, with less grunt work and an opportunity for operations professionals to make a more direct impact on the business.
Written by Ciaran Byrne, VP of Product Strategy for OpsRamp.