Top 10 Practical Uses for IT Automation


Runbook Automation systems empower organizations to design, build, orchestrate, manage, and report on workflows that support IT processes. Run Book Automation is not limited by IT infrastructure elements and acts as the connecting layer between disparate IT processes and user-friendly service desk tools. With Run Book, enterprise users can automate routine, repetitive operational processes, drive down costs, and remove complexity from the datacenter. The solution introduces greater accountability into the system, and furnishes organizations with tools to measure productivity and improvements in efficiency.


1.0 RBA is the Next IT Battleground

A study of past trends reveals that in the early nineties, businesses were more concerned about the management framework they needed to adopt in order to manage their IT infrastructure effectively and efficiently. While this need was met by packaged solutions like CA Unicenter, IBM Tivoli and HP OpenView, today the focus is on the configuration management database, or CMDB, as it is more popularly known. This unified repository of information helps organizations understand the relationships between the various components of their information system and track their configuration. The CMDB is a powerful tool, as it can serve as the foundation that ties together all the different processes within IT. It is a fundamental component of the IT Infrastructure Library (ITIL) framework's Configuration Management process.

Today’s rapidly evolving and highly competitive marketplace doesn’t allow enterprises to focus only on current requirements. To leverage their IT investments effectively and ensure maximum returns on the same, it has become imperative for organizations to gain a deeper understanding of emerging trends in IT and the developments that are likely to take place in the next 2-5 years.

The need to design, build, and report on workflows that support IT operations processes has become critical, and traditional IT management approaches, such as job scheduling products and custom scripting, are not adequate to meet this requirement. In light of the changed requirements and demands of the IT marketplace, a solution like Run Book Automation can deliver to enterprises what is seen as the need of the hour, that is… According to international market analyst firm Gartner, RBA is the next battleground for IT.


2.0 What is Run Book Automation

In a report published by Gartner in June 2006, Run Book Automation is defined as the ability to design, build, orchestrate, manage and report on workflows that support IT operations processes. It refers to products that help organizations employ workflows to automate operational tasks across IT management disciplines in support of IT management processes. This includes identifying manual tasks that are repetitive and error prone and moving them onto automated workflows, backed by reporting, auditing and enforcement. RBA leverages technology to replicate activities that are otherwise performed manually, especially routine and repetitive tasks. Thus RBA enables organizations to streamline their processes and make them more manageable.

In addition, there are several critical customer requirements that need to be taken into consideration. For instance, operators require a visual user interface that provides step-by-step directions through the various IT processes and procedures. RBA, therefore, leverages the visual medium to guide users through the various steps involved in triage, diagnosis, and other repetitive maintenance tasks.

Customers have different needs and processes, and hence they might want to initiate these automation workflows through different modes. Some customers may like to schedule certain activities on a periodic basis; at other times, they may want a particular event to automatically trigger a series of follow-up actions. Hence, the level of flexibility that a Runbook Automation system offers is very important.

Another key requirement is the ease with which workflows can be created. It is possible that a customer has previously leveraged scripting to create automated processes. However, scripting is a specialized skill set, and not many IT professionals have it. The customer therefore faces serious limitations when they need to broaden the base of people who can create automation workflows. This raises the need for a solution that is much easier and more intuitive to use than scripting.

Nonetheless, most customers have already invested significant sums into their legacy solutions. And this has to be leveraged to justify those investments. This makes it necessary to identify techniques that would enable incorporation of existing scripts into any new solutions that are likely to be implemented. Additionally, most customers also insist on leveraging their existing knowledge and expertise.

Integration with the existing solution at the customer’s end is highly important especially because millions of dollars have already been invested in these management frameworks and CMDBs. The challenge is in taking these investments and adding Runbook Automation to the mix.

Customers also demand capabilities like additional filtering, ability to change different automation processes, and leverage existing processes to form blocks that can be put together to create more complex processes. This forms an important aspect of the expectations customers have from RBA systems.

Other than this, mundane but critical aspects of running processes, like reporting and documentation, need to be automated. In addition to providing highly sophisticated and detailed reporting, RBA systems should also be able to create documentation on the fly.

This not only brings about greater operational efficiency but also ensures improvements in the quality of output while cutting down on the time generally spent on completing the task. It also enables an organization to leverage its resources to provide higher-level analytics and functioning, in order to ensure continuity and holistic delivery of services.

Security is another key concern for most organizations. Customers are often faced with a dilemma when they have to delegate certain tasks to different individuals. Though automation doesn’t do away with manual involvement, it puts in place processes to identify resources involved with restarting and stopping a service. Greater security is introduced into the system as manual involvement is now more role-based, which demands greater accountability from each individual user.


3.0 Why RBA is Becoming a Key Initiative

The sudden surge in demand for Run Book Automation systems is mainly driven by two factors. First, IT departments of most enterprises, small and large, are under tremendous pressure to show tangible return on investment. With no significant increase in their budgets or resources, IT departments are expected to draw a clear chart highlighting the benefits IT brings to the organization and prove improvements in service levels. This calls for mapping the various processes, followed by detailed auditing and reporting.

The second factor responsible for the increase in the demand for Run Book Automation systems is the need for greater control over the datacenter. The adoption of the Information Technology Infrastructure Library (ITIL) has given an impetus to the maturing of various operational processes. This is essential if an organization wants to maintain a predictable, repeatable and streamlined environment in its datacenters.

Today, the expectation is that IT should be able to support more and more systems and applications when the demand arises. This, in today’s environment, can be a daily occurrence. Internally, organizations are identifying the means to get more out of their current IT investments. This calls for driving greater efficiency into existing processes, identifying resources that can be freed from performing banal, iterative work to more resource intensive ones.  In light of these expectations, automation seems to be the only solution for taking existing efficiencies to the next level.

The actual availability of technology that can satisfy these internal requirements has also provided a boost to the rising demand for RBA systems. Traditional automation methods like custom coding and scripts lack best practices, change management, documentation and flexibility. However, this is imperative in an operations environment, where business rules and configuration settings change frequently.

Implementing best practices is achieved primarily by defining and automating IT operations management processes. Though most data center tools also provide evolved automation capabilities, they do not automate processes between applications. Thankfully, technology has now matured and is in a position to tackle the need that has arisen internally. Today, sophisticated IT operations management platforms and tools are available that can enable a customer to deploy a Runbook Automation solution and experience its benefits in a live environment.

No wonder then that Gartner feels that, through 2012, Runbook Automation will have the highest RoI of any IT initiative you can undertake today, because automation is the direction for IT operations going forward.


4.0 Runbook Automation and IT Infrastructure Library (ITIL)

The ITIL framework outlines best practices for all IT activities. The service support areas of ITIL including incident, problem, configuration, change, and release management make up the daily operational tasks within IT.

With ITIL, the most important processes are under service support and service delivery. RBA ties these different processes together: from incident and problem management to change, release and configuration management, and even to mobility and capacity management. These processes can be integrated, and then, leveraging RBA, the release and change processes can be automated. Prior to RBA, putting such processes in place would have involved investment in additional resources. Automation, however, accomplishes a much more effective result using minimal resources.


5.0 Leveraging Existing Investments While Filling a Critical Gap

RBA is a layer that enables an organization to leverage its existing system management tools, including monitoring systems and event consoles. The solution acts as a connecting layer between an organization's monitoring solution and its service desk solutions. Inputs from the monitoring solution are fed into the RBA system. When the conditions coded into an alert ID are fulfilled, an automated workflow is triggered. Since the results are fed into the ticketing system, the workflow provides detailed auditing and tracking without relying on manual data input. Thus RBA acts as a layer that sits between an organization's system management, system monitoring and service desk tools.

For instance, iConclude’s OpsForce Central is a web-based application where Tier 1 system administrators based in the Network Operating Center (NOC) can find the common repairs published by iConclude, or the workflow automations an organization has created to meet the requirements of its specific environment.
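The connecting-layer idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular RBA product works: the Ticket class, the WORKFLOWS registry, the workflow decorator and the DISK_FULL alert ID are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Ticket:
    """Hypothetical stand-in for a service desk ticket."""
    alert_id: str
    audit_log: List[str] = field(default_factory=list)


# Hypothetical registry mapping alert IDs to workflow functions.
WORKFLOWS: Dict[str, Callable[[dict, Ticket], None]] = {}


def workflow(alert_id: str):
    """Register a function as the workflow for a given alert ID."""
    def register(func):
        WORKFLOWS[alert_id] = func
        return func
    return register


@workflow("DISK_FULL")
def free_disk_space(alert: dict, ticket: Ticket) -> None:
    # Each step is written to the ticket, so no manual data entry is needed.
    ticket.audit_log.append(f"checked disk usage on {alert['host']}")
    ticket.audit_log.append("archived old log files")


def handle_alert(alert: dict) -> Ticket:
    """Open a ticket, run the matching workflow, and record each step."""
    ticket = Ticket(alert_id=alert["id"])
    flow = WORKFLOWS.get(alert["id"])
    if flow is None:
        ticket.audit_log.append("no workflow registered; escalated to operator")
    else:
        flow(alert, ticket)
    return ticket
```

Calling handle_alert({"id": "DISK_FULL", "host": "web01"}) returns a ticket whose audit log lists the steps the workflow took, which is the audit trail that would otherwise be typed in by hand.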

In addition to these, iConclude also provides accelerator packs for all kinds of common infrastructures like Linux, Unix, Solaris, Databases, web servers and networking. iConclude’s solution covers all the various platforms as well as situations that can occur across the vast majority of datacenters.

A study of these workflows and diagnostics points to the fact that typically customers prefer to initiate them via any of the three possible modes listed here. In the ‘guided Run’ mode the operator running the workflow has the ability to proceed in a step-by-step manner and ascertain exactly what’s going on.

Another option, and which is emerging as the preferred one, is the automatic mode. Here the workflow is integrated into a monitoring or a service desk product whereby whenever an alert or ticket is created an automatic workflow is put into motion. The automatic workflow process follows the triage, diagnose and repair process wherein the system first gauges the problem from a criticality point of view and then assigns the next step accordingly.  In routine cases, the system might even carry out the repair itself doing away with the need to escalate it further.

The third option is running the diagnostics for system maintenance at pre-scheduled intervals depending on whether the customer wants to opt for a daily or weekly or another pre-decided cycle.


6.0 Common RBA Use Cases

Below we have provided the common Runbook Automation cases that we have encountered, and which customers are likely to face during the course of automating various business processes.

Out of The Box Repair

OpsForce Central includes critical features like an out of the box repair solution, which is a diagnostic around Windows Server. For instance, a technician attempting to isolate a problem in a Windows datacenter would at first try to determine whether it is a network connectivity problem, or a problem related to CPU usage or a lack of memory or disk space. During the course of these tests, an audit trail is formed, which is really the key. A lot of enterprises with very large datacenters are struggling with factors like compliance issues. For instance, those in the healthcare industry are struggling with Sarbanes-Oxley or HIPAA compliance, which imposes in-depth audit requirements, even though they have invested heavily in documenting their procedures.

Automation tackles the repetitive processes successfully and at the same time enables users to audit the workflow. This helps ensure that the processes that were earlier documented are being strictly adhered to. The audit feature enables an organization to track which piece of infrastructure was touched, identify the individual who touched it, and see the details of the interaction and the data that was returned. This has proved highly helpful to these companies, as it has been identified as one of the best ways to attack the audit requirements laid down by HIPAA or SOX. The solution also raises an alarm whenever it detects potential problems and then offers users a diagnostic panel that lists the potential problems in the system.

This is a common use case that can be run either manually or automatically if an organization is affected by poor server performance. But in today’s datacenters, where enterprises are looking at large server farms to service an application, they may need to analyze things at an application level instead of server by server.

An organization that has implemented application level monitoring or service level monitoring in its environment and has some monitors looking at web pages may get the feedback that certain key web pages are taking much longer to load than expected. These kinds of issues can be challenging because a single web page is probably being served through a load balancer by a large number of web servers. Using dedicated resources to check when the pages slow down is impractical today.

However, using automation technology, you can dynamically query the load balancer to find out the IP addresses of the servers that are servicing a particular application. Information on these servers can then be garnered on a real time basis, and analyzed to identify potential problems with the server. Automation would take that information and run it through the workflow that has been implemented within the ticketing environment and escalate it through the workflow. While all the work has been done automatically, the data that has been gathered will be placed into the ticket. This would provide a technician who looks at the ticket with all the necessary information.
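As a rough sketch of this triage step, the function below asks a load balancer for the servers behind an application, collects a metric from each, and builds the notes that would be placed into the ticket. The list_servers and collect_metrics callables and the 90 percent CPU limit are assumptions made for the example; in practice they would wrap whatever load balancer and monitoring interfaces an organization already has.

```python
from typing import Callable, Dict, List


def diagnose_application(
    app: str,
    list_servers: Callable[[str], List[str]],
    collect_metrics: Callable[[str], Dict[str, float]],
    cpu_limit: float = 90.0,
) -> List[str]:
    """Return ticket notes describing every server behind an application."""
    notes = []
    for ip in list_servers(app):        # dynamic lookup via the load balancer
        metrics = collect_metrics(ip)   # real-time data for that server
        status = "SUSPECT" if metrics["cpu_percent"] > cpu_limit else "ok"
        notes.append(f"{ip}: cpu={metrics['cpu_percent']:.0f}% [{status}]")
    return notes
```

Passing the two lookups in as parameters keeps the sketch independent of any specific vendor interface and makes it easy to try out with canned data.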

The manual process of information gathering, which can generally take 10-15 minutes for one server and more if there are multiple servers, can now be done automatically and the key information placed into the ticket. Often, when people commence with automation, doing this repetitive triage and diagnosis proves a great way to get quicker time to value.

Virtualization

Another example is virtual machine management. Virtualization has been gaining ground in almost all large datacenters. However, most enterprises are finding it a challenge to manage virtual machines. A financial services company, for instance, was looking at ways to automate its daily business cycle. During the trading day, it needed to dedicate a large portion of its hardware resources to servicing its trading applications. As the trading day came to a close, those resources needed to be automatically reprovisioned so that other applications could be brought into play.

Automating this process helps identify the virtual machines running trading applications that are not heavily utilized and shut those virtual machines down. If there are VMs that are still heavily utilized, it keeps those running, as it can take them out of service later. It then goes through all the hosts and identifies ones where it can provision the other virtual machines. It then looks at various characteristics of those hosts to make sure there is enough spare CPU and memory to bring the new virtual machines online. This is a common repetitive task that can be done on a daily basis to optimize the operational environment.
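The two decisions in this end-of-day routine can be sketched as follows. The VM and Host classes, the 10 percent idle threshold and the first-fit placement rule are illustrative assumptions, not details of the financial services company's setup.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VM:
    name: str
    cpu_percent: float  # current utilization of the VM


@dataclass
class Host:
    name: str
    free_cpu: int      # spare CPU cores
    free_mem_gb: int   # spare memory in GB


def vms_to_shut_down(trading_vms: List[VM], idle_below: float = 10.0) -> List[str]:
    """Pick trading VMs that are lightly used; busy ones keep running for now."""
    return [vm.name for vm in trading_vms if vm.cpu_percent < idle_below]


def place_vm(hosts: List[Host], need_cpu: int, need_mem_gb: int) -> Optional[str]:
    """Return the first host with enough spare CPU and memory, reserving it."""
    for host in hosts:
        if host.free_cpu >= need_cpu and host.free_mem_gb >= need_mem_gb:
            host.free_cpu -= need_cpu
            host.free_mem_gb -= need_mem_gb
            return host.name
    return None  # no host has capacity; the workflow would escalate here
```

A scheduler would call vms_to_shut_down at market close, release those VMs, and then call place_vm once for each application that needs to come online.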

In ITIL, automation would help organizations garner information about their environment. This provides the ability to start doing data mining, and business intelligence gathering to help move from plain incident management to problem management. The difference essentially is that while the former is a reactive approach to problem solving the latter helps the datacenter operator or engineer take more proactive measures.

Configurable Dashboard

iConclude’s configurable dashboard helps configure charts around various dimensions and provides a visual representation of the alerts that are flowing into the datacenter, whether they are infrastructure style alerts like CPU thresholds or application style alerts like slow page loads. Touching these applications for repair helps gather drill down capabilities, which shows the configuration items that are causing the most problems; and also the actions being taken to solve them. Based on the problem area, this information can then be driven back to the respective teams be it the development team or the capacity management or capacity planning units. Thus the information gathered by doing incident management can now be leveraged to enable more proactive problem management.


OpsForce Studio

iConclude’s OpsForce Studio helps create and modify automations. If an enterprise has a large number of different backup devices to manage and failures occur on them, dealing with the flood of log files poses a serious challenge. OpsForce Studio can automate the process of analyzing those log files and restarting backups on the particular servers that were having problems. If the servers are unable to back up due to lack of disk space, the system can archive some files offline, thus enabling the backup jobs to run without incident. Here automation goes through backup log files, identifies errors, analyzes server loads, checks disk capacity on those servers and then takes the appropriate action.

RBA can also automate periodic checks of an organization's network infrastructure to ensure that the devices are running smoothly and are up to date on firmware. By dynamically going through the routing infrastructure, the automation can verify firmware versions and then create tickets to have outdated devices updated.

Runbook Automation can help glue ITIL disciplines like change management, release management and configuration management together. Here automation touches change management, waiting for a particular change request to get approved. Once the change request is approved, the automation can remove the indicated server from the cluster, drain the transactions that were linked to that server, take it out of monitoring, hook into the customer’s provisioning software, install the particular patch, reboot the server if necessary, bring it back into monitoring, bring it back into the cluster, and then update the ticket.
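A minimal sketch of the backup log analysis is shown below. The log line format, the 20 GB free-space threshold and the action names are all assumptions made for illustration; real backup products write their own formats, and this is not OpsForce Studio code.

```python
import re
from typing import Dict, List

# Assumed log line format: "<server> BACKUP FAILED: <reason>"
FAILURE = re.compile(r"^(?P<server>\S+) BACKUP FAILED: (?P<reason>.+)$")


def plan_backup_repairs(log_lines: List[str], free_gb: Dict[str, float],
                        min_free_gb: float = 20.0) -> Dict[str, List[str]]:
    """Map each failed server to the ordered actions the workflow would take."""
    plan: Dict[str, List[str]] = {}
    for line in log_lines:
        match = FAILURE.match(line.strip())
        if not match:
            continue  # not a failure line
        server = match.group("server")
        actions = []
        # Free up space first when the disk is too full for the backup to run.
        if free_gb.get(server, 0.0) < min_free_gb:
            actions.append("archive old files offline")
        actions.append("restart backup")
        plan[server] = actions
    return plan
```

The function only produces a plan; keeping the decision separate from the execution makes it easy to review what the automation intends to do before it touches a server.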
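The patching sequence described here is essentially an ordered list of steps behind an approval gate, which can be sketched as below. The step names paraphrase the text, and the execute callback is a hypothetical hook standing in for whatever cluster, monitoring and provisioning tools are actually in use.

```python
from typing import Callable, List

# Ordered steps, paraphrased from the change workflow described above.
PATCH_STEPS = [
    "remove server from cluster",
    "drain transactions",
    "suspend monitoring",
    "install patch",
    "reboot if required",
    "resume monitoring",
    "return server to cluster",
    "update ticket",
]


def run_patch_change(approved: bool, execute: Callable[[str], bool]) -> List[str]:
    """Run the patch steps in order once the change is approved.

    Stops at the first failing step and returns the steps that completed."""
    if not approved:
        return []  # nothing happens until the change request is approved
    done = []
    for step in PATCH_STEPS:
        if not execute(step):
            break
        done.append(step)
    return done
```

Returning the completed steps gives the ticket an exact record of how far the change progressed if something fails partway through.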

Automations should be conceptually easy to use and have reusable sub components. This makes it easier to drill into one of these, and see that this in itself is another automation, and each of these are in themselves sub automations. Therefore by building things in a hierarchical manner, you have these reusable components that make automations scalable. This can prove critical in disaster recovery.

Even if an organization has monitoring software that incorporates the ability to restart a service, it is highly unlikely to offer the level of sophistication provided by RBA. Take the example of an organization that requires a service restart. An SQL query gets the information about the machine, the workflow verifies whether the particular service is running, and if it is, an email is sent to report that there is no problem.

This is necessary because there are transient events that are sometimes termed false alerts: they might show up on the monitoring software and then go away. But if there is a problem, the workflow has to create a trouble ticket, add additional information to it, send an email to an escalation person to report that a trouble ticket has been created, try to restart the service automatically, and verify that the restart succeeded. If it did, the workflow updates the ticket, updates the database that had the information about the trap, and then sends an updated case email. If the agent didn’t successfully restart the service, the workflow has to update the ticket with a failed notice, run another SQL query to find out whom to escalate to, and then escalate to the concerned person. If we drill into things that seem conceptually simple, like restarting a service, even that requires true process automation.

Another use case that is frequently seen is in the area of clustered systems and load balancers. For instance, if an online service provider wants to take a few servers offline, it has to ensure that there are enough other servers to handle the load at that time. Automation would examine the server pools, determine how many nodes are currently available to service a particular application, check that against thresholds and only disable nodes if there are enough other servers available to handle the load. This gives the ability to automatically manage the environment based on current conditions.
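The branching in this restart workflow can be sketched as a single function. The is_running, restart and notify callables are hypothetical hooks for the monitoring agent and mail system, and the database and SQL lookups mentioned in the text are left out to keep the example short.

```python
from typing import Callable, List


def handle_service_alert(
    is_running: Callable[[], bool],
    restart: Callable[[], None],
    notify: Callable[[str], None],
) -> List[str]:
    """Run the restart workflow and return the ticket history it produced."""
    if is_running():
        # Transient event: the service is healthy, so no ticket is needed.
        notify("service is running; no problem found")
        return []

    history = ["ticket created"]
    notify("trouble ticket created")
    restart()
    if is_running():  # verify, rather than assume, that the restart worked
        history.append("restart succeeded")
        notify("service restarted; ticket updated")
    else:
        history.append("restart failed; escalated")
        notify("restart failed; escalating")
    return history
```

Even in this reduced form there are three distinct outcomes (false alert, repaired, escalated), which is why a one-line restart command in a monitoring tool does not cover the process.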
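The threshold check can be illustrated with a short function. The node names and the min_in_service parameter are assumptions for the example; a real workflow would read the pool state from the load balancer.

```python
from typing import List


def nodes_safe_to_disable(healthy: List[str], requested: List[str],
                          min_in_service: int) -> List[str]:
    """Return the requested nodes that can go offline without dropping
    the pool below min_in_service healthy nodes."""
    remaining = len(healthy)
    approved: List[str] = []
    for node in requested:
        already_counted = node in approved
        if node in healthy and not already_counted and remaining - 1 >= min_in_service:
            approved.append(node)
            remaining -= 1
    return approved
```

With four healthy nodes and a minimum of two in service, a request to disable three nodes is only partly granted: the first two are approved and the third stays online.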


7.0 Getting Started with RBA

Before investing in an RBA solution, it is necessary to analyze and identify your key business requirements. An organization needs to first determine the most common alerts and incidents that could be automated, which in turn would provide maximum RoI. It is, however, important to set realistic objectives and goals. It is imperative to plan each minor detail to ensure that you derive maximum benefit. This includes the IT strategy that you plan to follow, the platform you intend to adopt and the tool vendor.

After these initial needs have been identified, an organization needs to plan out the workflow design. The top five or 10 alerts that have been identified need to be documented, and the common steps needed to remediate them should be enumerated. Once these initial steps have been completed, all an organization needs to do is design and develop the automation flow, which is a straightforward process using the RBA system's visual user interface. The next step is to pilot and implement the initial automation flows. For the pilot project, it is advisable to stick to a few selected processes and expand to other processes, domains and groups once these are running smoothly.
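Finding the top alerts is a simple counting exercise once the alert history has been exported from the monitoring or ticketing system. The sketch below assumes the history is available as a plain list of alert type names, which is an assumption about the export format.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_alerts(alert_types: Iterable[str], n: int = 5) -> List[Tuple[str, int]]:
    """Return the n most frequent alert types with their counts."""
    return Counter(alert_types).most_common(n)
```

The alert types at the head of this list are the natural candidates for the pilot, since automating them removes the largest share of repetitive work.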


8.0 Keys to Ensuring RBA Success

There are three key factors that need to be adhered to in order to ensure successful implementation of an RBA system.

An organization needs to understand the complexity of its infrastructure. It is important to understand the requirements of all the different stakeholders in the initial analysis. While analyzing process requirements, it is important for an organization to plan for the future. While the needs may not seem many at the moment, rapid growth can change that in a very short span of time. This might then create pressure to move to a more robust and scalable platform. Hence it is imperative to ensure that an integration plan is in place throughout the course of a Runbook Automation project.

During the initial trial phase, organizations should select processes that are likely to demonstrate quicker RoI. It is advisable to avoid highly complex processes. The complex processes may be highly important to the organization and might also have greater visibility. However, the likelihood of errors is higher during the initial trial or learning phase. Hence, going in for a less complex process will enable you to demonstrate the success of the system much more easily and in the process provide more buy-in for an organization-wide roll out.

Enterprises should adopt a phased rollout strategy that will enable them to take a more proactive approach instead of being reactive. Closely consider the auditing and data coming out of the incident problem management process, and then proceed to a more predictive mode. Leverage Runbook Automation products to automate these processes and integrate all existing tools and processes. This will enable you to predict where your datacenter needs are headed.


9.0 Gartner Recommendations 
  • Ensure you understand Runbook automation and make sure whatever you are doing is in line with the service level and business priority you have identified together with your business counterparts.
  • Make your initial project very narrow in scope so you can deliver very tangible benefits.
  • Set clear objectives.
  • Ensure your process requirements are in line with your current operational tools. Select the right tool set with the right features that are going to support your most flexible set of needs.
  • Make sure that integration is taken into consideration in the initial project, because you never want to implement this in a vacuum. Understand the different integration points, and where you want to hit in the initial target. You probably don’t want to integrate with everything in your pilot, but you definitely want to hit one or two key tools that are currently a backbone in your datacenter.
  • Obtain full support across the organization – talk to different stakeholders.
  • Put in place well defined processes and take it to the next level with Runbook Automation.

