Monday, July 8, 2019

Incident Response - Manually Tracked? Not Any More... Vol 8 rel 4


The Landscape

When incident detection and resolution cannot be automated, incidents need to be acknowledged, assigned, diagnosed, and resolved quickly to keep downtime to a minimum. A commonly cited average cost of IT downtime is $5,600 per minute. Per hour, estimates range from about $140,000 at the low end to $300,000 on average and as much as $540,000 at the high end.
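
To make the per-minute figure concrete, here is a minimal sketch that converts it into an hourly cost and estimates the cost of a single outage. Only the $5,600 figure comes from the text above; the function name and the 45-minute outage are illustrative assumptions.

```python
COST_PER_MINUTE_USD = 5_600  # average per-minute figure quoted above


def downtime_cost(minutes: float, cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Return the estimated cost in USD of an outage lasting `minutes`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


hourly = downtime_cost(60)
print(f"One hour of downtime: ${hourly:,.0f}")             # $336,000
print(f"A 45-minute outage:   ${downtime_cost(45):,.0f}")  # $252,000
```

At $5,600 per minute, one hour works out to $336,000, which sits close to the hourly average quoted above.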

Legacy technologies and manual processes cause delays, mistakes, and miscommunication from IT service desks, wasting resources and time. While this is happening, leadership wants to know what’s going on every step of the way, often causing further interruptions in the resolution process. 

Teams continue to adopt and tune the Information Technology Infrastructure Library (ITIL) framework to suit their needs, and to leverage new technologies to respond better to incidents when they occur. Within ITIL, incident management is a process that organizations apply as part of their Information Technology Service Management (ITSM) strategy to reduce mean time to resolution (MTTR), provide better customer service, and deliver more value back to the business. 

This framework can be the foundation for an organization to set a solid baseline for planning, implementation and data analysis, helping IT teams align their services to the needs of the business. Within ITIL, the incident management process can help teams reduce downtime; however, the right technology needs to be in place. 

While there are numerous new technologies that make it easier to communicate, collaborate and automate responses, many challenges still remain. 

Correlating events from multiple monitoring tools across several systems simultaneously can easily overwhelm any team. Most alerts are noise, so teams need to reduce that noise to improve the signal-to-noise ratio and keep critical issues visible. As teams move to continuous integration and deployment models, the state of systems can become more difficult to determine. Collaboration within and across teams remains difficult until communication is supported by shared processes and tools. 
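
One common way to cut alert noise is to collapse repeated alerts for the same source and check inside a time window. The sketch below illustrates the idea only: the `Alert` fields, the five-minute window, and the host names are assumptions, and real monitoring tools ship their own grouping rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    timestamp: datetime
    source: str  # host or service name (hypothetical field)
    check: str   # e.g. "cpu" or "disk" (hypothetical field)


def deduplicate(alerts: list[Alert], window: timedelta = timedelta(minutes=5)) -> list[Alert]:
    """Keep an alert only if no alert with the same (source, check)
    was kept within the preceding `window`."""
    last_kept: dict[tuple[str, str], datetime] = {}
    kept: list[Alert] = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.source, alert.check)
        previous = last_kept.get(key)
        if previous is None or alert.timestamp - previous >= window:
            kept.append(alert)
            last_kept[key] = alert.timestamp
    return kept


t0 = datetime(2019, 7, 8, 9, 0)
raw = [
    Alert(t0, "web-1", "cpu"),
    Alert(t0 + timedelta(minutes=1), "web-1", "cpu"),  # repeat inside window, suppressed
    Alert(t0 + timedelta(minutes=2), "db-1", "disk"),
    Alert(t0 + timedelta(minutes=7), "web-1", "cpu"),  # outside window, kept
]
signal = deduplicate(raw)
print(len(raw), "->", len(signal))
```

The window length is the main tuning knob: too short and duplicates leak through, too long and a recurring problem may look like a single event.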


Common Information Technology Service Management (ITSM) pain points
  • Working in multiple systems leads to double-work, context switching, and a higher incidence of manual error.
  • Alert overload causes IT management and support engineers to miss big picture incidents while sifting through the noise of tickets and alerts.
  • Configuration management becomes more and more difficult to track with continuous integration / continuous deployment.
  • Communication and visibility often bottleneck otherwise streamlined processes.

The Incident Lifecycle. Overall goal: minimize the recovery time of each incident and drive down MTTR over time.
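
Driving MTTR down starts with measuring it. A minimal sketch, assuming each incident is recorded as an (opened, resolved) pair of timestamps; the sample data is made up for illustration.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Average of (resolved - opened) over all resolved incidents."""
    if not incidents:
        raise ValueError("need at least one resolved incident")
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


incidents = [
    (datetime(2019, 7, 1, 9, 0), datetime(2019, 7, 1, 9, 30)),    # 30 minutes
    (datetime(2019, 7, 2, 14, 0), datetime(2019, 7, 2, 15, 30)),  # 90 minutes
    (datetime(2019, 7, 3, 8, 0), datetime(2019, 7, 3, 9, 0)),     # 60 minutes
]
mttr = mean_time_to_resolve(incidents)
print(mttr)  # 1:00:00
```

Tracking this number per week or per month shows whether process changes are actually shortening the lifecycle.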

Steps to shorten the incident lifecycle

Identify and log the incident 

An incident can come from anywhere, at any time. Regardless of the incident’s source, it is important to remember the first two steps in incident management: identify and log. If a customer calls in an incident or reports it via email, a service desk agent logs it into the system. Incident logs, or tickets, should include the name of the person reporting the incident, a description of the incident, and a unique identification number for tracking purposes. Tight two-way integrations between monitoring tools and service desk systems can help automate these steps, saving time and improving accuracy. Ideally, incidents are resolved before customers are impacted or take notice. 
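
The three ticket fields named above can be captured in a small record. This is a sketch, not any particular service desk's schema: the class name, the `logged_at` field, and the use of a random UUID as the tracking number are all assumptions.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class IncidentTicket:
    reporter: str      # name of the person reporting the incident
    description: str   # what went wrong
    # Unique tracking ID; a random UUID is one simple way to generate it.
    ticket_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    logged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def log_incident(reporter: str, description: str) -> IncidentTicket:
    """Validate the required fields and create a ticket."""
    if not reporter.strip() or not description.strip():
        raise ValueError("reporter and description are required")
    return IncidentTicket(reporter=reporter.strip(), description=description.strip())


ticket = log_incident("Jane Doe", "Cannot log in to the customer portal")
print(ticket.ticket_id, ticket.reporter)
```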


Categorize 

Teams should categorize incidents systematically. Failure to categorize properly could cause issues to be routed to the wrong people, delaying resolution. It could also negatively impact data analysis, which is important for improving incident management in the future.

Figure: Incident lifecycle (Incident Detection & Recording, Initial Support, Investigation & Diagnosis, Resolution & Recovery, Incident Closure)
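
Systematic categorization can be as simple as a fixed category list plus a routing table. The sketch below uses keyword matching purely for illustration; the categories, keywords, and team names are hypothetical, and anything unmatched falls back to the service desk instead of being routed to the wrong team.

```python
# Category -> owning team. All names here are hypothetical.
ROUTING = {
    "network": "network-ops",
    "database": "dba-team",
    "application": "app-support",
}
# Category -> keywords that suggest it (deliberately tiny example set).
KEYWORDS = {
    "network": ("vpn", "dns", "latency"),
    "database": ("sql", "replication", "deadlock"),
    "application": ("login", "checkout", "error 500"),
}
DEFAULT_CATEGORY = "uncategorized"
DEFAULT_TEAM = "service-desk"


def categorize(description: str) -> str:
    """Return the first category whose keyword appears in the description."""
    text = description.lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return DEFAULT_CATEGORY


def route(description: str) -> tuple[str, str]:
    """Return (category, team) for an incident description."""
    category = categorize(description)
    return category, ROUTING.get(category, DEFAULT_TEAM)


print(route("VPN drops every few minutes"))  # ('network', 'network-ops')
print(route("Printer is out of toner"))      # ('uncategorized', 'service-desk')
```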

Prioritize 
Service agreements typically inform response and resolution targets. To determine how critical an issue actually is and how much damage it could do, answer several questions before prioritizing an incident: How many people will be impacted? What is the financial implication? Are we compliant with requirements? How much worse could this issue get if left unaddressed? Once these questions are answered, prioritize the incident against all other open incidents in the system. 
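
The four questions above can be turned into a simple score. This is a sketch under stated assumptions: the thresholds, weights, and P1 to P4 labels are invented for illustration, and a real scheme should come from your own service agreements.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    users_impacted: int          # How many people will be impacted?
    revenue_at_risk_usd: float   # What is the financial implication?
    compliance_breach: bool      # Are we out of compliance?
    likely_to_worsen: bool       # Could it get worse if left unaddressed?


def priority(a: Assessment) -> str:
    """Map an assessment to a priority label (illustrative thresholds)."""
    score = 0
    if a.users_impacted >= 1000:
        score += 2
    elif a.users_impacted >= 50:
        score += 1
    if a.revenue_at_risk_usd >= 100_000:
        score += 2
    elif a.revenue_at_risk_usd >= 10_000:
        score += 1
    score += 2 if a.compliance_breach else 0
    score += 1 if a.likely_to_worsen else 0
    if score >= 5:
        return "P1"
    if score >= 3:
        return "P2"
    if score >= 1:
        return "P3"
    return "P4"


open_incidents = {
    "INC-1": Assessment(5, 0, False, False),
    "INC-2": Assessment(2000, 250_000, False, True),
    "INC-3": Assessment(120, 15_000, False, True),
}
# Order the open queue so the most critical incident comes first.
queue = sorted(open_incidents, key=lambda k: priority(open_incidents[k]))
print(queue)  # ['INC-2', 'INC-3', 'INC-1']
```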


Respond 

Incident response can be broken down into five steps once the incident has been identified, categorized and prioritized. 


Initial Diagnosis - This is the first real activity once an incident is logged. A service desk agent quickly evaluates the likely cause and attempts to fix it. This is where a knowledge base can be incredibly beneficial.


Incident Escalation - The incident is escalated when the front-line service desk agent cannot fix the issue. The agent collects all details of the incident and escalates to the appropriate level, giving the higher tiers of support the information they need to resolve the incident quickly. Automation can help locate the on-call person at the next level and open a chat room or another communication channel. System integrations can move relevant information to where the on-call resource can access it easily, and can post status updates for internal stakeholders and customers, minimizing interruptions and easing concerns. 
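
Locating the on-call person at the next level is easy to automate once the roster is data. A minimal sketch, assuming a hypothetical in-memory roster and a plain dictionary for the incident; real on-call tools expose this through their own scheduling features.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Responder:
    name: str
    tier: int       # support level: 1 = front line, 2 and 3 = higher tiers
    on_call: bool


def next_responder(roster: list[Responder], current_tier: int) -> Optional[Responder]:
    """Return the on-call responder at the lowest tier above `current_tier`."""
    candidates = [r for r in roster if r.on_call and r.tier > current_tier]
    return min(candidates, key=lambda r: r.tier, default=None)


def escalate(incident: dict, roster: list[Responder]) -> dict:
    """Hand the incident, with all collected details, to the next tier."""
    responder = next_responder(roster, incident["tier"])
    if responder is None:
        raise LookupError(f"no on-call responder above tier {incident['tier']}")
    return {**incident, "tier": responder.tier,
            "assignee": responder.name, "status": "escalated"}


roster = [
    Responder("Ana", 2, on_call=False),
    Responder("Ben", 2, on_call=True),
    Responder("Chris", 3, on_call=True),
]
incident = {"id": "INC-42", "tier": 1, "notes": "restarted service, no change"}
escalated = escalate(incident, roster)
print(escalated["assignee"], escalated["tier"])  # Ben 2
```

Note that the notes collected by the front line travel with the incident, so the next tier does not have to start from scratch.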

Investigation and Diagnosis - As a best practice, this step should continue throughout the course of the incident. After escalation, the incident moves into the investigation and diagnosis phase, with level-two and level-three support involved as needed. 

Resolution and Recovery - Recovery time is the amount of time it takes for operations to be completely restored. Keep in mind that patches and bug fixes may be required even after the resolution or workaround has been identified. 

Incident Closure - Once service has been restored to its regular state, the incident is sent back to the service desk for closure. To maintain quality assurance, only service desk agents are allowed to close incidents. The incident owner should always check with the person who reported the incident to confirm the resolution before proceeding with closure. System integrations can help service agents automate this step, including posting status updates for internal stakeholders and customers.
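
The closure rules above are simple enough to enforce in code. A minimal sketch, assuming hypothetical role names and incident fields (`status`, `reporter_confirmed`); it only illustrates the checks, not any particular tool's workflow engine.

```python
def can_close(incident: dict, actor_role: str) -> tuple[bool, str]:
    """Check the closure rules and return (allowed, reason)."""
    if actor_role != "service_desk_agent":
        return False, "only service desk agents may close incidents"
    if incident.get("status") != "resolved":
        return False, "incident is not resolved yet"
    if not incident.get("reporter_confirmed", False):
        return False, "reporter has not confirmed the resolution"
    return True, "ok"


incident = {"id": "INC-42", "status": "resolved", "reporter_confirmed": False}
print(can_close(incident, "service_desk_agent"))
# (False, 'reporter has not confirmed the resolution')

incident["reporter_confirmed"] = True
print(can_close(incident, "engineer"))
# (False, 'only service desk agents may close incidents')
print(can_close(incident, "service_desk_agent"))
# (True, 'ok')
```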

Most Popular Incident Management Software

Listed below are the most popular incident management tools currently trending in the market.



#1) Jira Service Desk

Jira Service Desk is a very popular service desk platform built for IT and business service desks and customer service teams. It helps deliver end-to-end service to clients.


Jira Service Desk is built on top of the Jira platform, so it works well with Jira Software. It performs well with agile teams because it was designed for collaboration. Jira also provides customizable templates.


Jira comes with many robust and reliable features, which is why many companies use it as their main bug-tracking tool. Jira also simplifies the ways in which a client can contact the organization.
Refer to the figure: Architecture Diagram of Jira Service Desk

  • Developed by: Atlassian.
  • Type: Commercial.
  • Head Quarters: Sydney, Australia.
  • Year Founded: 2002.
  • Stable Release: 7.12.0
  • Based on Language: Java.
  • Operating Systems: Cross Platform.
  • Device Supported: Windows, iPhone, Android.
  • Deployment Type: Cloud-Based, On Premise, Open API.
  • Language Support: English.
  • Price: US $10 – US $20 per month depending on the number of agents.
  • Annual Revenue: Approx. US $620 Million and growing
  • Number of Employees: Approx. 2300 employees are working currently.
  • Users: Leidos Holdings Inc., Macmillan Learning, DRT Strategies, Inc., Sounds True, Inc., Billtrust, Capgemini, Domino's, Chef, Dice, Fresh, etc.

Features:

  • It supports automation and provides Jira Software integration and a customer portal.
  • It integrates with Confluence and offers machine learning, an API, and self-service.
  • It supports real-time updates, a knowledge base, and SLAs.
Pros:
  • Powerful, and extensible with a good implementation.
  • Automated mail triggering to the concerned person for tasks.
  • A raised defect serves as a single point of reference for testers and developers.
  • All information about a defect is available in the portal, so less documentation is needed.
Cons:
  • Because the portal has so many features, it is difficult to understand at first.
  • Email notifications in Jira are sometimes very slow because of signatures and attachments.
  • Interface design can be improved.

#2) Mantis BT

Mantis BT is a well-known open source, web-based bug-tracking tool built to meet client requirements. It is simple and easy to set up.


Mantis BT is flexible: it offers customization features and quickly updates clients through notifications. It controls user access to projects. It is free and available on the web.

It strikes a balance between simplicity and power. Users can get started very quickly and collaborate easily with teammates. It has a large library of plugins that can be used to add custom features as clients require.


Refer to the Architecture Diagram of Mantis BT below:


Architecture Diagram of Mantis BT


  • Developed by: Kenzaburo Ito and many open source authors.
  • Type: Open Source.
  • Head Quarters: Sydney, Australia.
  • Founded Date: Year 2000.
  • Stable Release: 2.16.0
  • Based on Language: PHP.
  • Operating Systems: Cross Platform.
  • Device Supported: Linux, Windows, iPhone, Mac, Web-based, Android.
  • Deployment Type: Cloud-Based, On-Premise, SaaS, Web.
  • Language Support: English.
  • Price: Contact Mantis BT for enterprise versions.
  • Annual Revenue: Approx. US $17.1 Million and growing
  • Number of Employees Working: Approx. 100 employees are working currently.
  • Users: Tetra Tech Inc., Contactx Resource Management, eNyota Learning Pvt. Ltd., Colony Brands, Inc., Spectrum Softtech Solutions Pvt. Ltd., NSE_IT etc.
Features:
  • It provides plugins, notifications, maps, full-text search, and system integration.
  • It supports audit trails, change logs, and sponsorship of issues.
  • It includes good project management, wiki integration, and support for many languages.

Pros:

  • It can track multiple projects and users.
  • The filtering that Mantis BT provides is exceptionally good.
  • Its features, such as forms, the user tracker, and project information, are simple.

Cons:

  • Mantis BT UI can be improved.
  • Its child and parent class features are difficult to understand at the beginning.
  • Its automation tracking needs to be improved.
  • The tool requires a skilled person to operate.

#3) PagerDuty


PagerDuty is a well-known incident management tool that provides an incident response platform for IT organizations.

It helps improve system performance by streamlining the operations cycle. It supports DevOps teams in developing reliable, high-performance applications. Thousands of organizations trust it for its features. It offers many integrations and operations tools, automatic scheduling, and detailed reporting, and it is designed to be available at all times.

Refer to the Architecture Diagram of PagerDuty below:

  • Developed by: Alex Solomon
  • Type: Commercial.
  • Head Quarters: San Francisco
  • Founded Year: 2009.
  • Stable Release: 5.22
  • Based on Language: C#, .Net.
  • Operating Systems: Cross Platform.
  • Device Supported: Linux, Windows, iPhone, Mac, Web-based, Android.
  • Deployment Type: Cloud-Based, SaaS, Web.
  • Language Support: English.
  • Price: Starts at US $9 and goes up to US $99, increasing with the required features and versions.
  • Annual Revenue: Approx. US $10 Million and growing
  • Number of Employees: Approx. 500 employees are working currently.
  • Users: IBM Cloud, Spotify, FlixBus, Xero, Evernote, American Eagle, GE, eBay, PayPal, Oracle, Weebly, Simple, Chef, Indeed, etc.



Architecture Diagram of PagerDuty


Features:
  • It provides good real-time collaboration and mobile incident management.
  • It has organized event grouping and rich alerting.
  • It provides good service grouping and user reporting.
  • It has automated escalations and security features.

Pros:

  • It gives team members very good, effective control over alerts.
  • It has an affordable price, powerful integrations, and a good iOS app.
  • It includes powerful API integration and email integration.
  • Its scheduler is very simple and easy to use.
Cons:
  • The PagerDuty interface is poor and needs a lot of improvement.
  • Its documentation and installation are not simple, so it requires a technically strong person.
  • Its support team management is poor, which reduces customer satisfaction.
  • PagerDuty should offer an easier way to turn off alerts.



#4) VictorOps

VictorOps is a well-known incident management tool designed for DevOps teams, giving them access to more features than just incident reporting. It helps IT teams collaborate and communicate throughout the incident life cycle, so issues are analyzed thoroughly.

Its clean interface gives DevOps teams fast, smooth communication, with capabilities for collaborating, integrating, automating, and measuring that help them develop and deploy software successfully.

Refer to the VictorOps flow diagram below:

VictorOps Flow


  • Developed by: Bryce Ambraziunas, Dan Jones, Todd Vernon
  • Type: Commercial.
  • Head Quarters: Greater Denver Area, Western US
  • Founded Date: 27th Dec 2012.
  • Stable Release: 1.12
  • Based on Language: Scala
  • Operating Systems: Cross Platform.
  • Device Supported: Linux, Windows, iPhone, Mac, Web-based, Android.
  • Deployment Type: Cloud-Based.
  • Language Support: English.
  • Price: Starts at US $10 and goes up to US $60, increasing with the required features and versions.
  • Annual Revenue: Approx. US $6 Million and growing
  • Number of Employees: Approx. 100 employees are working currently.
  • Users: Crowdtap, Craftsy, Signiant, Skyscanner, Blue Acorn, Gogo, CA Technologies, Edmunds, Rackspace, etc.
Features:


  • It comes with good on-call scheduling and noise suppression.
  • It supports live call routing, reporting, ChatOps, and delivery insights.
  • VictorOps offers an API and mobile apps.
  • It has good runbooks and graphs.
Pros:
  • Its on-call feature has made a huge difference for clients.
  • It has an affordable price and a simple workflow.
  • The VictorOps UI is very good.
  • It has a powerful integration mechanism.
Cons:
  • The mobile application needs improvement.
  • Notification messages should stay on the home screen for longer.
  • The VictorOps interface can sometimes be difficult to use because of its complexity.
  • It is not known for flexibility in handling and accepting alerts.


#5) Freshservice

Freshservice is a popular cloud-based customer support platform that serves clients of all sizes. It has a powerful ticketing system and knowledge base. It keeps good track of all client queries, which increases the client’s productivity.

It requires minimal maintenance, keeps data secure, and is highly automated. The software is simple and easy to use. It plays a vital role in analyzing and resolving issues before they hurt an organization's productivity.

Refer to the Architecture Diagram of Freshservice below:

Architecture Diagram of Freshservice

  • Type: Commercial.
  • Head Quarters: San Francisco Bay Area, West Coast, Western US
  • Founded Date: 2010
  • Operating Systems: Cross Platform.
  • Device Supported: Linux, Windows, iPhone, Mac, Web-based, Android.
  • Deployment Type: Cloud-Based, SaaS, Web.
  • Language Support: English.
  • Price: A free version is available; the enterprise version starts at US $29 and goes up to US $80, increasing with the required features and versions.
  • Annual Revenue: Approx. US $2.6 Million and growing
  • Number of Employees Working: Approx. 100 employees are working currently.
  • Users: Judson University, Flipkart, Cordant Group, Swinerton, Addison Lee, Honda, TeamViewer, Veeva, UNiDAYS, etc.
Features:
  • It has ticketing, domain mapping, priority matrix, and powerful automation tools.
  • It supports incident, problem, change, and release management.
  • It has its own integrated game mechanics and custom mailbox.
  • It supports asset, basic, advanced and enterprise reporting.
Pros:
  • Installation and configuration are simple and easy.
  • It has powerful automation and a self-service catalog.
  • It has a pleasant interface to work with.
  • It is extremely flexible to customize.
Cons:
  • Its reporting is poor, and SLA breaches are more frequent.
  • Its text editor is limited in functionality.
  • It does not allow access to file and image repository.
  • Adding additional modules is not possible.

#6) OpsGenie

OpsGenie is a popular cloud-based IT incident management tool. It provides solutions for organizations from small to large scale. It handles sophisticated alerting situations and tracks each alert thoroughly. It allows clients to integrate with many other tools and applications.

It offers both Android and iOS applications. It has a heartbeat-style monitoring system that checks the end-to-end flow of an application by expecting periodic messages, so it can tell whether the application is working correctly.
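
The periodic-message idea can be sketched generically: each monitored job "pings" on a schedule, and anything that has been silent for longer than the allowed interval is flagged. This is an illustration of the technique only, not OpsGenie's API; the class name, the ten-minute interval, and the job names are assumptions.

```python
from datetime import datetime, timedelta


class HeartbeatMonitor:
    """Tracks the last ping per job and reports jobs that have gone silent."""

    def __init__(self, interval: timedelta):
        self.interval = interval
        self.last_seen: dict[str, datetime] = {}

    def ping(self, name: str, at: datetime) -> None:
        """Record that `name` checked in at time `at`."""
        self.last_seen[name] = at

    def expired(self, now: datetime) -> list[str]:
        """Return the jobs whose last ping is older than the interval."""
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.interval
        )


t0 = datetime(2019, 7, 8, 9, 0)
monitor = HeartbeatMonitor(interval=timedelta(minutes=10))
monitor.ping("backup-job", t0)
monitor.ping("billing-api", t0 + timedelta(minutes=8))

# 15 minutes after t0: backup-job has been silent 15 min, billing-api 7 min.
silent = monitor.expired(t0 + timedelta(minutes=15))
print(silent)  # ['backup-job']
```

The useful property of a heartbeat is that it catches failures that produce no alert at all, such as a cron job that silently stopped running.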

It helps teams plan and prepare for incidents by determining who should respond, which template to use, and how to collaborate, and by creating a status page.

Refer to the Architecture Diagram of OpsGenie below:

Architecture Diagram of OpsGenie


  • Developed by: Abdurrahim Eke,  Berkay Mollamustafaoglu, Sezgin Kucukkaraaslan
  • Type: Commercial.
  • Head Quarters: Washington DC Metro Area, East Coast, Southern US.
  • Founded Date: 2012
  • Based on Language: JSON, HTTPS API.
  • Operating Systems: Cross Platform.
  • Device Supported: Linux, Windows, iPhone, Mac, Web-based, Android.
  • Deployment Type: Cloud-Based.
  • Language Support: English.
  • Price: Starts at US $15 and goes up to US $45, increasing with the required features and versions.
  • Annual Revenue: Approx. US $12 Million and growing
  • Number of Employees: Approx. 300 employees are working currently.
  • Users: Bleacher Report, Cloudticity, Looker, Overstock, Paymark, Politico, Unbounce, etc.
Features:
  • It helps teams plan and prepare for incidents.
  • It is designed so that critical alerts are not missed and the right people are notified.
  • It provides insight to improve operational efficiency.
  • It offers automatic notifications, collaboration tools, and monitoring.
Pros:
  • It makes on-call coordination easy by letting teams quickly enable or disable a support person.
  • It provides detailed logs and reporting for all calls and alerts.
  • New phone numbers can be set up easily and quickly through OpsGenie.
  • OpsGenie has a powerful dashboard.
Cons:
  • OpsGenie has a complicated user management system.
  • The heartbeat and scheduling UI could be much better.
  • Admin privileges could be broader.
  • Deleting anyone from a schedule means reorganizing the whole schedule.
 
#7) LogicManager

LogicManager is a well-known incident management tool that provides an integrated platform for risk management. Its modular and scalable features meet the requirements of organizations from small to large scale. It offers free professional services to make work easier.

It aims to empower its users. It helps organizations navigate economic uncertainty with streamlined, focused risk management. It offers a wide range of integrated solutions for business growth on a strong and intuitive platform.

Refer to the architecture flow of LogicManager below:

Architecture flow of LogicManager

  • Developed by: Steven Minsky.
  • Type: Commercial.
  • Head Quarters: Greater Boston Area, East Coast, New England.
  • Founded Date: 26th Feb, 2005
  • Operating Systems: Cross Platform.
  • Device Supported: Linux, Windows, iPhone, Mac, Web-based, Android.
  • Deployment Type: Cloud-Based.
  • Language Support: English.
  • Price: Ranges from US $10,000 to US $150,000 annually, increasing with the required features and versions.
  • Annual Revenue: Approx. US $12 Million and growing
  • Number of Employees: Approx. 100 employees are working currently.
  • Users: Westar, Middlebury, DigitalGlobe, Rivermark, Estera, Virgin Pulse, United Bank, World Travel Holding, JMJ Associates, etc.
Features:
  • It quickly shows which conditions and standards have been met and which compliance areas need more attention.
  • Its gap analysis and reporting features identify high-risk vulnerabilities.
  • It can track and report client complaints from across the organization.
  • It helps identify, assess, mitigate, monitor, connect, and report on risks.
Pros:
  • It has powerful integrations and a good UI.
  • It helps to connect all the enterprise risk management, governance and compliance activities.
  • It is very robust in nature.
  • It has strong risk management capabilities.
Cons:
  • LogicManager performance decreases if many operations are performed simultaneously.
  • Its documentation is poor.
  • First-time installation setup is complex and requires a skilled professional.

#8) Spiceworks

Spiceworks is a popular free incident management tool that focuses on making work easier for technicians and IT professionals. It includes simple network monitoring software that provides real-time updates and alerts.

It includes networking tools that let clients set up and troubleshoot their networks. It also hosts an online community where users can communicate and get suggestions from each other.


Refer to the Architecture Diagram of Spiceworks below:

Architecture Diagram of Spiceworks
 

  • Developed by: Scott Abel, Jay Hallberg, Greg Kattawar, and Francis Sullivan.
  • Type: Commercial.
  • Head Quarters: Austin, Texas, United States.
  • Founded Date: Year 2006
  • Language: Ruby on Rails.
  • Operating Systems: Cross Platform.
  • Device Supported: Windows, Mac, Web-based.
  • Deployment Type: Cloud-Based.
  • Language Support: English.
  • Price: Freeware, with no enterprise charges.
  • Annual Revenue: Approx. US $58 Million and growing.
  • Number of Employees Working: Approx. 450 employees are working currently.
  • Users: DIGIUM Inc., Server Storage IO, PELASyS, Famatech, INE etc.
Features:
  • It supports real-time network monitoring and runs inventory of devices.
  • Spiceworks has traceroute, a connectivity dashboard, an SSL checker, a port scanner, etc.
  • It has IP lookup, security tools, cloud cost monitor with remote support.
  • It has a subnet calculator with an internet outage heat map.
Pros:
  • Spiceworks has a good interface, is free, and has many features.
  • Good community support and plugins.
  • Network device inventory and asset location tracking.
  • Good communication, accountability, reliability, and affordability.
Cons:
  • The default Spiceworks database cannot handle heavy loads.
  • Inventory scanning starts abruptly.
  • Upgrades are released frequently.
  • The mobile application needs a lot of improvement.

#9) Plutora


Plutora is one of the major value stream management platforms. It captures, visualizes, and analyzes critical indicators of the speed and quality of software delivery.

It helps manage, orchestrate, and improve releases and test environments across the entire enterprise, independent of technology. It increases visibility and collaboration, giving clients complete visibility into and control over the application delivery process.

Refer to the Architecture Diagram of Plutora below:

Architecture Diagram of Plutora

  • Developed by: Dalibor Siroky, Sean Hamawi.
  • Type: Commercial.
  • Head Quarters: San Francisco Bay Area, Silicon Valley, West Coast.
  • Founded Date: Jan 1, 2012.
  • Operating Systems: Cross Platform.
  • Device Supported: Windows, Mac, Web-based.
  • Deployment Type: Cloud-Based, Web, SaaS.
  • Language Support: English.
  • Price: Contact the Plutora support team for a price quotation.
  • Annual Revenue: Approx. US $58 Million and growing.
  • Number of Employees Working: Approx. 300 employees are working currently.
  • Users: Cognizant, UST Global, Technology Partners, BMC, ServiceNow, Avocado, Value Added Resellers, eBay, Merck, etc.
Features:
  • It helps align the organization's management of software development with business strategy.
  • It provides complete visibility and control over the application delivery process.
  • It helps improve the speed and quality of the application delivery process.
Pros:
  • It provides comprehensive test environment management.
  • It coordinates the delivery pipeline.
  • It consolidates scheduling and management.
  • It maintains configuration and build on demand.
Cons:
  • Documentation needs to be improved.
  • Plutora installation requires a highly skilled technical person.
  • Its interface and UI need improvement for better customer satisfaction.

#10) xMatters 

xMatters is an incident management platform that helps enterprises prevent, manage, and resolve IT incidents. 

From the Global 2000 to small workgroups and innovative DevOps teams, organizations around the world rely on the xMatters digital service availability platform to solve technology issues before they become business problems. 

The xMatters integration platform allows organizations to automate key processes with the tools they already use like ServiceNow, Splunk, Jira, and Slack. 


  • Actionable & reliable alerting
Product Details:
  • Starting Price: $16.00/month/user
  • Free Version: Yes
  • Free Trial: Yes
  • Deployment: Cloud, SaaS, Web; Mobile - Android Native; Mobile - iOS Native
  • Training: Documentation, Webinars, Live Online, In Person
  • Support: Online, Business Hours, 24/7 (Live Rep)

Summary

Incident management plays a very important role in an organization: it improves efficiency, reduces cost and manual labor, improves visibility into operations, increases control, and delivers a better client experience.

These are the top 10 trending tools that have captured most of the market. With the details above, you can choose the tool best suited to your organization based on its features and pricing. According to internet research, the tools below are best suited to each type of organization.

Small and Medium Scale Industries: Mantis BT, Freshservice, Spiceworks, Jira, and OpsGenie are best suited to these organizations because they are low-priced or free and offer proven features that reduce manual effort.

Large Scale Industries: Atlassian Jira, PagerDuty, LogicManager, Plutora, Zendesk, and VictorOps are best suited to these organizations, as their enterprise versions are costly but offer a large number of features and strong security.

Moreover, these tools require dedicated teams to manage them, which large companies can afford because they have more staff. 


Comparison Chart:




References

  • https://www.opsgenie.com/product
  • https://logz.io/blog/incident-management-systems/
  • https://www.capterra.com/incident-management-software/
  • It Takes Teamwork - Automating the Incident Response Process ©2018 Praecipio Consulting
_______________________________________________________

“Once more unto the breach, dear friends, once more;”
___________________________________________
We would like to thank our sponsors, for without them - our fine content wouldn't be deliverable!



About Rick Ricker
An IT professional with over 23 years of experience in information security, wireless broadband, and network and infrastructure design, development, and support.
For more information, contact Rick at (800) 399-6085 x502
