Problem Management Best Practices

Problem Management is one of the more overlooked ITIL processes. Many companies implementing ITIL focus on Incident and Change Management first. What they realize afterward is that they need a Problem Management program to improve the overall availability of IT services. A mature Problem Management program prevents recurring incidents or at least reduces their impact. Focusing on problems in your environment will increase the uptime of your IT services. Therefore, Problem Management is a critical component of your overall IT Service Management program, and the Help Desk must be an active participant for it to succeed.

The Help Desk plays a major role in managing incidents and problems. Accurate and thorough incident ticket documentation significantly helps the root cause analysis of incident-generating problems. Assigning correct categories to incident tickets improves problem identification through trending and reporting by ticket type. To contribute effectively, you need to understand what a problem is.

What is a Problem?

A problem is the underlying cause of one or more incidents; it is the main source of the fault. The related incidents are the symptoms of the problem, and they are most likely what end users experience. You will hear the term root cause used to describe the underlying cause of an incident. A problem is identified through root cause analysis of the incidents it produces. A problem ticket is raised based on the incident or incidents in which the fault, and possibly an operational outage, surfaced. Once a problem has been defined, a potential permanent fix or workaround can be developed.

What is Problem Management?

Problem Management is the life cycle process of identifying, investigating, documenting, and permanently removing incident-causing problems from the production environment. Problems are resolved by defining and implementing a solution. Unlike Incident Management, which is reactive and focused on restoring a service, Problem Management is focused on identifying the root cause of incidents and preventing the recurrence of service-impacting incidents. In a ticketing application, a problem investigation ticket is created from an incident ticket or an operational outage. This process creates an association between the incident and problem tickets. In most instances, more than one incident is related to a problem. In those cases, all of the incidents should be linked to the problem ticket.
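
The linking described above can be sketched as a small data model. This is a minimal illustration, not the schema of any particular ticketing application: the class names, field names, and ticket IDs (`Incident`, `Problem`, `raise_problem`, `INC-101`, `PRB-7`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    ticket_id: str
    summary: str
    problem_id: Optional[str] = None  # filled in once linked to a problem


@dataclass
class Problem:
    ticket_id: str
    summary: str
    incident_ids: List[str] = field(default_factory=list)

    def link(self, incident: Incident) -> None:
        """Associate an incident with this problem in both directions."""
        if incident.ticket_id not in self.incident_ids:
            self.incident_ids.append(incident.ticket_id)
        incident.problem_id = self.ticket_id


def raise_problem(ticket_id: str, summary: str, incidents: List[Incident]) -> Problem:
    """Create a problem ticket and link every related incident to it."""
    problem = Problem(ticket_id, summary)
    for incident in incidents:
        problem.link(incident)
    return problem


# Hypothetical example data: two incidents that share one underlying cause.
incidents = [
    Incident("INC-101", "Email client cannot connect"),
    Incident("INC-102", "Email sync fails on mobile"),
]
problem = raise_problem("PRB-7", "Mail gateway intermittently drops connections", incidents)
print(problem.incident_ids)
```

Keeping the link on both records means an analyst can move from an incident to its problem, or from a problem to every incident it explains, without searching.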

Problem Management: Reactive and Proactive

A problem investigation can be initiated either proactively or reactively. Reactive Problem Management is initiated after one or more incidents have occurred. The investigation focuses on finding the root cause of those incidents and implementing a solution. Proactive Problem Management focuses on preventing future incidents. These preventative investigations draw on operational data, configurations, and general continuous improvement efforts.

Problem Management Workaround

When a problem has been investigated and diagnosed, a workaround can be developed for use until a permanent fix is applied. A workaround is a temporary measure that allows the user to complete their task when a full resolution is not yet available for an incident or problem. At times, a permanent solution to an incident-causing problem cannot be defined. In those cases, Problem Management attempts to minimize the impact of the problem with a workaround, and the problem is identified as a known error. Problem Management publishes these known errors in a known error database, where they remain until a permanent solution becomes available.

Known Error Database

Companies implementing Problem Management realize a reduction in call handle time and an improvement in first contact resolution. This is achieved by implementing a known error database, which is a key component of Problem Management. When the Help Desk receives a contact about something broken, one of the first places staff check for a workaround is the known error database.
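
As a rough sketch of that lookup, the example below matches an incoming contact description against known error entries by keyword. It assumes a simple in-memory list; the `KnownError` record, the `find_workarounds` function, and the sample entries are hypothetical, and a real known error database would normally be searched through the ticketing tool itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KnownError:
    problem_id: str
    symptom_keywords: List[str]
    workaround: str


def find_workarounds(description: str, kedb: List[KnownError]) -> List[KnownError]:
    """Return known errors whose symptom keywords all appear in the description."""
    text = description.lower()
    return [
        entry
        for entry in kedb
        # Skip entries with no keywords so they do not match every contact.
        if entry.symptom_keywords
        and all(keyword.lower() in text for keyword in entry.symptom_keywords)
    ]


# Hypothetical known error entries published by Problem Management.
kedb = [
    KnownError("PRB-7", ["email", "connect"], "Use the webmail portal until the gateway is patched."),
    KnownError("PRB-12", ["printer", "offline"], "Restart the print spooler service."),
]

matches = find_workarounds("User reports the email client cannot connect", kedb)
for entry in matches:
    print(entry.problem_id, "->", entry.workaround)
```

Plain keyword matching is only one possible approach. The point is that a documented workaround is found at first contact instead of being rediscovered on every call.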

Help Desk Role in Problem Management

The Help Desk staff play a significant role in Problem Management activities. Day in and day out, the Help Desk deals with hundreds or thousands of incident tickets. The staff become attuned to trends in customer break-fix issues. The Help Desk knows that when customers face similar issues, there are underlying problems that need to be addressed. Usually, Incident Management is engaged for higher-impact incidents and for large spikes of similar issues. In those cases, the incident manager creates a problem ticket once the incident is resolved. When the Help Desk sees multiple similar incidents over a longer period of time, it creates a proactive problem ticket for review.
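
One way to support that kind of trend spotting is to count incidents per ticket category over a recent window and flag the categories that repeat. The sketch below assumes incidents are available as (opened date, category) pairs; the function name, the 30-day window, and the threshold of three are illustrative assumptions, not ITIL-defined values.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Dict, List, Tuple


def proactive_problem_candidates(
    incidents: List[Tuple[date, str]],
    today: date,
    window_days: int = 30,  # assumed review window
    threshold: int = 3,     # assumed minimum count worth a review
) -> Dict[str, int]:
    """Count incidents per category inside the window; keep categories at or above the threshold."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter(
        category for opened, category in incidents if cutoff <= opened <= today
    )
    return {category: n for category, n in counts.items() if n >= threshold}


# Hypothetical incident log: (date opened, ticket category).
incident_log = [
    (date(2024, 3, 1), "vpn"),
    (date(2024, 3, 5), "vpn"),
    (date(2024, 3, 12), "vpn"),
    (date(2024, 3, 14), "printer"),
    (date(2024, 1, 2), "vpn"),  # falls outside the 30-day window
]

candidates = proactive_problem_candidates(incident_log, today=date(2024, 3, 20))
print(candidates)
```

A report like this is only as reliable as the categories assigned to the incident tickets, which is why accurate categorization at the Help Desk matters.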

