Problems represent the root cause of one or more Incidents or possible Incidents. Resolving a problem means resolving/preventing related Incidents.

Incident Management is concerned with restoring service as quickly as possible, whereas Problem Management is concerned with determining and eliminating root causes, which prevents repeat Incidents.

The primary objectives of Problem Management are to prevent Incidents from happening and to minimize the impact of Incidents that cannot be prevented.

Use case

Problem Creation

When an Incident is determined to be based on an underlying Problem, first-response support technicians create a Problem record from the Incident, which automatically links the Incident record to the Problem. The new Problem record imports relevant information from the Incident, such as the linked Asset. Problems can also be created independently of existing Incidents, for example when a problem is discovered internally but no Incidents have been reported.
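The creation step can be sketched as a small data model. This is only an illustration, not the product's actual schema: the `Incident` and `Problem` classes, their field names, and the `create_problem_from_incident` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    id: int
    summary: str
    asset: Optional[str] = None        # linked Asset, if any
    problem_id: Optional[int] = None   # set once the Incident is linked to a Problem


@dataclass
class Problem:
    id: int
    summary: str
    asset: Optional[str] = None
    incident_ids: List[int] = field(default_factory=list)


def create_problem_from_incident(problem_id: int, incident: Incident) -> Problem:
    """Create a Problem from an Incident, copying relevant fields and linking both records."""
    problem = Problem(id=problem_id, summary=incident.summary, asset=incident.asset)
    problem.incident_ids.append(incident.id)
    incident.problem_id = problem.id
    return problem


incident = Incident(id=101, summary="Printer B2 jams on every job", asset="Printer B2")
problem = create_problem_from_incident(1, incident)

# A Problem can also exist with no Incident behind it.
standalone = Problem(id=2, summary="Disk usage trending toward capacity")
```

The link is stored on both records here so that it can be followed from either side; a real system might keep it in a separate link table instead.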

Resolution of problems may require Changes to the system. Staff addressing the problem may determine that a shared Asset needs to be replaced or modified, and may therefore file a Change Request.

When the problem is resolved, a technician updates the record with relevant information and closes the record, which in turn prompts automation to begin closing procedures for related Incidents. If the problem cannot be resolved, it may be classified as a Known Error and a permanent workaround supplied. This also updates the related Incidents.

Processing of Records

Once created, Problem records can be linked to Incidents from either the Problem record or from an Incident record.

Priority, a measure of the Problem's urgency and relative importance, is set by default to the Priority of the spawning Incident, but can be changed at the time of creation. Problem Priority, a measure of impact and risk, particularly with regard to IT service operations, may be determined separately as part of Problem Classification procedures. Together, Priority and Problem Priority help IT managers make factual decisions for scheduling and planning, and help determine the category of related Change Requests. By default, new Problems have a Problem Priority of Standard (0), indicating that no special review or authorization is needed.
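The two defaults described above can be expressed as a short function. This is a sketch under stated assumptions: the function name, the dictionary keys, and the example Priority values "High" and "Low" are hypothetical; only the Standard (0) default comes from the text.

```python
from typing import Optional

# "Standard (0)" is the documented default; other Problem Priority levels are not listed here.
STANDARD_PROBLEM_PRIORITY = 0


def initial_priorities(incident_priority: str, override: Optional[str] = None) -> dict:
    """Return the two priority fields for a newly created Problem."""
    return {
        # Inherit from the spawning Incident unless changed at creation time.
        "priority": override if override is not None else incident_priority,
        # Problem Priority starts at Standard and may be reclassified later.
        "problem_priority": STANDARD_PROBLEM_PRIORITY,
    }


defaults = initial_priorities("High")
overridden = initial_priorities("High", override="Low")
```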

Diagnosis

The Problem follows its own workflow separate from the Incident. Ops team members open the problem record in the default state of Pending Diagnosis to indicate that a diagnosis has yet to be determined, or Diagnosed if no further steps are required. A record may sit in a state of Pending Diagnosis for some time before staff actually begin to perform the diagnosis, so Start Clock and Stop Clock buttons let users indicate how much time was spent diagnosing a particular problem.
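One way to picture the Start Clock and Stop Clock buttons is as an accumulator that counts only the time between a start and the following stop. The `WorkClock` class below is a hypothetical sketch, not the product's implementation; the injected `now` function exists only to make the example deterministic.

```python
import time
from typing import Callable, Optional


class WorkClock:
    """Accumulates the time elapsed between start() and stop() calls."""

    def __init__(self, now: Callable[[], float] = time.monotonic) -> None:
        self._now = now
        self._started_at: Optional[float] = None
        self.total_seconds = 0.0

    def start(self) -> None:
        if self._started_at is None:       # ignore a second Start while running
            self._started_at = self._now()

    def stop(self) -> None:
        if self._started_at is not None:   # ignore Stop when not running
            self.total_seconds += self._now() - self._started_at
            self._started_at = None


# Simulated clock readings, in seconds.
readings = iter([0.0, 600.0, 5000.0, 5300.0])
clock = WorkClock(now=lambda: next(readings))

clock.start()
clock.stop()    # first work session: 600 s
clock.start()
clock.stop()    # second session: 300 s; the idle gap in between is not counted
```

Because idle time between sessions is excluded, a record can sit in Pending Diagnosis for days while the tracked diagnosis time reflects only the work actually performed.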

As part of the diagnosis, technicians select the service involved based on existing Incident reports or based on the most likely service to be affected by the Problem. If a particular Asset is identified as the source of the problem's root cause, staff can quickly link the Problem to the Asset record.

Once a diagnosis is supplied, staff move the record into a state of Diagnosed and fill out the Root Cause description. They may suggest a temporary fix, or workaround, for the problem and related Incidents if a permanent solution is not readily available, for example: "Use Printer B3 for now instead." While determining a workaround and/or permanent resolution, technicians can use the Start Clock and Stop Clock buttons to track the time spent determining workarounds and solutions for this Problem.

If at any time during the root cause diagnosis or determination of the proper solution staff need more information from a separate process, such as Incident details from first-level support, the Problem record may be placed in a state of Pending More Information.

If a Problem is deemed too risky or of lower priority than more imminent issues, it may be put in a status of Deferred to reflect that no diagnosis is ongoing and no changes are pending.

Solution

In most cases where a Problem's root cause deals with an Asset, a Change Request will be submitted to make the appropriate fixes to the Asset. Change Requests can be created directly from the Problem record, which links the two records automatically. While a Problem is waiting for a Change Request, it can be put in the status of Pending Change.

If the Change Request linked to a Problem is closed, the system will send an email notification to the problem assignee so that the individual can take additional steps to close the Problem record if it was pending change for resolution.
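The notification rule above could look roughly like the following. This is a hedged sketch: the `Problem` class, the `on_change_request_closed` hook, and the shape of the queued message are all hypothetical, and the Problem's status is deliberately left unchanged because the text leaves closing the Problem to the assignee.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Problem:
    id: int
    status: str
    assignee_email: str


def on_change_request_closed(problem: Optional[Problem], outbox: List[dict]) -> None:
    """Queue a notification to the Problem assignee when a linked Change Request closes."""
    if problem is None:
        return  # the Change Request was not linked to a Problem
    outbox.append({
        "to": problem.assignee_email,
        "subject": f"Change Request closed for Problem {problem.id}",
        # Lets the assignee see whether the Problem was waiting on this change.
        "was_pending_change": problem.status == "Pending Change",
    })


outbox: List[dict] = []
pending = Problem(id=7, status="Pending Change", assignee_email="tech@example.com")
on_change_request_closed(pending, outbox)
on_change_request_closed(None, outbox)   # unlinked Change Request: nothing is queued
```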

A problem whose root cause is known but for which there is no permanent resolution is considered a Known Error. Known Errors should have Workarounds to allow Incident Management to restore service as quickly as possible. The Update Incident with Workaround button allows staff working the Problem to disseminate workaround information to linked Incidents in one step. Clicking the button posts the workaround in all related Incidents and changes their status to Workaround Provided, which also triggers an email to both the end user and the assigned staff person of each Incident.

A similar button is used to transfer the Problem's Solution to related Incidents. Clicking the Update Incidents with Solution button in the Problem copies the Problem's Solution field into each Incident's Solution field, sets the status of the Incident to Closed, and emails the customer.
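The effect of the two buttons can be sketched as a pair of functions over the linked Incidents. Everything here is illustrative: the `Incident` fields, the function names, the initial "Open" status, and returning recipient addresses instead of sending mail are assumptions. Only the resulting statuses and the recipients named in the text are taken from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Incident:
    id: int
    end_user_email: str
    assignee_email: str
    status: str = "Open"   # assumed starting status
    workaround: str = ""
    solution: str = ""


def update_incidents_with_workaround(workaround: str, incidents: List[Incident]) -> List[str]:
    """Post the workaround to each linked Incident; return the addresses to notify."""
    recipients: List[str] = []
    for inc in incidents:
        inc.workaround = workaround
        inc.status = "Workaround Provided"
        recipients += [inc.end_user_email, inc.assignee_email]
    return recipients


def update_incidents_with_solution(solution: str, incidents: List[Incident]) -> List[str]:
    """Copy the Problem's solution to each linked Incident and close it."""
    recipients: List[str] = []
    for inc in incidents:
        inc.solution = solution
        inc.status = "Closed"
        recipients.append(inc.end_user_email)   # the text mentions emailing the customer
    return recipients


linked = [Incident(101, "user@example.com", "tech@example.com")]
notified = update_incidents_with_workaround("Use Printer B3 for now instead.", linked)
```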

Once a permanent resolution is determined and implemented, staff users enter the description in the Resolution field and set the status to Resolved. If the resolution contains information that is useful outside of this problem's particular scope, the Add to Knowledgebase? field can be set to Yes to make the Resolution field available via FAQs.
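The Problem statuses named on this page can be collected into an enumeration with a transition table. The status names come from the text; the set of allowed moves is an assumption inferred from the prose, not the product's actual workflow definition, and `can_move` is a hypothetical helper.

```python
from enum import Enum


class ProblemStatus(Enum):
    PENDING_DIAGNOSIS = "Pending Diagnosis"
    DIAGNOSED = "Diagnosed"
    PENDING_MORE_INFORMATION = "Pending More Information"
    DEFERRED = "Deferred"
    PENDING_CHANGE = "Pending Change"
    RESOLVED = "Resolved"


S = ProblemStatus

# ASSUMPTION: these transitions are inferred from the description, not documented rules.
ALLOWED = {
    S.PENDING_DIAGNOSIS: {S.DIAGNOSED, S.PENDING_MORE_INFORMATION, S.DEFERRED},
    S.PENDING_MORE_INFORMATION: {S.PENDING_DIAGNOSIS, S.DIAGNOSED},
    S.DEFERRED: {S.PENDING_DIAGNOSIS},
    S.DIAGNOSED: {S.PENDING_MORE_INFORMATION, S.DEFERRED, S.PENDING_CHANGE, S.RESOLVED},
    S.PENDING_CHANGE: {S.RESOLVED},
    S.RESOLVED: set(),
}


def can_move(current: ProblemStatus, target: ProblemStatus) -> bool:
    """Return True if the assumed workflow allows moving from current to target."""
    return target in ALLOWED[current]
```

For example, `can_move(ProblemStatus.DIAGNOSED, ProblemStatus.PENDING_CHANGE)` holds under these assumptions, while nothing leads out of `RESOLVED`.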

Ownership

Problems are owned by the staff member who creates the Problem record. Since only internal staff will see Problem records, groups may share responsibilities between Incident Management and Problem Management and multiple individuals may share ownership over time.

Workflow