Problems Table

Overview

Problems represent the root cause of one or more Incidents or possible Incidents. Resolving a problem means resolving/preventing related Incidents.

Incident Management is concerned with restoring service as quickly as possible, whereas Problem Management is concerned with determining and eliminating root causes (and hence preventing recurring Incidents).

The primary objectives of Problem Management are to prevent Incidents from happening and to minimize the impact of Incidents that cannot be prevented.

Use case

Problem Creation

When an Incident is determined to be based on an underlying Problem, first-response support technicians create a Problem record from the Incident, which automatically links the Incident record to the Problem. The new Problem record imports relevant information from the Incident, such as the linked Configuration Item. Problems can also be created independently of existing Incidents, for example when a problem is discovered internally but no Incidents have been reported.

The Submitter, Submitter Department, and contact information are populated automatically, and the reporting source of the Problem is also captured in the record.

Resolution of problems may require Changes to the system. Staff addressing the problem may determine that a shared CI needs to be replaced or modified, and may therefore file a Change Request.

All updates throughout the lifecycle of the Problem record are logged in the record's History along with the date and time. Updates can also be logged in the Additional Notes field, which records the updater and a time stamp for each update. When the problem is resolved, a technician updates the record with relevant information and closes the record, which in turn prompts automation to begin closing procedures for related Incidents. If the problem cannot be resolved, it may be classified as a Known Error and a permanent workaround supplied. This will also update the related Incidents.

Processing of Records

Once created, Problem records can be linked to Incidents from either the Problem record or from an Incident record.

Priority, a measure of the Problem's urgency and relative importance, is set by default to the Priority of the spawning Incident, but can be changed at the time of creation. Problem Priority, a measure of impact and risk, particularly with regard to IT service operations, may be determined separately as part of Problem Classification procedures. Together, Priority and Problem Priority help IT managers make informed decisions for scheduling and planning and help determine the category of related Change Requests. By default, new Problems have a Problem Priority of Standard (0), indicating that no special review or authorization is needed.

Diagnosis

The Problem follows its own workflow separate from the Incident. Ops team members open the problem record in the default state of "Pending Diagnosis" to indicate that a diagnosis has yet to be determined, or "Diagnosed" if no further steps are required. A record may sit in a state of Pending Diagnosis for some time before staff actually begin to perform the diagnosis, so Start Clock and Stop Clock buttons let users indicate how much time was spent diagnosing a particular problem.
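The Start Clock and Stop Clock behavior described above can be pictured as a clock that only accumulates time while it is running. The following is a minimal Python sketch under that assumption; the `WorkClock` class and its method names are hypothetical and are not part of the product.

```python
from datetime import datetime, timedelta


class WorkClock:
    """Hypothetical model of a Start Clock / Stop Clock pair."""

    def __init__(self):
        self.elapsed = timedelta(0)   # total time spent so far
        self.started_at = None        # None means the clock is stopped

    def start(self, now):
        # Starting an already running clock has no effect.
        if self.started_at is None:
            self.started_at = now

    def stop(self, now):
        # Only a running clock adds to the elapsed total.
        if self.started_at is not None:
            self.elapsed += now - self.started_at
            self.started_at = None


clock = WorkClock()
clock.start(datetime(2024, 1, 1, 9, 0))
clock.stop(datetime(2024, 1, 1, 9, 45))
clock.start(datetime(2024, 1, 1, 14, 0))
clock.stop(datetime(2024, 1, 1, 14, 30))
print(clock.elapsed)
```

Time during which the record merely sits in Pending Diagnosis is not counted, because the clock is stopped between the two working sessions.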

As part of the diagnosis, technicians select the service involved based on existing Incident reports or based on the most likely service to be affected by the Problem. If a particular Configuration Item is identified as the source of the problem's root cause, staff can quickly link the Problem to the CI record.

Once a diagnosis is supplied, staff will move the record into a state of "Diagnosed" and fill out the Root Cause description. They may suggest a temporary fix, or "workaround", for the problem and related incidents (e.g. "use Printer B3 for now instead") if a permanent solution is not readily available. While determining a workaround and/or permanent resolution, technicians can use Start and Stop Clock buttons to track the time spent determining workarounds and solutions for this Problem.

If at any time during the root cause diagnosis or determination of the proper solution staff need more information from a separate process, such as Incident details from first level support, the Problem record may be placed in a state of "Pending More Information".

If a Problem is deemed too risky or of lower priority than more imminent issues, it may be put in a status of "Deferred" to reflect no ongoing diagnosis or pending changes.

Resolution 

In most cases where a Problem's root cause deals with a Configuration Item, a Change Request will be submitted to make the appropriate fixes to the Configuration Item. Change Requests are creatable directly from the Problem record, instantly linking them. While a problem is waiting for a Change Request it can be put in the status of "Pending Change".

If the Change Request linked to a Problem is closed, the system will send an email notification to the problem assignee so that the individual can take additional steps to close the Problem record if it was pending change for resolution.

A problem whose root cause is known but for which there is no permanent resolution is considered a Known Error. Known Errors should have Workarounds to allow Incident Management to restore service as quickly as possible. The "Update Incident with Workaround" button lets staff working the Problem quickly disseminate workaround information to linked Incidents. Clicking the button posts the workaround in all related Incidents and changes their status to Workaround Provided, which also triggers an email to both the end user and the assigned staff person of each Incident.

A similar button is used to transfer the problem Solution to related Incidents. Clicking the "Update Incidents with Solution" button in the problem will populate the Solution field of the Problem into the Incident Solution field and set the status of the Incident to Closed, emailing the customer.
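The effect of the two buttons on linked Incidents can be summarized in a short Python sketch. The dictionary keys, function names, and the `outbox` list standing in for email delivery are all assumptions made for illustration; only the status values and recipients come from the description above.

```python
def push_workaround(problem, incidents, outbox):
    """Copy the Problem's workaround into every linked Incident."""
    for incident in incidents:
        incident["workaround"] = problem["workaround"]
        incident["status"] = "Workaround Provided"
        # Both the end user and the assigned staff person are emailed.
        outbox.append((incident["end_user"], incident["id"]))
        outbox.append((incident["assigned_to"], incident["id"]))


def push_solution(problem, incidents, outbox):
    """Copy the Problem's solution into every linked Incident and close it."""
    for incident in incidents:
        incident["solution"] = problem["solution"]
        incident["status"] = "Closed"
        # Only the customer is emailed when the Incident is closed.
        outbox.append((incident["end_user"], incident["id"]))


problem = {"workaround": "Use Printer B3 for now", "solution": "Replaced fuser unit"}
incidents = [
    {"id": 101, "end_user": "ann@example.com", "assigned_to": "tech1@example.com", "status": "Open"},
    {"id": 102, "end_user": "bob@example.com", "assigned_to": "tech2@example.com", "status": "Open"},
]
workaround_mail, solution_mail = [], []
push_workaround(problem, incidents, workaround_mail)
statuses_after_workaround = [i["status"] for i in incidents]
push_solution(problem, incidents, solution_mail)
```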

Once a permanent resolution is determined and implemented, staff users enter the description in the Resolution field and set the status to Resolved. A Closure Category for the resolution of the Problem can be selected, and if the resolution contains information that is useful outside of this problem's particular scope, the "Add to Knowledgebase?" field can be set to Yes to make the Resolution field available via FAQs. In addition, a required "Review for Knowledgebase" field is displayed in that Status, and if it is set to Yes, a Knowledge Article can be created from the Problem for review to be added to the Knowledge Articles table.

Ownership

Problems are "owned" by the staff member who creates the Problem record. Since only internal staff will see Problem records, groups may share responsibilities between Incident Management and Problem Management and multiple individuals may share ownership over time.

Workflow

Problem Fields

This section contains an overview and screenshot examples of the information stored in a Problem record in the out of box system.

Details tab:

The common area is shown in all tabs and shows the progress of the Problem's lifecycle, including its Status, its Team Assignment, and the Assigned Person (or owner) of the Problem record.

The Details tab contains most of the information pertaining to the Problem, including the Submitter, Submitter Department and Contact Information, Source of the Problem (i.e. how it was reported), the Location of the problem, the Business Service and Service impacted by the Problem, as well as the Impact, Urgency, and resulting Priority of the Problem. If a CI has been identified, it can also be selected here. In addition, all Diagnosis details can be logged here along with the Risk and Root Cause analysis, and any additional working diagnosis notes with time stamps. Finally, the Resolution section in this tab allows staff to provide a Workaround for the problem (which can later be pushed into the related Incidents), or to pull an existing Solution from the Known Error database. If the Status is Resolved, a Closure Category can also be specified for the Problem, and staff can decide whether the Problem should be included in a future Major Problem Review. The Date Resolved is then automatically logged in the History tab of the record to provide an audit trail.

 

The Related Records tab shows the list of Incidents related to the Problem. If a Workaround or Solution has been provided, it can also be pushed from the Problem record to all of its related Incidents. The Problem can also be linked to an existing Change Request, or used to create a new one, copying all of the relevant information directly from the Problem record to the Change Request.

In addition, the Knowledge Management section of this tab allows the Problem to be converted directly into a record in the Knowledge Articles table with a status of Pending Review.

 

SLA Tab:

The SLA tab displays all of the SLA thresholds for the Problem, including thresholds for the SLA Diagnosis Warning, SLA Diagnosis Time, SLA Resolution Warning, and SLA Resolution Time. 

 

Automation

The following rules run in the Problems table. Each of them either runs when a record is created or edited, or on a scheduled basis.

Creation actions

Rule Trigger

When a Problem is created via Email, Web, or API.

Description

This rule runs an If-then-else action that sets the assignment of the Problem record according to the predefined values in the related Service.

It also sets the SLA ID based on the saved search (Active, request type is Problem, and SLA type is Corporate), and then sets the SLA targets for the Problem based on the SLA and Priority.
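As a rough illustration of the SLA step, the sketch below filters a list of SLA records with the saved search criteria and then copies targets onto the Problem by Priority. The data layout (a `targets` mapping keyed by Priority, hours as units, and all field names) is an assumption for the example; the source does not describe how targets are stored.

```python
def find_corporate_sla(slas):
    """Return the first SLA matching the saved search criteria, or None."""
    for sla in slas:
        if (sla["status"] == "Active"
                and sla["request_type"] == "Problem"
                and sla["sla_type"] == "Corporate"):
            return sla
    return None


def set_sla_targets(problem, sla):
    """Copy the SLA ID and the targets for the Problem's Priority (hypothetical layout)."""
    targets = sla["targets"][problem["priority"]]
    problem["sla_id"] = sla["id"]
    problem["sla_diagnosis_time"] = targets["diagnosis_hours"]
    problem["sla_resolution_time"] = targets["resolution_hours"]


slas = [
    {"id": 1, "status": "Inactive", "request_type": "Problem", "sla_type": "Corporate", "targets": {}},
    {"id": 2, "status": "Active", "request_type": "Incident", "sla_type": "Corporate", "targets": {}},
    {"id": 3, "status": "Active", "request_type": "Problem", "sla_type": "Corporate",
     "targets": {"High": {"diagnosis_hours": 8, "resolution_hours": 40}}},
]
problem = {"priority": "High"}
sla = find_corporate_sla(slas)
if sla is not None:
    set_sla_targets(problem, sla)
```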

Edit: Set Alert Color and Send Notifications (Web/API)

Rule Trigger

When a Problem is edited via Web or API and meets the saved search criteria: Diagnosis SLA Breached=No and Working time to Diagnosis changed during the last modification.

Description

If then action: I: Update SLA Details

If Diagnosis SLA Breached=No:

  If Working time to Diagnosis is greater than SLA Diagnosis Warning Time and Alert Color is default, set Alert Color to Orange.

  If Working time to Diagnosis is greater than SLA Diagnosis Time, set Diagnosis SLA Breached to Yes, set Alert Color to Red, and email the Assigned Person and Team about the diagnosis breach.

If Resolution SLA Breached=No:

  If Working time to Resolution is greater than SLA Resolution Warning Time and Alert Color is default, set Alert Color to Orange.

  If Working time to Resolution is greater than SLA Resolution Time, set Alert Color to Red, set Resolution SLA Breached to Yes, and notify the Assigned Person and Team of the breach.
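The diagnosis and resolution branches of this rule share the same shape, which the following Python sketch tries to capture. It is only an illustration of the logic as described here, not the system's actual rule configuration; the `SlaClock` class, the `check_sla` function, and the list used to collect notifications are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SlaClock:
    # Hypothetical container for one SLA (diagnosis or resolution).
    working_time: float   # time worked so far, e.g. in hours
    warning_time: float   # SLA ... Warning Time
    sla_time: float       # SLA ... Time
    breached: bool = False


def check_sla(clock, alert_color, notifications, label):
    """Apply the warning and breach checks; return the new alert color."""
    if clock.breached:
        return alert_color  # the rule only acts while Breached=No
    if clock.working_time > clock.warning_time and alert_color == "Default":
        alert_color = "Orange"
    if clock.working_time > clock.sla_time:
        clock.breached = True
        alert_color = "Red"
        notifications.append(f"{label} SLA breached")
    return alert_color


notes = []
diagnosis = SlaClock(working_time=5, warning_time=4, sla_time=8)
color = check_sla(diagnosis, "Default", notes, "Diagnosis")    # warning only
resolution = SlaClock(working_time=50, warning_time=30, sla_time=40)
color = check_sla(resolution, color, notes, "Resolution")      # breach
```

In this example the diagnosis check raises the color to Orange without a breach, and the resolution check then sets it to Red and queues one notification.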

 


Edit: Status Changes (Web/API)

Rule Trigger

When a Problem is edited via Web or API and Status changed during the record's last modification.

Description

If Status changes to Diagnosed, set the Diagnosis Clock Status to Stopped.
If Status changes to Pending Change, set the Resolution Clock Status to Stopped.
If Status changes from Pending Change to some status other than Deferred or Resolved, set the Resolution Clock Status to Running.
If status changes to Resolved, set the Resolution Clock Status to Stopped (and if Diagnosis clock is not stopped, set it to stopped too).
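These four conditions can be read as a small state-transition function. The sketch below is a Python rendering of that reading; the function name and the dictionary used to hold the two clock statuses are assumptions for illustration.

```python
def on_status_change(old_status, new_status, clocks):
    """Update clock statuses after a status change.

    clocks is a dict with 'diagnosis' and 'resolution' keys whose
    values are 'Running' or 'Stopped' (hypothetical representation).
    """
    if old_status == new_status:
        return clocks  # the rule only fires when Status actually changed
    if new_status == "Diagnosed":
        clocks["diagnosis"] = "Stopped"
    if new_status == "Pending Change":
        clocks["resolution"] = "Stopped"
    if old_status == "Pending Change" and new_status not in ("Deferred", "Resolved"):
        clocks["resolution"] = "Running"
    if new_status == "Resolved":
        clocks["resolution"] = "Stopped"
        clocks["diagnosis"] = "Stopped"
    return clocks


clocks = {"diagnosis": "Running", "resolution": "Running"}
on_status_change("Pending Diagnosis", "Diagnosed", clocks)
after_diagnosed = dict(clocks)
on_status_change("Diagnosed", "Pending Change", clocks)
after_pending_change = dict(clocks)
on_status_change("Pending Change", "Diagnosed", clocks)
after_resume = dict(clocks)
on_status_change("Diagnosed", "Resolved", clocks)
```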


TB: Refresh Elapsed Time fields

Rule Trigger

This rule runs every 20 minutes using the saved search: Diagnosis Clock Status is Running or Resolution Clock Status is Running, and Date Updated is at least 20 minutes old (so if someone updated the record in the meantime, it does not need to be refreshed again).

Description

U: Set Date SLA Checked to NOW() (that will trigger an update of the elapsed time fields).
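The saved search and update for this time-based rule can be sketched in Python as a filter followed by a field update. The record layout and function names are hypothetical; only the 20-minute interval and the two clock conditions come from the rule description.

```python
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(minutes=20)


def needs_refresh(record, now):
    """True if a clock is running and the record is at least 20 minutes stale."""
    clock_running = (record["diagnosis_clock"] == "Running"
                     or record["resolution_clock"] == "Running")
    stale = now - record["date_updated"] >= REFRESH_INTERVAL
    return clock_running and stale


def refresh_elapsed_time(records, now):
    """Set Date SLA Checked on matching records and return their IDs."""
    touched = []
    for record in records:
        if needs_refresh(record, now):
            record["date_sla_checked"] = now
            touched.append(record["id"])
    return touched


now = datetime(2024, 1, 1, 12, 0)
records = [
    {"id": 1, "diagnosis_clock": "Running", "resolution_clock": "Stopped",
     "date_updated": datetime(2024, 1, 1, 11, 30)},   # running and stale
    {"id": 2, "diagnosis_clock": "Running", "resolution_clock": "Stopped",
     "date_updated": datetime(2024, 1, 1, 11, 55)},   # updated recently
    {"id": 3, "diagnosis_clock": "Stopped", "resolution_clock": "Stopped",
     "date_updated": datetime(2024, 1, 1, 9, 0)},     # no clock running
]
refreshed = refresh_elapsed_time(records, now)
```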


