Problem records track the investigation and diagnosis of the root cause of one or more incidents or possible incidents. Resolving a problem means resolving or preventing related incidents. Problems identify the underlying issue such that resolving the problem not only resolves the incident, it also prevents further incidents. Incident Management is concerned with restoring service as quickly as possible, whereas Problem Management is concerned with determining and eliminating root causes, and hence eliminating repeat problems.
The primary objectives of Problem Management are to prevent incidents caused by a similar root situation from recurring and to minimize the impact of incidents that cannot be prevented.
Problem Creation and Overview
When an incident is determined to be based on an underlying problem, first-response support technicians create a problem record from the incident, automatically creating a link between the incident record and the problem. The new problem record imports relevant information from the incident, such as the linked configuration item. Problems can also be created without being linked to existing incidents, such as in cases where a problem is discovered internally but no incidents have been reported.
The submitter, along with their department and contact information, is automatically populated based on the person who creates the problem, and the reporting source of the problem is also captured in the record. Resolution of problems may require changes to the system. Staff addressing the problem may determine that a shared CI needs to be replaced or modified, and may therefore file a change request. All updates throughout the life cycle of the problem record are logged in the record's history along with the date and time, and updates can also be logged in the Additional Notes field, which logs the updater and time stamp of the update.
When the problem is resolved, a technician updates the record with relevant information and closes the record, in turn prompting automation to begin closing procedures for related incidents. If the problem cannot be resolved, it may be classified as a Known Error and a permanent workaround supplied. This also updates the related incidents.
Processing of Records
Once created, problem records can be linked to incidents from either the problem record or from an incident record.
Priority, a measure of the problem's urgency and relative importance, is set by default based on the combination of Impact and Urgency and the way in which the Problem Priority Group is configured. The default priority matrix for problems is shown below.
These values may be changed to match a company's preferences. See the Impact, Urgency, and Priority Management section for more details. Problem priority, a measure of impact and risk, particularly with regard to IT service operations, may be determined separately as part of Problem Classification procedures. Together, priority and problem priority help IT managers make factual decisions for scheduling and planning and help determine the category of related change requests.
The problem record follows its own workflow separate from the incident. Ops team members open the problem record in the default status of Pending Diagnosis to indicate that a diagnosis has yet to be determined, or Diagnosed if no further steps are required. The default Problem SLA has specified targets for completing diagnosis, based on the priority, and a diagnosis clock is started upon creation of the problem record. As part of the diagnosis, technicians select the service involved based on existing incident reports or based on the most likely service to be related to the problem. If a particular configuration item is identified as the source of the problem's root cause, staff can quickly link the problem to the CI record.
Once a diagnosis is supplied, staff will move the record into a status of Diagnosed, fill out the Root Cause description and any further analysis, and save the record. They may also suggest a temporary fix, or "workaround", for the problem and push that to related incidents, e.g. "use Printer B3 for now instead", if a permanent solution is not readily available.
Diagnosis and Resolution Clocks
When the status changes from Pending Diagnosis to any of Diagnosed, Pending Change, Resolved, or Deferred, the diagnosis clock stops, the Diagnosis End Time field automatically populates with the current time, and the resolution clock starts. The diagnosis clock measures the working support hours between creation of the problem and diagnosis of the problem against the SLA diagnosis target.
The resolution clock measures the time between when diagnosis is completed and when the problem is resolved or deferred, excluding the time when the status is Pending Change, against the SLA resolution target for problems. While the status is Pending Change, both clocks are stopped, since the people handling problems may not have control over when a change request can be scheduled and implemented.
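The way the two clocks divide a problem's lifetime can be sketched as follows. This is an illustrative model only, not the product's implementation: the function name `clock_totals` and the status-history format are hypothetical, it counts wall-clock time rather than working support hours, and it assumes that a problem reopened from Deferred resumes its resolution clock.

```python
from datetime import datetime, timedelta

# Statuses that end diagnosis, per the description above.
DIAGNOSIS_ENDING = {"Diagnosed", "Pending Change", "Resolved", "Deferred"}
# Statuses in which the resolution clock is not running.
RESOLUTION_PAUSED = {"Pending Change", "Resolved", "Deferred"}


def clock_totals(history, now):
    """Return (diagnosis_time, resolution_time) as timedeltas.

    history: chronological list of (timestamp, status) tuples; the first
    entry is the creation of the problem record (hypothetical format).
    """
    diagnosis = timedelta(0)
    resolution = timedelta(0)
    diagnosed = False
    # Each status lasts until the next status begins (or until `now`).
    ends = [ts for ts, _ in history[1:]] + [now]
    for (start, status), end in zip(history, ends):
        if status in DIAGNOSIS_ENDING:
            diagnosed = True
        span = end - start
        if not diagnosed:
            diagnosis += span      # diagnosis clock is running
        elif status not in RESOLUTION_PAUSED:
            resolution += span     # resolution clock is running
    return diagnosis, resolution
```

Time spent in Pending Change is added to neither total, which mirrors the rule that both clocks are stopped while a change request is outstanding.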
If there are related incidents and a workaround exists while the problem investigation is in progress, you can send the workaround to any related incidents and the incident submitters. To do this:
- Enter the details of the workaround into the Workaround field.
- On the Related Records tab, change the Workaround Provided field to Yes.
- Click 'Update Incident with Workaround.' This posts the workaround to all related incidents and changes their status to Workaround Provided. When the incident status changes to Workaround Provided, the system sends an email to both the end user (submitter) and the assigned staff person of each incident, along with any CCs.
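The effect of the button can be sketched in a few lines. The classes, field names, and the `update_incidents_with_workaround` function below are hypothetical stand-ins for the real records, and the guard on Workaround Provided is an assumption drawn from the order of the steps above rather than documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    id: int
    submitter_email: str
    assignee_email: str
    cc: list = field(default_factory=list)
    status: str = "Open"
    workaround: str = ""


@dataclass
class Problem:
    workaround: str
    workaround_provided: str = "No"
    incidents: list = field(default_factory=list)


def update_incidents_with_workaround(problem):
    """Post the workaround to every linked incident.

    Returns a list of (incident id, recipients) pairs describing the
    notification emails that would be sent.
    """
    # Assumption: nothing is pushed unless the field is Yes and text exists.
    if problem.workaround_provided != "Yes" or not problem.workaround:
        return []
    outbox = []
    for inc in problem.incidents:
        inc.workaround = problem.workaround
        inc.status = "Workaround Provided"
        # Submitter, assigned staff person, and any CCs are all notified.
        outbox.append((inc.id, [inc.submitter_email, inc.assignee_email, *inc.cc]))
    return outbox
```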
Deferring and Seeking More Information
During the root cause diagnosis or determination of the proper solution, staff may need more information from a separate process, for example incident details. In this case, technicians change the problem's status to Pending More Information and send an email to staff members with requests for further information. If a problem is deemed too risky or of lower priority than more imminent issues, change the status to Deferred to reflect no ongoing diagnosis or pending changes.
In most cases where a problem's root cause deals with a configuration item, a change request will be submitted to make the appropriate fixes to the configuration item. When change requests are created directly from the problem record, they are instantly linked. While a problem is waiting for a change request to be implemented, it can be put in the status of Pending Change. If the change request linked to a problem is closed, the system sends an email to notify the problem assignee to take additional steps to close the problem record if it was in a status of Pending Change.
A problem whose root cause is known but for which there is no permanent resolution is considered a Known Error. Known Errors should have workarounds to allow Incident Management to restore service as quickly as possible. When the problem is resolved, either directly or through a change request, all related incidents can be closed at once with the click of a button. On the Related Records tab, the Update Incidents with Solution button can be used to push the Resolution field of the problem into the Resolution field of the linked incidents, closing the incidents and triggering an email to the customer with the resolution.
Once the permanent resolution is determined and implemented, users enter the description in the Resolution field and set the status to Resolved. A Closure Category for the resolution of the problem can be selected. If the resolution contains information that is useful outside of this problem's particular scope, the "Add to Knowledgebase?" field can be set to Yes to make the Resolution field available via FAQs. In addition, a required Review for Knowledgebase field is displayed when closing, and if set to Yes, a knowledge article can be created from the problem for review to be added to the Knowledge Articles table.
It is important to learn from problems to reduce the likelihood of repeat issues. There are two fields intended to facilitate analysis after a major problem.
- Include in Major Problem Review: this is a required field.
- Major Problem Review notes: this is used to write up notes about what was done well in the crisis, what didn't work so well, and what could be done to prevent such a crisis in the future. This field is only visible if the Include in Major Problem Review field has a Yes value.
Problems can be marked as known errors by setting the Known Error field to a Yes value. All known errors can be easily viewed by support technicians using the Known Errors saved search when working on incidents or service requests. Known errors for development systems have a status called In Development, and these are shown in the search called Known Errors in Development. Only known errors for the production environment, which should be set to a status of Resolved, are shown in the Known Errors in Production search.
Additional saved searches, Known Errors Unresolved and Resolved Known Errors, show problem records that are still unaddressed and those that have been resolved, respectively.
For known errors that should be visible to end users, we recommend pushing them into a knowledgebase article, which can be done directly from the problem record.
Problems are "owned" by the staff member who creates the problem record. Since only internal technical staff will generally see problem records, groups may share responsibilities between Incident Management and Problem Management, and multiple individuals may be given responsibility for a problem without actually owning it.
This section contains an overview and screenshot examples of the information stored in a Problem record in the out-of-the-box system.
The common area is shown in all tabs, and shows the progress of the problem, such as its Status, its Team Assignment, and the Assigned Person or owner of the problem record.
The Details tab contains most of the information and updates pertaining to the problem, including the Submitter, Submitter Department and Contact Information, Source of the problem (i.e. how it was reported), the Location of the problem, the Business Service and Service impacted by the problem, as well as the Impact, Urgency, and resulting Priority of the problem. If a CI has been identified, it can also be selected here.
Work Status Tab
All Diagnosis details can be logged here along with the Risk and Root Cause analysis, and any additional working diagnosis notes with time stamps. The Resolution Details section in this tab lets staff detail a workaround for the problem, which can later be updated in the related incidents, or pull an existing resolution from the Known Error subset of problems. If the status is Resolved, a Closure Category can also be specified for the problem, and a determination must be made as to whether the problem should be included in a future Major Problem Review. The date the problem was resolved is then automatically logged in the History tab of the record to provide an audit trail.
The Time tab displays the SLA thresholds for the problem, including thresholds for the SLA Diagnosis Time, SLA Resolution Time, and the progress against those thresholds. The amount of time spent working on the problem can also be logged here.
The Related Records tab shows the list of incidents related to the problem. If a workaround or resolution has been provided, it can also be pushed from the problem record to all of its related incidents. The problem can also be linked to an existing change request, or can be used to create a new one, copying over all of the relevant information directly from the problem record to the change request. In addition, the Knowledge Management section of this tab allows the problem to be converted directly into a record in the Knowledge Articles table with a status of Pending Review.
The following rules run in the Problems table. Each of them either runs when a record is created or edited, or on a scheduled basis.
Rule Trigger: When a Problem is created via Email, Web, or API.
Description: This rule runs the following If-then-else action to set the assignment of the Problem record according to the predefined values in the related Service:
In addition, it also sets the SLA ID based on the saved search: Active, request type is problem, and SLA type is corporate, and then sets the SLA targets for the Problem based on the SLA and Priority.
Rule Trigger: When a Problem is edited via Web or API and meets the saved search criteria: Diagnosis SLA Breached=No and Working time to diagnosis changed during the last modification
If-then action: I: Update SLA Details
- If Diagnosis SLA Breached=No:
  - If Working time to Diagnosis is greater than SLA Diagnosis Warning Time and Alert Color is default, set Alert Color to Orange.
  - If Working time to Diagnosis is greater than SLA Diagnosis Time, set Diagnosis SLA Breached to Yes, set Alert Color to Red, and email the assigned person and team about the diagnosis breach.
- If Resolution SLA Breached=No:
  - If Working time to Resolution is greater than SLA Resolution Warning Time and Alert Color is default, set Alert Color to Orange.
  - If Working time to Resolution is greater than SLA Resolution Time, set Alert Color to Red, set Resolution SLA Breached to Yes, and notify the assigned person and team of the breach.
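The warning and breach checks in this rule can be expressed compactly. The sketch below is only an illustration of the logic: the record is modeled as a plain dictionary with hypothetical key names, the times are assumed to be in the same unit, and it assumes that the inner checks are nested under each "Breached=No" condition.

```python
def _check_phase(rec, phase):
    """Run the warning and breach checks for one phase.

    phase is "diagnosis" or "resolution". Returns a notice string when the
    SLA for that phase has just been breached, otherwise None.
    """
    if rec[f"{phase}_sla_breached"] != "No":
        return None  # already breached; nothing more to do for this phase
    worked = rec[f"working_time_to_{phase}"]
    # Warning threshold: only escalate from the default color.
    if worked > rec[f"sla_{phase}_warning_time"] and rec["alert_color"] == "Default":
        rec["alert_color"] = "Orange"
    # Breach threshold: flag the breach and escalate to red.
    if worked > rec[f"sla_{phase}_time"]:
        rec[f"{phase}_sla_breached"] = "Yes"
        rec["alert_color"] = "Red"
        return f"{phase} SLA breached"
    return None


def update_sla_details(rec):
    """Apply both phase checks in place; return the notices to email out."""
    notices = []
    for phase in ("diagnosis", "resolution"):
        notice = _check_phase(rec, phase)
        if notice:
            notices.append(notice)
    return notices
```

Because a breached phase is skipped on later runs, each breach notification is produced only once per record.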
Edit: Status Changes (Web/API)
Rule Trigger: When a Problem is edited via Web or API and Status changed during the record's last modification.
- If Status changes to Diagnosed, set the Diagnosis Clock Status to Stopped.
- If Status changes to Pending Change, set the Resolution Clock Status to Stopped.
- If Status changes from Pending Change to some status other than Deferred or Resolved, set the Resolution Clock Status to Running.
- If Status changes to Resolved, set the Resolution Clock Status to Stopped (and if the Diagnosis Clock is not stopped, set it to Stopped too).
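The four conditions above amount to a small transition function. The sketch below models the record as a dictionary; the key names and the `apply_status_change` function are hypothetical, and only the transitions listed in the rule are handled.

```python
def apply_status_change(rec, old_status, new_status):
    """Update the clock status fields after a status change, in place."""
    if new_status == "Diagnosed":
        rec["diagnosis_clock"] = "Stopped"
    if new_status == "Pending Change":
        rec["resolution_clock"] = "Stopped"
    # Leaving Pending Change restarts resolution unless the problem is
    # being deferred or resolved.
    if old_status == "Pending Change" and new_status not in ("Deferred", "Resolved"):
        rec["resolution_clock"] = "Running"
    if new_status == "Resolved":
        rec["resolution_clock"] = "Stopped"
        rec["diagnosis_clock"] = "Stopped"  # in case diagnosis was skipped
    return rec
```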
TB: Refresh Elapsed Time fields
Rule Trigger: This rule runs every 20 minutes using the saved search: Diagnosis Clock Status is Running or Resolution Clock Status is Running, and Date Updated is at least 20 minutes old (if someone updated the record in the meantime, it does not need to be refreshed again).
Description: U: Set Date SLA Checked to NOW(), which triggers an update of the elapsed time fields.
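A minimal sketch of this scheduled rule follows. The dictionary keys and function names are hypothetical, and the saved search is read as "(either clock is running) and (the record is at least 20 minutes stale)", which is an interpretation of the trigger text rather than a confirmed definition.

```python
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(minutes=20)


def needs_refresh(rec, now):
    """Saved-search filter: a running clock on a sufficiently stale record."""
    clock_running = (
        rec["diagnosis_clock"] == "Running" or rec["resolution_clock"] == "Running"
    )
    stale = now - rec["date_updated"] >= REFRESH_INTERVAL
    return clock_running and stale


def refresh_elapsed_time(records, now):
    """Stamp Date SLA Checked on matching records; return their ids."""
    touched = []
    for rec in records:
        if needs_refresh(rec, now):
            # Touching this field is what triggers recalculation of the
            # elapsed time fields in the rule described above.
            rec["date_sla_checked"] = now
            touched.append(rec["id"])
    return touched
```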
Reporting and Statistics
There are numerous default reports measuring different metrics for problems, and a few of the most relevant and frequently used ones are listed here as examples.
Known Error Trend Analysis
This is a Trend graph that shows the number of problems marked as Known Errors over the course of a pre-defined period of time (by default, since the beginning of the current year).
This is a Segmented Bar Chart that shows the number of problems that were reported for each CI type, segmented by the name of the Configuration Item that they were reported for.
This report shows the major problems by month over the past year:
Additional details are in the HTML portion of the report.