Human performance is inherently unpredictable. Human beings are, after all, human. However, those working in industrial risk management have developed means of modeling types of human error and the rates at which they occur. One such technique is known as THERP (Technique for Human Error Rate Prediction). The method uses Boolean logic to model and predict human error rates, so it can be integrated into Probabilistic Risk Assessment techniques, particularly Fault Tree and Event Tree analysis. A THERP analysis is most effective when the tasks are routine and proceduralized, and when the persons involved are not under stress.
Human Error / THERP
When used in Event Tree modeling, an initiating event could be an emergency situation such as a leak of a hazardous chemical. Items in the tree that could incorporate human error include:
- Failure to recognize that a leak has occurred;
- Incorrectly identifying the nature of the leak;
- Closing the wrong valves in response; and
- Responding with the wrong emergency equipment.
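The way such a tree combines human errors can be sketched in a few lines of Python. This is only an illustration: the per-step error probabilities are hypothetical placeholders (they do not come from any HEP database), and the sketch assumes that the steps happen in sequence, that they are independent, and that a failure at any step ends the branch.

```python
# Hypothetical per-step human error probabilities (placeholders for illustration only).
RESPONSE_STEPS = [
    ("recognize that a leak has occurred", 0.01),
    ("identify the nature of the leak", 0.05),
    ("close the correct valves", 0.03),
    ("select the correct emergency equipment", 0.02),
]


def event_tree_outcomes(steps):
    """Return the probability of each end state of a simple sequential event tree.

    Assumes independent steps; the first failed step terminates the branch.
    """
    outcomes = {}
    reach_probability = 1.0  # probability of reaching the current step
    for description, error_probability in steps:
        outcomes[f"failed to {description}"] = reach_probability * error_probability
        reach_probability *= 1.0 - error_probability
    outcomes["successful response"] = reach_probability
    return outcomes


outcomes = event_tree_outcomes(RESPONSE_STEPS)
for end_state, probability in outcomes.items():
    print(f"{end_state}: {probability:.4f}")
```

Because every branch ends in exactly one end state, the outcome probabilities sum to 1, which is a useful sanity check on any tree built this way.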
When building a THERP model, errors are categorized and then assigned a probability. For example, an operator may be required to close a valve. Potential errors include:
- Failing to close the valve;
- Closing the wrong valve; and
- Partially closing the valve.
If the likelihoods of these errors are low, they can simply be added together to obtain the overall error rate, corresponding to the OR Gate in a Fault Tree. For example, if the respective likelihoods for the above errors are 0.01, 0.03 and 0.03, then the overall error rate is 0.07 (excluding second-order terms).
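The effect of dropping the second-order terms can be checked with a short Python sketch. The function names are illustrative, and the exact calculation assumes the three errors are independent, which the text does not state.

```python
from math import prod


def or_gate_rare_event(probabilities):
    """First-order approximation: simply add the error probabilities."""
    return sum(probabilities)


def or_gate_exact(probabilities):
    """Probability that at least one error occurs, assuming independence."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Error likelihoods from the valve-closing example in the text.
valve_errors = {
    "fail to close valve": 0.01,
    "close wrong valve": 0.03,
    "partially close valve": 0.03,
}

approximate = or_gate_rare_event(valve_errors.values())
exact = or_gate_exact(valve_errors.values())
print(f"sum of likelihoods: {approximate:.4f}")  # 0.0700
print(f"exact OR gate:      {exact:.4f}")        # 0.0685
```

The two results differ only in the third decimal place, which is why simple addition is acceptable when the individual likelihoods are low.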
Types of Error
Errors can be either slips or mistakes. A slip occurs when a person makes an error even though that person knew what to do and how to carry out the task. A mistake (sometimes referred to as a cognitive error) occurs when a person acts on an incorrect train of reasoning, often because they were not properly informed as to what to do or how to do it.
Errors can also be classified as either those of commission (doing something wrong) or omission (not doing something that should have been done). Errors of commission can then be divided into the following categories:
- Errors of Selection - error in the use of controls or equipment;
- Errors of Sequence - required action is carried out in the wrong order;
- Errors of Timing - task is executed too early or too late; and
- Errors of Quantity - too little or too much of the required action is applied.
The error rates can be modified with a 'recovery factor', which allows for corrective action to be taken before the consequences of the error affect overall system performance (corresponding to the Fault Tree AND Gate). For example, if there is a 30% chance that the operator will take immediate corrective action after closing the wrong valve, then the error rate for that item in the valve-closing list above falls to (0.03 * 0.7) = 0.021, or 2.1%.
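The recovery-factor arithmetic can be written as a small Python helper. The function name is hypothetical; it simply multiplies the error probability by the probability that the error is not corrected in time, which is the AND Gate combination described above.

```python
def unrecovered_error_rate(error_probability, recovery_probability):
    """Probability that an error occurs AND is not corrected in time."""
    for value in (error_probability, recovery_probability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return error_probability * (1.0 - recovery_probability)


# Closing the wrong valve (0.03) with a 30% chance of immediate correction.
wrong_valve_rate = unrecovered_error_rate(0.03, 0.30)
print(f"{wrong_valve_rate:.3f}")  # 0.021
```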
A database of Human Error Probabilities (HEPs) is needed in order to develop the model. There are three sources of data suitable for generating HEPs:
- Data derived from relevant operating experience;
- Data derived from experimental research; and
- Data derived from simulator studies.
As with any Fault/Event Tree approach to risk management, the results of the analysis are used to identify those activities which contribute the most to system failure: the "Important Few". Corrective actions focus on these high contribution items.
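As a rough sketch of how the high-contribution items might be picked out, the Python below ranks the valve-closing errors by their share of the summed error rate. It reuses the likelihoods from the example in the text, applies the 30% recovery factor to the wrong-valve error only, and treats the simple sum as the overall rate; the function name and the ranking by share are assumptions made for illustration, not a prescribed THERP step.

```python
def rank_contributors(error_rates):
    """Return (name, rate, share of total) tuples, largest contributor first."""
    total = sum(error_rates.values())
    ranked = sorted(error_rates.items(), key=lambda item: item[1], reverse=True)
    return [(name, rate, rate / total) for name, rate in ranked]


# Likelihoods from the text; recovery applied to the wrong-valve error only.
error_rates = {
    "fail to close valve": 0.01,
    "close wrong valve": 0.03 * (1.0 - 0.30),
    "partially close valve": 0.03,
}

ranking = rank_contributors(error_rates)
for name, rate, share in ranking:
    print(f"{name}: rate={rate:.3f}, share={share:.1%}")
```

With these numbers, partially closing the valve is the largest contributor, so it would be the first candidate for corrective action.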
Although THERP provides a useful way of integrating human error into Probabilistic Risk models, it has two drawbacks. First, it is time-consuming and expensive to build a credible database of human error rates. Second, human beings are not equipment items that fail in some statistically measurable manner. A person's actions depend on many factors that are difficult to classify, such as whether they are tired, whether they are under the influence of medication, or their general mood that day.
A final point to keep in mind is that, in most situations, it is important to avoid blaming a person for making an error. For this reason, when analyzing incidents, it is preferable to use the term operating error, rather than operator error.