FMEA analysis: what it is and why it’s important
Functional Safety Engineer
Safety analyses are performed at the appropriate level of abstraction during the concept and product development phases. There are different ways to classify them. One is to divide them into quantitative methods (which predict the frequency of failures) and qualitative methods (which identify failures but do not predict their frequency).
Qualitative analysis methods include:
— qualitative Failure Modes and Effects Analysis (FMEA) at the system, design or process level;
— qualitative Fault Tree Analysis (FTA);
— Hazard and Operability Analysis (HAZOP);
— qualitative Event Tree Analysis (ETA).
Quantitative analysis methods include:
— quantitative FMEA;
— quantitative FTA;
— quantitative ETA;
— Markov models;
— reliability block diagrams.
What is FMEA and why is it important?
FMEA is one of the most widely known inductive analyses used in the automotive sector.
FMEA is a step-by-step approach for identifying all possible failures in a design, a manufacturing or assembly process, or a product or service. The most important outcomes are:
– identification of new hazards not previously identified during the hazard analysis and risk assessment (HARA);
– evidence for the suitability of safety concepts;
– definition of safety measures for fault prevention or fault control;
– identification of design requirements and test requirements.
There are different types of FMEA. The two main groups are Design FMEA and Process FMEA. The main goal of Design FMEA (DFMEA) is to uncover potential failures associated with the product design, such as product malfunctions or shortened life. The main objective of Process FMEA (PFMEA), on the other hand, is to uncover failures related to the process that affect process reliability, product quality, etc.
Several trusted publications describe the approach to Design and Process FMEA, for example the Failure Mode and Effects Analysis Handbook (issued by AIAG & VDA) and Design & Process FMEA (issued by SAE).
How important is SW Safety Analysis?
The best illustration of the role of SW safety-oriented analyses can be found in Annex E of Part 6 of ISO 26262:2018 (2nd edition):
According to the diagram above, the best time for safety analysis is the SW Architectural Design phase. It can be done using deductive or inductive analyses. One of the most efficient solutions for inductive analysis is SW FMEA, ideally supported by a deductive approach such as SW FTA together with DFA.
When it comes to SW FMEA, very few standards or guidelines are available, and it is not easy to verify the sources of publications. One of the available publications is “Effective Application of SW Failure Modes Effect Analysis” by Ann Marie Neufelder.
It describes a very complex approach comprising many different methods and steps. Especially for companies at the beginning of their ISO 26262 journey, it might be very difficult to adapt it to their technology and processes without extensive support and time.
Over various ISO 26262 projects, Spyrosoft FUSA Engineers have developed their own approach to SW FMEA. It is continuously evolving to adapt to new technologies and lessons learnt. It is also worth adding that Spyrosoft safety analysis reports based on the SW FMEA have already passed several FUSA Assessments.
What are the steps of SW FMEA?
The general approach to SW FMEA is similar to that of Design FMEA (AIAG & VDA). The Safety Analysis Sheet columns for SW FMEA might be organised as follows:
The first step of SW FMEA consists of four sub-steps, as shown in the flow diagram below:
- Potential Failure – list of all possible failures for each component/function. Each Component/Function shall have at least one Potential Failure linked. The most effective way to find all potential failures is the HAZOP methodology.
HAZOP (Hazard and Operability Study) systematically investigates each element in an Architecture or any other model/system/process. The goal is to find potential situations that would cause that element to pose a hazard or limit its operability. In this approach, guide words such as “NO or NOT”, “PART OF” etc. help to define the potential failures of a function/element.
- Potential Failure Causes – list of all possible causes of the defined Potential Failure. Each Potential Failure shall be linked to at least one Potential Failure Cause.
Each Potential Failure Cause shall have an Occurrence (O) rating. It describes how likely the failure cause is to occur in customer operation, according to the rating table, taking into account the results of already completed detection controls.
- Potential Failure Effects – list of all possible effects of the defined Potential Failure. Each Potential Failure shall be linked to at least one Potential Failure Effect.
Generally, in the SW FMEA approach we distinguish two kinds of Failure Effects:
- Potential Failure Effects which might directly violate one or more Safety Goals
- Potential Failure Effects which might impact the system/subsystem or local entities, or result in non-compliance with regulations, poor performance, loss of intended functions etc.
Each Effect shall be rated using the Severity (S) measure. The rating is associated with the most serious failure effect for a given failure mode of the function being evaluated.
- Prevention/Detection – list of all assumed/implemented safety measures for each Potential Failure Cause. Also called Fault Avoidance/Control.
Each prevention/detection control shall be rated using the Detection (D) measure. It estimates how effectively the detection control reliably demonstrates the failure cause or failure mode before the item is released for production.
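The four sub-steps above can be sketched as a small data model. This is only an illustration under stated assumptions, not the Spyrosoft worksheet format: the class and field names, the example function names and the guide words other than “NO or NOT” and “PART OF” are hypothetical, and the rating scale is left to the project.

```python
from dataclasses import dataclass, field

# "NO or NOT" and "PART OF" come from the text above; the remaining entries
# are common HAZOP guide words added here as an assumption.
GUIDE_WORDS = ["NO or NOT", "PART OF", "MORE", "LESS", "EARLY", "LATE"]


@dataclass
class Cause:
    description: str
    occurrence: int                                # O rating
    controls: list = field(default_factory=list)   # (measure, D rating) pairs


@dataclass
class Effect:
    description: str
    severity: int                                  # S rating
    violates_safety_goal: bool = False


@dataclass
class PotentialFailure:
    function: str
    guide_word: str
    causes: list = field(default_factory=list)
    effects: list = field(default_factory=list)

    @property
    def severity(self) -> int:
        # Severity follows the most serious linked effect
        # (requires at least one effect to be linked).
        return max(e.severity for e in self.effects)


def hazop_candidates(functions):
    """One candidate failure per (function, guide word) pair, for manual review."""
    return [PotentialFailure(f, g) for f in functions for g in GUIDE_WORDS]


def linkage_gaps(functions, failures):
    """Return readable violations of the 'at least one linked' rules."""
    gaps = []
    covered = {pf.function for pf in failures}
    for f in functions:
        if f not in covered:
            gaps.append(f"{f}: no Potential Failure linked")
    for pf in failures:
        name = f"{pf.function} / {pf.guide_word}"
        if not pf.causes:
            gaps.append(f"{name}: no Potential Failure Cause")
        if not pf.effects:
            gaps.append(f"{name}: no Potential Failure Effect")
    return gaps


# Hypothetical example data
functions = ["ReadWheelSpeed", "SendTorqueRequest"]
pf = PotentialFailure("ReadWheelSpeed", "NO or NOT")
pf.causes.append(Cause("driver returns stale value", occurrence=3,
                       controls=[("alive counter check", 2)]))
pf.effects.append(Effect("wheel speed frozen", severity=9,
                         violates_safety_goal=True))
print(linkage_gaps(functions, [pf]))
```

In this example the check reports that `SendTorqueRequest` has no Potential Failure linked yet, which is the kind of gap a review of the sheet should catch before ratings are discussed.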
The second step of SW FMEA is Risk Assessment, SW FMEA Validation and Optimisation:
- Risk Assessment – compute the SxO and DxO ratings and assess whether each Failure is sufficiently covered by Safety Mechanisms or whether more Avoidance/Control measures are needed.
- SW FMEA Validation – the most efficient solution is to have the “SM Implementation”, “Requirement Coverage” and “Test Coverage” columns filled in by the respective roles whenever a Safety Mechanism is requested in the Fault Avoidance/Control columns. Since the SW FMEA is performed at the SW Architecture level:
- The first contact point shall be the SW Architect, who is responsible for implementing SMs within the SW Architecture and filling in the “SM Implementation” column.
- The “Requirement Coverage” column is maintained by the requirements engineer (Req. Eng.). It shall contain already existing SW Safety Requirements or brand new ones derived directly from the SW FMEA.
- The “Test Coverage” column shall be filled with the IDs of the Test Cases (at Unit or Integration level) that check the correct behaviour of the Safety Mechanisms during start-up/normal operation.
- SW FMEA Optimisation – whenever a safety rating (S, O, D) does not pass the risk assessment, prevention/detection measures are missing, or a systematic issue has to be resolved, the “Improvement Action” section shall be filled out. As soon as the improvements are implemented, all ratings shall be re-evaluated to check whether further action is needed.
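The risk assessment and the re-evaluation after an improvement action might look like the sketch below. The acceptance limits are hypothetical, since the risk matrix has to be tailored per project, and the sketch assumes that a higher D value means weaker detection, as in the common FMEA rating convention.

```python
from dataclasses import dataclass

# Hypothetical acceptance limits; a real project tailors its own risk matrix.
MAX_SXO = 20
MAX_DXO = 15


@dataclass
class Rating:
    severity: int     # S
    occurrence: int   # O
    detection: int    # D (assumed: higher value = weaker detection)

    @property
    def sxo(self) -> int:
        return self.severity * self.occurrence

    @property
    def dxo(self) -> int:
        return self.detection * self.occurrence


def needs_improvement(r: Rating) -> bool:
    """True if either product exceeds its assumed limit."""
    return r.sxo > MAX_SXO or r.dxo > MAX_DXO


before = Rating(severity=9, occurrence=3, detection=6)
# After an improvement action the ratings are re-evaluated; here O and D
# are assumed to improve while S stays the same.
after = Rating(severity=9, occurrence=2, detection=3)

print(before.sxo, before.dxo, needs_improvement(before))  # 27 18 True
print(after.sxo, after.dxo, needs_improvement(after))     # 18 6 False
```

Severity usually stays unchanged by an improvement action because it describes the effect, so the re-evaluation in this sketch only changes O and D.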
Do not forget the proper Safety Analysis Verification Report required by ISO 26262.
It shall summarise and communicate the results of the SW FMEA activity, confirm the effectiveness of the implemented actions, and record the risk analysis and the reduction of risk to an acceptable level.
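Some of the figures such a report could quote can be collected directly from the sheet. The following is a minimal sketch, assuming each sheet row is exported as a dictionary; the key names, row IDs, requirement IDs and test case IDs are all hypothetical.

```python
# Columns that have to be filled in during SW FMEA Validation
# (key names are assumptions for this sketch).
VALIDATION_COLUMNS = ("sm_implementation", "requirement_coverage", "test_coverage")


def summarise(rows):
    """Collect figures a verification report could quote."""
    incomplete = [r["id"] for r in rows
                  if any(not r.get(col) for col in VALIDATION_COLUMNS)]
    open_actions = [r["id"] for r in rows
                    if r.get("improvement_action") and not r.get("action_closed")]
    return {
        "rows": len(rows),
        "validation_incomplete": incomplete,
        "open_actions": open_actions,
        "ready_for_report": not incomplete and not open_actions,
    }


rows = [
    {"id": "FM-001", "sm_implementation": "CRC check in receiver",
     "requirement_coverage": "SWSR-012", "test_coverage": "TC-INT-034",
     "improvement_action": "", "action_closed": False},
    {"id": "FM-002", "sm_implementation": "Timeout monitor",
     "requirement_coverage": "", "test_coverage": "",
     "improvement_action": "Add SW safety requirement", "action_closed": False},
]

summary = summarise(rows)
print(summary)
```

Here row FM-002 is reported both as incomplete and as having an open action, so the sheet is not yet ready to be summarised in the report.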
What are the hints and best practices for performing a good SW FMEA?
- Make sure that the Functional Safety Concept (FSC) and its technical implementation (TSC) are available and understandable. The main FUSA critical SYS functionalities shall be defined and ASIL rated. SW Safety requirements should be specified.
- One of the most important inputs for the SW FMEA is the SW Safety Architecture. At least two types of diagrams can bring great value when creating the SW FMEA:
- Static View: the FUSA critical components shall be clearly marked. The critical signal path shall be shown from the HW peripherals through MCAL, ECUAL and BSW up to the Application Layer (in AUTOSAR projects). Based on such a diagram, the FUSA critical components (and QM ones if they lie on the critical path) shall be taken into consideration for the SW FMEA.
- Dynamic View: here faults such as delay of transmitted data, blocking of access to a communication channel or memory corruption can be identified.
- In most cases the Severity, Occurrence and Detection rankings from the AIAG, VDA, SAE etc. standards are not suitable for SW FMEA, mostly because the level of abstraction in SW development differs from that in production, where these ratings originate. The Risk Matrix shall also be tailored to the specific project needs. Be aware that each customised approach shall be discussed with the OEM/customer.
- It is very useful to maintain a database of potential failures and safety measures. Most SW failures/safety measures are common to different projects/technologies.
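Such a database can start very small. The sketch below assumes a plain in-memory dictionary; the fault names follow the Dynamic View examples above, while the listed safety measures are common illustrative examples rather than a recommendation, and the function names are hypothetical.

```python
# Illustrative catalogue: failure -> known safety measures.
CATALOGUE = {
    "delay of transmitted data": ["timeout monitoring", "message counter"],
    "blocking access to communication channel": ["alive supervision"],
    "memory corruption": ["memory partitioning", "CRC over safety-related data"],
}


def suggest_measures(failure, catalogue=CATALOGUE):
    """Return known measures for a failure, or an empty list if it is new."""
    return list(catalogue.get(failure.strip().lower(), []))


def record(failure, measure, catalogue=CATALOGUE):
    """Store a lesson learnt so that later projects can reuse it."""
    entries = catalogue.setdefault(failure.strip().lower(), [])
    if measure not in entries:
        entries.append(measure)


print(suggest_measures("Memory corruption"))
record("stack overflow", "stack monitoring")
print(suggest_measures("stack overflow"))
```

Normalising the key to lower case keeps entries from different projects from diverging only by spelling, which matters once several teams contribute to the same catalogue.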
If you struggle with SW FMEA or any other SYS/SW Safety Analyses, please visit our website for more information.
You can book face-to-face or online training for you and your team to learn the basics and processes through well-structured, real-life case studies and exercises.