Focus area: Transforming Processes
Format: Teaching Session + Applied Practice
Duration: ~4 Hours
Audience: Quality Engineers & Leaders
1. Introduction: FMEA Is Powerful — and Often Underperforming
Failure Mode and Effects Analysis (FMEA) is one of the most powerful risk prevention tools in the quality professional's toolkit. It is also, in many organizations, one of the most poorly executed. FMEA teams sit through multi-hour sessions filling in cells in a spreadsheet, assigning Severity, Occurrence, and Detection ratings that feel more like organizational politics than engineering judgment, and producing Risk Priority Numbers (RPNs) that nobody quite trusts. The result is an FMEA document that satisfies an audit requirement but fails to prevent the failures it was designed to anticipate.
In 2019, the Automotive Industry Action Group (AIAG) and VDA (Verband der Automobilindustrie) jointly published a major revision to the FMEA standard that addresses many of the most persistent weaknesses of traditional FMEA practice. This revision — commonly called the AIAG-VDA FMEA or the Harmonized FMEA — introduced four new conceptual improvements that make FMEA both more rigorous and more practically useful.
This session is designed for quality professionals who already understand FMEA fundamentals and want to take their analysis capability to the next level. We will explore what changed, why it matters, and how to apply the new concepts to generate failure modes your traditional FMEA would have missed.
"FMEA should scare you — in the best possible way. If your FMEA review did not generate at least three 'I never thought of that' moments, you probably did not dig deep enough."
2. FMEA Fundamentals: A Rapid Refresher
2.1 What FMEA Does
FMEA is a systematic, proactive method for identifying potential failure modes in a product design or manufacturing process before those failures occur in the field. It accomplishes this by:
- Identifying the functions of each component or process step
- Identifying ways each function can fail (failure modes)
- Analyzing the effects of each failure mode on the system and the customer
- Identifying the potential causes of each failure mode
- Assessing current controls that prevent or detect each cause
- Calculating a Risk Priority Number (RPN) = Severity x Occurrence x Detection to prioritize corrective action
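The last step can be made concrete with a minimal sketch. The `FailureCause` structure, the example lines, and their ratings are all invented for illustration; only the formula RPN = Severity x Occurrence x Detection comes from the method itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureCause:
    """One failure-mode/cause line with 1-10 ratings (hypothetical structure)."""
    description: str
    severity: int
    occurrence: int
    detection: int

    def __post_init__(self):
        # Reject ratings outside the conventional 1-10 scale.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        # RPN = Severity x Occurrence x Detection
        return self.severity * self.occurrence * self.detection


# Invented example lines, for illustration only.
lines = [
    FailureCause("Seal leaks at cold start", severity=8, occurrence=3, detection=4),
    FailureCause("Connector not fully seated", severity=6, occurrence=5, detection=2),
    FailureCause("Label misprinted", severity=3, occurrence=6, detection=7),
]

# Traditional prioritization: highest RPN first.
ranked = sorted(lines, key=lambda line: line.rpn, reverse=True)
for line in ranked:
    print(f"{line.rpn:4d}  {line.description}")
```

In this invented data set the Severity-3 label issue ranks above the Severity-8 seal leak, which is the kind of ordering a team should question rather than accept at face value.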
2.2 The Two Primary FMEA Types
| Type | Design FMEA (DFMEA) | Process FMEA (PFMEA) |
|---|---|---|
| Focus | Analyzes product designs to identify failure modes that could affect product function, safety, or regulatory compliance. | Analyzes manufacturing or assembly processes to identify failure modes that could affect process output quality, throughput, or safety. |
| When Applied | During design phases, before design is frozen. Most valuable when applied early enough to influence design decisions. | During process planning, before process validation. Most valuable when applied before tooling and fixtures are committed. |
| Team Composition | Design engineers, systems engineers, reliability engineers, manufacturing representatives, customer input. | Process engineers, quality engineers, operators, maintenance, tooling, material experts. |
| Primary Output | Design risk mitigation actions: design changes, design verification tests, safety requirements. | Process control plan inputs: control methods, inspection points, mistake-proofing devices, operator instructions. |
2.3 The RPN Limitation
The traditional Risk Priority Number (RPN = S x O x D) has significant limitations that experienced practitioners have long recognized and that the AIAG-VDA revision explicitly addresses:
- RPN equivalence fallacy: An RPN of 100 could be generated by 10x10x1, 10x1x10, 5x5x4, or 4x5x5 (S x O x D), but these represent dramatically different risk profiles. A 10-Severity/10-Occurrence/1-Detection scenario (a critical failure that occurs very frequently but is almost certain to be detected before reaching the customer) is completely different from 10-Severity/1-Occurrence/10-Detection (rare, but catastrophic and practically undetectable if it does occur).
- Inflation and gaming: Teams learn to assign ratings to achieve acceptable RPNs rather than to accurately represent risk. Severity scores get underrated to avoid mandatory actions; detection capability gets overstated (a lower Detection rating than the controls justify) to make the number look better.
- No absolute risk threshold: The RPN scale provides no objective basis for determining what constitutes an acceptable versus unacceptable risk level. Different organizations and teams use different thresholds inconsistently.
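The equivalence fallacy is easy to demonstrate by brute force. The sketch below simply enumerates every Severity/Occurrence/Detection combination on a 1-10 scale whose product is 100; the target value and variable names are chosen for illustration.

```python
from itertools import product

TARGET_RPN = 100
ratings = range(1, 11)  # conventional 1-10 rating scale

# Every (S, O, D) triple whose product equals the target RPN.
same_rpn = [
    (s, o, d)
    for s, o, d in product(ratings, repeat=3)
    if s * o * d == TARGET_RPN
]

# List the combinations with the most severe first.
for s, o, d in sorted(same_rpn, key=lambda triple: triple[0], reverse=True):
    print(f"S={s:2d}  O={o:2d}  D={d:2d}  RPN={s * o * d}")

print(f"{len(same_rpn)} different rating combinations share RPN {TARGET_RPN}")
```

The list spans Severity 1 through Severity 10, so a single RPN threshold treats a trivial nuisance and a safety-critical failure as the same priority.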
The AIAG-VDA revision replaced the RPN-based prioritization system with an Action Priority (AP) system that directly addresses the Severity-Occurrence-Detection interaction in a more nuanced and defensible way.
3. The AIAG-VDA Revision: Four New Concepts
3.1 Concept 1 — The Seven-Step Approach
Traditional FMEA practice was often applied inconsistently because there was no standardized process for how to conduct the analysis — teams jumped to filling in the spreadsheet without adequate preparation. The AIAG-VDA standard introduces a structured seven-step process that ensures the analysis is thorough before the team begins assigning ratings:
| Step | Name | Purpose and Key Activities |
|---|---|---|
| 1 | Planning & Preparation | Define the FMEA scope, team, timing, and analysis boundaries. Identify what IS and IS NOT included. Prevent scope creep that dilutes focus. |
| 2 | Structure Analysis | Decompose the system into its elements. For DFMEA: system > subsystem > component. For PFMEA: process flow > process step > work element. Ensures no element is overlooked. |
| 3 | Function Analysis | Identify the intended functions of each structural element and the relationships between them. 'What is this supposed to do?' before asking 'How can it fail?' |
| 4 | Failure Analysis | Identify failure modes, failure effects, and failure causes for each function. This is the traditional FMEA core — now informed by the structured foundation of Steps 2 and 3. |
| 5 | Risk Analysis | Assign Severity, Occurrence, and Detection ratings. Determine Action Priority (AP) using the new AP table rather than calculating RPN. |
| 6 | Optimization | Develop and implement actions to reduce high-priority risks. Confirm effectiveness of actions and update the FMEA to reflect the improved risk profile. |
| 7 | Results Documentation | Document decisions, rationale, and lessons learned. Ensure the FMEA is accessible and usable for future design or process revisions, so that those lessons carry over to later programs. |
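One way a team might enforce the sequence is a simple gate check before ratings are assigned. The following is a sketch only: the step names mirror the table above, while the `missing_prerequisites` helper and the idea of tracking completed steps in a set are assumptions, not part of the standard.

```python
from enum import IntEnum


class Step(IntEnum):
    PLANNING_AND_PREPARATION = 1
    STRUCTURE_ANALYSIS = 2
    FUNCTION_ANALYSIS = 3
    FAILURE_ANALYSIS = 4
    RISK_ANALYSIS = 5
    OPTIMIZATION = 6
    RESULTS_DOCUMENTATION = 7


def missing_prerequisites(target: Step, completed: set[Step]) -> list[Step]:
    """Return earlier steps not yet marked complete, in step order."""
    return [step for step in Step if step < target and step not in completed]


# Hypothetical project state: the team skipped straight to failure analysis.
completed = {Step.PLANNING_AND_PREPARATION, Step.FAILURE_ANALYSIS}
gaps = missing_prerequisites(Step.RISK_ANALYSIS, completed)
print([step.name for step in gaps])
```

Here the check reports that Structure Analysis and Function Analysis are still open, which is exactly the shortcut the seven-step approach is meant to prevent.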
3.2 Concept 2 — Structure and Function Analysis
One of the most significant improvements in the AIAG-VDA approach is the explicit requirement to conduct Structure Analysis and Function Analysis before beginning Failure Analysis. This seemingly simple addition has a dramatic impact on FMEA quality.
When teams jump directly to identifying failure modes, they inevitably miss failure modes associated with functions they have not explicitly identified. By requiring teams to first map the complete structural hierarchy and then systematically assign functions to each structural element, the method creates a comprehensive inventory of 'things that must work' before asking 'what happens when they do not.'
Structure Analysis in Practice
- For DFMEA: Document the system-level item > subsystem elements > component elements in a hierarchical tree. Include all interfaces between elements — many of the most important failure modes occur at boundaries between components, not within individual components.
- For PFMEA: Document the process steps in the manufacturing or assembly flow > the work elements within each step > the quality characteristics produced. Work elements are typically categorized with the 4M inputs (Man, Machine, Material, Method); extending the categories to 5M+E (4M + Measurement + Environment) helps ensure comprehensive cause identification.
Function Analysis in Practice
- State functions in verb-noun format, adding a measurable requirement where one applies: 'Transmit torque,' 'Seal fluid at 200 PSI,' 'Locate component within +/- 0.5mm.' Vague functions like 'Support structure' produce vague failure modes and vague causes.
- Identify BOTH normal functions and required characteristics. A seal's function is to seal fluid; its characteristics include the specific pressure range, temperature range, and fluid compatibility requirements. Failure modes can result from function failure or characteristic failure.
- Use function analysis to discover missing structural elements: if you identify a function that no structural element owns, you have found either a design gap or an unintended function dependency.
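The cross-check between structure and function can be sketched as a pair of set comparisons. The pump assembly, its elements, and its functions below are invented; the point is only the two gap lists at the end.

```python
# Hypothetical structure tree: element -> child elements.
structure = {
    "Pump assembly": ["Housing", "Shaft", "Shaft seal", "Mounting bracket"],
    "Housing": [],
    "Shaft": [],
    "Shaft seal": [],
    "Mounting bracket": [],
}

# Hypothetical function list: function -> owning element (None = no owner yet).
functions = {
    "Deliver flow": "Pump assembly",
    "Contain fluid": "Housing",
    "Transmit torque": "Shaft",
    "Seal fluid at 200 PSI": "Shaft seal",
    "Dissipate heat": None,
}

# Collect every element named anywhere in the tree.
all_elements = set(structure)
for children in structure.values():
    all_elements.update(children)

# Functions nobody owns: a possible design gap or unintended dependency.
unowned_functions = sorted(
    name for name, owner in functions.items() if owner not in all_elements
)

# Elements with no stated function: the function analysis is incomplete.
owners = {owner for owner in functions.values() if owner is not None}
elements_without_function = sorted(all_elements - owners)

print("Functions without an owner:", unowned_functions)
print("Elements without a function:", elements_without_function)
```

Either list being non-empty is a prompt for discussion before the team moves on to failure analysis.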
3.3 Concept 3 — The Action Priority (AP) System
The Action Priority (AP) system replaces the Risk Priority Number (RPN) as the primary risk prioritization mechanism. Instead of a single calculated number, the AP system uses a lookup table that captures the interaction between Severity, Occurrence, and Detection in a way that reflects actual risk priorities more accurately.
The AP system produces three priority levels:
- High (H): Action is required. The current risk is unacceptable. The team must develop, assign, implement, and verify the effectiveness of risk-reduction actions before product launch or process release.
- Medium (M): Action should be considered. The team should evaluate what actions are available and document the decision — either implement an action or provide documented justification for accepting the current risk level.
- Low (L): Action at team discretion. Risk may be acceptable. Document the team's assessment and the rationale for any decision to accept or reduce the risk.
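The three levels, as described above, translate into a simple completeness check on each FMEA line. This is a sketch under assumptions: the `RiskItem` fields and the wording of the messages are hypothetical, and an organization's own procedure may set different documentation requirements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RiskItem:
    name: str
    action_priority: str                 # "H", "M", or "L"
    action: Optional[str] = None         # planned risk-reduction action
    action_verified: bool = False        # effectiveness confirmed
    justification: Optional[str] = None  # documented rationale for accepting risk


def open_issues(item: RiskItem) -> list[str]:
    """List what is still missing for this item, per the H/M/L descriptions."""
    issues = []
    if item.action_priority == "H":
        if not item.action:
            issues.append("High AP: action required")
        elif not item.action_verified:
            issues.append("High AP: action effectiveness not yet verified")
    elif item.action_priority == "M":
        if not item.action and not item.justification:
            issues.append("Medium AP: add an action or a documented justification")
    elif item.action_priority == "L":
        if not item.action and not item.justification:
            issues.append("Low AP: document the team's rationale")
    else:
        raise ValueError(f"unknown action priority: {item.action_priority!r}")
    return issues


# Invented items, for illustration only.
items = [
    RiskItem("Brake line chafing", "H"),
    RiskItem("Seal swell in cold fluid", "H", action="Change elastomer grade"),
    RiskItem("Paint orange peel", "M", justification="Within customer cosmetic spec"),
]
for item in items:
    print(item.name, "->", open_issues(item) or "complete")
```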
The Key Improvement: Severity Cannot Be Ignored
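In the AP table, Severity is weighted first. A failure mode with a Severity rating of 9 or 10 (safety-critical or regulatory non-compliance effects) reaches High action priority at much lower Occurrence and Detection ratings than a less severe failure mode would. This directly addresses the traditional FMEA flaw where teams could assign high Severity ratings but 'balance' them with favorable Occurrence and Detection ratings to produce an acceptable RPN, a practice that leaves unacceptable safety risks in the system. The published table does not, however, rate every Severity 9 or 10 item High: when Occurrence is at the lowest rating, the resulting AP can be Low. Teams should therefore read the AP from the table rather than assume it, and may choose to add a stricter internal rule for these items.
A Severity of 10 means someone could be hurt or a regulatory requirement could be violated. A favorable Occurrence or Detection rating should not, through simple multiplication, be enough to make that risk look 'acceptable.' The AP system builds this logic into the lookup by weighting Severity first, then Occurrence, then Detection.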
3.4 Concept 4 — Prevention and Detection Controls Distinction
Traditional FMEA combined prevention controls (actions that reduce the probability of a cause occurring) and detection controls (actions that identify a failure mode or cause after it has occurred) in a single 'Current Controls' column. The AIAG-VDA revision separates these into distinct columns — and this seemingly minor structural change has significant practical impact.
| Control Type | Definition | Examples |
|---|---|---|
| Prevention Controls (PC) | Actions that reduce the probability that a failure cause will occur. Prevention controls affect the Occurrence rating. | Design rules and standards, process parameters, mistake-proofing (poka-yoke), qualified supplier requirements, operator training programs, preventive maintenance schedules. |
| Detection Controls (DC) | Actions that identify the presence of a failure cause or failure mode before the effect reaches the next customer. Detection controls affect the Detection rating. | In-process inspection, dimensional gauging, statistical process control, functional testing, visual inspection, automated vision systems, end-of-line testing. |
The separation matters because prevention controls and detection controls address different aspects of risk, and the actions needed to improve each are completely different. A team that wants to improve a high-Occurrence failure mode needs to improve Prevention Controls (change the design or process to make the failure cause less likely). A team that wants to lower a high Detection rating (meaning the failure is unlikely to be caught) needs to improve Detection Controls (add or improve inspection or testing methods). Conflating the two in a single column obscures which type of action is actually needed.
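The mapping from rating to control type can be written down in a few lines. In this sketch the function name and the cut-off of 7 are assumptions chosen for illustration; a real team would set its own trigger points.

```python
def control_focus(occurrence: int, detection: int, threshold: int = 7) -> list[str]:
    """Suggest which control column to strengthen.

    High Occurrence points at Prevention Controls; a high Detection rating
    (poor detectability) points at Detection Controls. The threshold is an
    assumed cut-off, not a value from the standard.
    """
    focus = []
    if occurrence >= threshold:
        focus.append("prevention")
    if detection >= threshold:
        focus.append("detection")
    return focus


print(control_focus(occurrence=8, detection=3))  # cause happens too often
print(control_focus(occurrence=2, detection=9))  # cause is rare but hard to catch
print(control_focus(occurrence=8, detection=8))  # both columns need work
```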
4. Thinking More Expansively: Finding Failure Modes You Have Never Seen
4.1 The Completeness Problem in FMEA
Even with excellent process discipline, FMEA teams routinely miss failure modes. The reason is cognitive: people identify failure modes based on their experience, and by definition they cannot directly experience failure modes they have never encountered. This is the fundamental completeness problem of FMEA, and it is why experienced teams still produce incomplete analyses.
The AIAG-VDA revision and complementary best practices offer several techniques for expanding the range of failure modes your team considers:
Technique 1: Boundary Condition Analysis
Most failures occur at boundaries — between components, between operating conditions, between nominal and extreme values. Systematically asking 'what happens at the boundary of each specification or operating condition?' surfaces failure modes that in-range analysis misses.
- For each characteristic, ask: what happens at exactly the minimum specification? What happens at exactly the maximum? What happens just outside those limits?
- For each interface, ask: what happens when both connected components are at their worst-case specifications simultaneously?
- For environmental conditions, ask: what happens when temperature, humidity, vibration, or chemical exposure is at the extreme end of the expected operating range?
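The first two questions lend themselves to a small helper. The sketch below generates the at-limit and just-outside-limit values for one characteristic, then enumerates the worst-case pairings for a shaft-and-bore interface. All dimensions are invented, and the interface values are held as integer micrometres to keep the arithmetic exact.

```python
from itertools import product


def boundary_points(lower: float, upper: float, step: float) -> list[float]:
    """Values just below, at, and just above each specification limit."""
    return [lower - step, lower, upper, upper + step]


# Single characteristic: hypothetical position tolerance of 4.5 to 5.5 mm.
print(boundary_points(4.5, 5.5, step=0.25))

# Interface: both mating parts at their extremes simultaneously.
shaft_um = (9980, 10000)   # hypothetical shaft diameter limits, micrometres
bore_um = (10020, 10050)   # hypothetical bore diameter limits, micrometres

clearances = {
    (shaft, bore): bore - shaft
    for shaft, bore in product(shaft_um, bore_um)
}
tightest = min(clearances, key=clearances.get)
loosest = max(clearances, key=clearances.get)

print(f"Tightest fit {tightest}: {clearances[tightest]} um clearance")
print(f"Loosest fit  {loosest}: {clearances[loosest]} um clearance")
```

Each generated point or pairing is a question for the team: does the function still hold there, and if not, what is the failure mode?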
Technique 2: Customer Use-Abuse Analysis
Customers do not always use products the way they were designed to be used. Use-abuse analysis systematically considers how customers actually use (and misuse) products in the field, identifying failure modes that laboratory testing environments miss.
- Observe actual customer use patterns — not instructions-for-use compliance, but real behavior. What shortcuts do people take? What do they use the product for that was not intended?
- Interview field service technicians and complaint handlers — they see the full range of customer use patterns and associated failures. Their knowledge is an untapped FMEA resource.
- Consider the full product lifecycle: installation, first use, normal use, maintenance, end of life, and disposal. Failure modes occur at each phase, not just during 'normal use.'
Technique 3: Similar System Mining
Every organization has an internal library of prior failure experiences — in previous designs, processes, supplier quality issues, and customer complaints. Systematically mining this library for relevant failure modes before conducting FMEA dramatically improves completeness.
- Create and maintain a 'lessons learned' database organized by failure mode and cause type. Before each FMEA, review all relevant entries.
- Use Pareto analysis of historical warranty and field failure data to identify the failure mode categories that historically cause the most problems. Ensure these categories are thoroughly explored in the current analysis.
- Review FMEA documents from similar products or processes, including those from other business units, and consult publicly available competitor recall data.
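The Pareto step in the list above can be sketched with the standard library alone. The claim counts are invented, and the 80% cut-off is a common convention assumed here rather than a requirement.

```python
from collections import Counter

# Invented warranty records: one failure-mode category per claim.
claims = (
    ["leak"] * 45
    + ["noise"] * 25
    + ["corrosion"] * 15
    + ["electrical"] * 10
    + ["cosmetic"] * 5
)

counts = Counter(claims)
total = sum(counts.values())

# Build the Pareto table: category, count, cumulative percentage.
pareto = []
cumulative = 0
for category, count in counts.most_common():
    cumulative += count
    pareto.append((category, count, round(100 * cumulative / total, 1)))

# Categories needed to reach the assumed 80% cut-off.
focus = []
for category, _, cumulative_pct in pareto:
    focus.append(category)
    if cumulative_pct >= 80:
        break

for row in pareto:
    print(row)
print("Explore first:", focus)
```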
Technique 4: DFMEA-PFMEA Interface Analysis
Some of the most important failure modes occur at the interface between design intent and manufacturing process reality. Explicitly connecting DFMEA and PFMEA analyses surfaces these interface failure modes:
- Each design characteristic identified in the DFMEA should be explicitly tracked to the process step in the PFMEA that produces it. If a design characteristic has no corresponding PFMEA process step, that is a gap.
- For each DFMEA cause that includes a manufacturing process assumption (e.g., 'weld quality'), the PFMEA should explicitly address the failure modes associated with that manufacturing process.
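The traceability check in the first bullet reduces to a set difference. The characteristic names and operation numbers below are invented for illustration.

```python
# Hypothetical design characteristics identified in the DFMEA.
dfmea_characteristics = {
    "bore diameter",
    "weld penetration",
    "surface finish",
    "fastener torque",
}

# Hypothetical PFMEA: process step -> characteristics that step produces.
pfmea_steps = {
    "OP10 boring": {"bore diameter"},
    "OP20 welding": {"weld penetration"},
    "OP30 assembly": {"fastener torque", "label position"},
}

produced = set().union(*pfmea_steps.values())

# Design characteristics with no producing process step: a PFMEA gap.
untraced = sorted(dfmea_characteristics - produced)

# Process outputs the DFMEA never mentions: worth a second look at the design side.
unexpected = sorted(produced - dfmea_characteristics)

print("No PFMEA step for:", untraced)
print("Not in DFMEA:", unexpected)
```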
5. Workshop Flow for a 4-Hour Session
| Time Block | Duration | Content & Activities |
|---|---|---|
| 0:00 – 0:30 | 30 min | FMEA Fundamentals Refresh and the RPN Problem. Brief refresher that doubles as a prerequisite knowledge check. Present the RPN equivalence fallacy with a worked example. Poll: Has your organization experienced an RPN-driven false sense of security? |
| 0:30 – 1:15 | 45 min | The AIAG-VDA Seven-Step Approach. Walk through each step. Compare to participants' current practice. Small groups: identify which steps their organization currently skips or shortchanges, and the consequences. |
| 1:15 – 2:00 | 45 min | Structure and Function Analysis Practice. Groups practice function analysis on a simple product or process (a pen, a stapler, a coffee maker assembly step). Write functions in verb-noun format. Identify missing structural elements. |
| 2:00 – 2:15 | 15 min | Break. Display the Action Priority concept and have participants predict AP levels for three example Severity-Occurrence-Detection combinations. |
| 2:15 – 2:45 | 30 min | Action Priority System Deep Dive. Teach the AP table. Contrast AP vs. RPN for five examples. Groups: reclassify three existing FMEA items from their own experience using AP logic. What changes? |
| 2:45 – 3:30 | 45 min | Expanding Failure Mode Discovery. Teach four techniques. Groups select one technique and apply it to a simple case study product or process. Generate at least five new failure modes they believe would not have appeared in a standard analysis. |
| 3:30 – 3:50 | 20 min | Prevention vs. Detection Controls Application. Groups revisit their failure modes and classify all controls as Prevention or Detection. Identify gaps: are all high-Occurrence items covered by strong Prevention Controls? |
| 3:50 – 4:00 | 10 min | Commitments and Q&A. One change each participant will implement in their next FMEA. Open Q&A. |
6. Discussion Questions for Q&A
Diagnosis and Reflection
- Think about the last FMEA your team conducted. Which of the seven AIAG-VDA steps were well executed? Which were skipped or shortchanged? What was the impact on analysis quality?
- Has your organization ever experienced a significant field failure or customer escape that a more thorough FMEA would have anticipated? What failure mode or cause was missed? Which of the four expansive thinking techniques might have surfaced it?
- How does your organization currently handle Severity-9 and Severity-10 failure modes? Does your current approach ensure that high-severity risks receive mandatory action regardless of Occurrence and Detection ratings?
Application
- Which of the four failure mode expansion techniques (boundary condition analysis, use-abuse analysis, similar system mining, DFMEA-PFMEA interface analysis) do you believe would generate the most value in your current work context? Why?
- How would switching from RPN-based prioritization to the Action Priority system change the priorities on your most recent FMEA? Are any items that would be rated High under AP currently underweighted by your RPN system?
- What would it take to implement the full AIAG-VDA seven-step approach in your organization? What is the biggest barrier, and who would need to champion the change?
7. Conclusion: FMEA That Actually Prevents Failures
The goal of FMEA is not compliance. It is prevention. An FMEA document that satisfies an audit requirement but fails to surface the failure modes that will actually cause field problems has failed at its fundamental purpose, regardless of how complete its columns appear.
The AIAG-VDA improvements address the most persistent weaknesses of traditional FMEA practice: the false security of RPN-based risk assessment, the incompleteness generated by jumping to failure modes before completing structure and function analysis, the confusion between prevention and detection strategies, and the systematic underweighting of high-severity risks.
But even the best methodology cannot compensate for teams that do not think expansively. The four failure mode expansion techniques in this guide are not optional extras for advanced practitioners — they are fundamental practices for any team that wants FMEA to actually prevent the failures it is designed to anticipate. Apply them rigorously, and your FMEA will generate failure modes that surprise you. That surprise is not a sign of a flawed prior analysis. It is the sound of prevention working.
The failure modes that will hurt your customers next year are already present in your current designs and processes. FMEA is the systematic practice of finding them before the field does.
KEY TAKEAWAYS
1. The AIAG-VDA seven-step approach ensures Structure and Function Analysis is completed before Failure Analysis, dramatically improving completeness.
2. The Action Priority (AP) system replaces RPN with a more defensible prioritization approach that weights Severity first, so Severity-9/10 items are far harder to rate down with favorable Occurrence and Detection ratings.
3. Separating Prevention Controls (affect Occurrence) from Detection Controls (affect Detection) clarifies which actions are needed to reduce specific risks.
4. Boundary condition analysis, use-abuse analysis, similar system mining, and DFMEA-PFMEA interface analysis expand the failure mode universe beyond team experience.
5. An FMEA that does not surprise you is probably incomplete. The goal is to find the failure modes that would otherwise only be discovered in the field.