Gage R&R: Complete Guide to Measurement System Analysis (MSA), Repeatability & Reproducibility
Master Measurement System Analysis (MSA) with Gage R&R studies. Learn repeatability, reproducibility, acceptance criteria (%R&R, %Tolerance, NDC), continuous and attribute Gage R&R methods with step-by-step examples.
What is Gage R&R?
Gage R&R (Repeatability and Reproducibility) is a statistical method to assess the quality of a measurement system. It quantifies how much variation in measurements comes from the measurement system itself versus the actual process variation.
Gage R&R answers critical questions: "Can I trust my data? Do different operators get the same results? Does repeating a measurement give consistent values?"
Measurement System Analysis (MSA)
Gage R&R is part of Measurement System Analysis (MSA), which validates the accuracy (measuring the true value) and precision (getting consistent results) of your measurement system before collecting data for your Six Sigma project.
Why Validate Your Measurement System?
Too many business problems are analyzed with unreliable data. If data quality is poor, you must stop and fix it during the Measure phase before proceeding with analysis.
- Avoid wrong decisions: Bad measurements lead to incorrect conclusions about process performance
- Prevent Type I errors (Producer risk): Scrapping good parts thinking they're defective
- Prevent Type II errors (Customer risk): Shipping bad parts thinking they're good
- Enable reliable capability analysis: Process capability calculations require valid measurement systems
- Support control charts: SPC requires confidence that variation is real, not measurement error
Type I Error (Producer Risk)
Problem: Measurement system rejects good parts as defective.
Impact: Scrapping good products, increased costs, lower yield than actual.
Example: A shaft with a true diameter of 10.00mm (within spec 9.95-10.05mm) is measured as 10.08mm and rejected.
Type II Error (Customer Risk)
Problem: Measurement system accepts bad parts as good.
Impact: Shipping defective products to customers, quality escapes, customer complaints.
Example: A shaft with a true diameter of 10.07mm (out of spec) is measured as 10.02mm and shipped to the customer.
Accuracy vs Precision
A complete Measurement System Analysis evaluates both Accuracy (measuring the true value) and Precision (consistency of measurements).
Accuracy
Definition: How close measurements are to the true value (reference standard).
Accuracy Components:
- Bias: Systematic offset from true value
- Linearity: How constant the bias stays across the measurement range
- Stability: Consistent measurements over time
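- Resolution: Ability to detect small changes (1/10th rule: the smallest readable increment should be no more than one-tenth of the tolerance or process spread)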
Test method: Measure reference standards and compare to known values.
Precision (Gage R&R)
Definition: How consistent measurements are when repeated.
Precision Components:
- Repeatability (Equipment Variation): Same operator, same part, repeated measurements
- Reproducibility (Appraiser Variation): Different operators, same part
- Operator×Part Interaction: Certain operators struggle with certain parts
Test method: Gage R&R study with multiple operators and parts.
Variance Decomposition
σ²observed = σ²true process + σ²measurement system
σ²measurement system = σ²repeatability + σ²reproducibility
σ²reproducibility = σ²operators + σ²operator×part interaction
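These components are usually estimated from a crossed study (every operator measures every part several times) with a two-way ANOVA. The following is a minimal sketch, assuming a balanced design and the usual random-effects mean-square estimators. The function name `variance_components` and the nested-list layout `data[part][operator][repetition]` are illustrative choices, not the API of any particular tool. Negative estimates are clamped to zero, which is a common convention rather than a requirement.

```python
from statistics import mean


def variance_components(data):
    """Estimate variance components from a balanced crossed study.

    data[i][j][k] is repetition k by operator j on part i (hypothetical layout).
    """
    n_parts, n_ops, n_reps = len(data), len(data[0]), len(data[0][0])

    grand = mean(x for part in data for cell in part for x in cell)
    part_means = [mean(x for cell in part for x in cell) for part in data]
    op_means = [mean(x for part in data for x in part[j]) for j in range(n_ops)]
    cell_means = [[mean(cell) for cell in part] for part in data]

    # Sums of squares for each source of variation.
    ss_part = n_ops * n_reps * sum((m - grand) ** 2 for m in part_means)
    ss_op = n_parts * n_reps * sum((m - grand) ** 2 for m in op_means)
    ss_cells = n_reps * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_inter = ss_cells - ss_part - ss_op
    ss_error = sum(
        (x - cell_means[i][j]) ** 2
        for i, part in enumerate(data)
        for j, cell in enumerate(part)
        for x in cell
    )

    # Mean squares (sum of squares divided by degrees of freedom).
    ms_part = ss_part / (n_parts - 1)
    ms_op = ss_op / (n_ops - 1)
    ms_inter = ss_inter / ((n_parts - 1) * (n_ops - 1))
    ms_error = ss_error / (n_parts * n_ops * (n_reps - 1))

    # Variance components; negative estimates are clamped to zero.
    repeatability = ms_error
    interaction = max(0.0, (ms_inter - ms_error) / n_reps)
    operators = max(0.0, (ms_op - ms_inter) / (n_parts * n_reps))
    part_to_part = max(0.0, (ms_part - ms_inter) / (n_ops * n_reps))

    reproducibility = operators + interaction
    measurement_system = repeatability + reproducibility
    return {
        "repeatability": repeatability,
        "operators": operators,
        "interaction": interaction,
        "reproducibility": reproducibility,
        "measurement_system": measurement_system,
        "part_to_part": part_to_part,
        "observed": measurement_system + part_to_part,
    }


# Hypothetical readings: 2 parts x 2 operators x 2 repetitions.
readings = [
    [[10.01, 10.02], [10.03, 10.02]],
    [[9.97, 9.98], [9.99, 9.99]],
]
for name, value in variance_components(readings).items():
    print(f"{name}: {value:.6f}")
```

Taking the square root of each returned variance gives the corresponding standard deviation (σ) used in the acceptance metrics.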
Continuous Gage R&R Method
For continuous data (measurements like diameter, weight, temperature), use the AIAG (Automotive Industry Action Group) standard Gage R&R method.
Study Design - AIAG Recommended Protocol
- n = 10 parts (minimum) selected to represent full process variation
- p = 3 repetitions — each operator measures each part 3 times
- q = 3 operators — at least 3 different operators
- Total measurements: n × p × q = 10 × 3 × 3 = 90 measurements
Sample Size Requirements
Good: n×p×q ≥ 90 — Reliable results. Recommended for important characteristics.
Acceptable: 40 ≤ n×p×q < 90 — Adequate but less confident. Use when constraints exist.
Not Acceptable: n×p×q < 40 — Insufficient data. Increase parts, repetitions, or operators.
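The thresholds above can be checked before the study starts. This is a small sketch; the function name `study_size_rating` is a hypothetical helper, and the cut-offs are the ones listed in this section.

```python
def study_size_rating(parts, repetitions, operators):
    """Rate a study design by its total measurement count n x p x q."""
    total = parts * repetitions * operators
    if total >= 90:
        rating = "Good"
    elif total >= 40:
        rating = "Acceptable"
    else:
        rating = "Not Acceptable"
    return total, rating


print(study_size_rating(10, 3, 3))  # (90, 'Good')
print(study_size_rating(10, 2, 3))  # (60, 'Acceptable')
print(study_size_rating(5, 2, 3))   # (30, 'Not Acceptable')
```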
Gage R&R Acceptance Criteria
| Metric | Good | Acceptable | Not Acceptable |
|---|---|---|---|
| % Study Variation (%R&R) | < 10% | 10% - 30% | > 30% |
| % Tolerance (%P/T) | < 10% | 10% - 30% | > 30% |
| Number of Distinct Categories (NDC) | ≥ 5 | 3 - 4 | < 3 |
Critical: If ANY metric is in the red zone (>30% or NDC < 3), your measurement system is NOT acceptable. You MUST fix the gage issue before continuing your Six Sigma project! Proceeding with bad data will lead to wrong conclusions.
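The acceptance table can be expressed as a small rating function. This is a sketch under one stated assumption: the overall verdict is taken as the worst of the three individual ratings, which matches the rule that any red-zone metric makes the system unacceptable. The function names are hypothetical.

```python
ORDER = ["Good", "Acceptable", "Not Acceptable"]


def rate_percent(value):
    """Rate %R&R or %P/T: below 10 is Good, 10 to 30 is Acceptable, above 30 fails."""
    if value < 10:
        return "Good"
    if value <= 30:
        return "Acceptable"
    return "Not Acceptable"


def rate_ndc(ndc):
    """Rate the number of distinct categories: 5 or more is Good, 3 to 4 is Acceptable."""
    if ndc >= 5:
        return "Good"
    if ndc >= 3:
        return "Acceptable"
    return "Not Acceptable"


def overall_rating(pct_rr, pct_pt, ndc):
    """Assumption: the overall rating is the worst individual rating."""
    ratings = [rate_percent(pct_rr), rate_percent(pct_pt), rate_ndc(ndc)]
    return max(ratings, key=ORDER.index)


print(overall_rating(8.0, 35.0, 6))  # Not Acceptable
print(overall_rating(5.0, 7.5, 7))   # Good
```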
Understanding the Three Gage R&R Metrics
1. % Study Variation (%R&R or %GRR)
Formula: %R&R = (6 × σmeasurement system) / (6 × σobserved) × 100%
What it measures: Proportion of observed variation caused by the measurement system.
Why it matters: Indicates relative usefulness of the gage for control charting and process capability analysis. Low %R&R means most variation you observe is real process variation, not measurement error.
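Example: If %R&R = 8%, the measurement system's standard deviation is 8% of the observed standard deviation. Because variances add and standard deviations do not, this corresponds to only 0.64% of the observed variance (0.08² = 0.0064), so nearly all of the variation you observe is true process variation. Excellent!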
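Since %R&R is a ratio of standard deviations, the gage share and the process share do not sum to 100%. Only the variance shares do. The sketch below illustrates this with hypothetical values chosen so that %R&R is 8%; the helper names are illustrative.

```python
import math


def percent_of_sigma(sigma_component, sigma_observed):
    """Component standard deviation as a percentage of observed standard deviation."""
    return 100.0 * sigma_component / sigma_observed


def percent_of_variance(sigma_component, sigma_observed):
    """Component variance as a percentage of observed variance."""
    return 100.0 * sigma_component**2 / sigma_observed**2


# Hypothetical values chosen so that %R&R is 8%.
sigma_observed = 1.0
sigma_ms = 0.08
sigma_process = math.sqrt(sigma_observed**2 - sigma_ms**2)

print(f"{percent_of_sigma(sigma_ms, sigma_observed):.2f}")          # 8.00
print(f"{percent_of_variance(sigma_ms, sigma_observed):.2f}")       # 0.64
print(f"{percent_of_sigma(sigma_process, sigma_observed):.2f}")     # 99.68
print(f"{percent_of_variance(sigma_process, sigma_observed):.2f}")  # 99.36
```

The two variance percentages (0.64 and 99.36) sum to 100, while the two standard-deviation percentages (8.00 and 99.68) do not.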
2. % Tolerance (%P/T)
Formula: %P/T = (6 × σmeasurement system) / (USL - LSL) × 100%
What it measures: Measurement system variation relative to specification tolerance.
Why it matters: Indicates relative usefulness for determining part acceptance/rejection. If measurement variation is large compared to tolerance, you'll make many wrong accept/reject decisions.
Example: Tolerance = 0.10mm, 6σmeasurement = 0.015mm. %P/T = 15%. Acceptable but should improve.
3. Number of Distinct Categories (NDC)
Formula: NDC = √2 × (σpart-to-part / σR&R)
What it measures: Number of non-overlapping confidence intervals that span the product variation range. Represents how many distinct groups your measurement system can discern.
Why it matters: Indicates measurement system's ability to detect differences in the measured characteristic. Higher NDC = more discrimination power.
- NDC ≥ 5: Excellent discrimination. Can detect meaningful process changes.
- NDC = 3-4: Marginal. Limited ability to distinguish parts.
- NDC < 3: Poor. Essentially a go/no-go gage.
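The NDC formula can also be inverted to ask how much larger part-to-part variation must be than gage variation to reach a given category count. This is a minimal sketch. It assumes the computed value is truncated to an integer, a common convention, and the function names are hypothetical.

```python
import math


def ndc(sigma_part, sigma_rr):
    """Number of distinct categories, truncated to an integer (assumed convention)."""
    return int(math.sqrt(2) * sigma_part / sigma_rr)


def min_ratio_for_ndc(target):
    """Smallest sigma_part / sigma_rr ratio that yields at least `target` categories."""
    return target / math.sqrt(2)


print(round(min_ratio_for_ndc(5), 2))  # 3.54
print(round(min_ratio_for_ndc(3), 2))  # 2.12
print(ndc(3.6, 1.0))                   # 5
print(ndc(3.5, 1.0))                   # 4
```

In other words, part-to-part σ needs to be roughly 3.5 times the gage σ before the system reaches NDC ≥ 5.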
Continuous Gage R&R Calculation Example
Scenario: Measuring shaft diameter (mm)
- Specification: 10.00mm ± 0.05mm (LSL=9.95mm, USL=10.05mm, Tolerance=0.10mm)
- Study design: 10 parts, 3 operators, 3 repetitions = 90 measurements
- Results from ANOVA:
- σobserved = 0.0187mm
- σR&R (measurement system) = 0.0041mm
- σpart-to-part = 0.0182mm
Step 1: Calculate % Study Variation (%R&R)
%R&R = (6 × σR&R) / (6 × σobserved) × 100%
%R&R = (6 × 0.0041) / (6 × 0.0187) × 100%
%R&R = 0.0246 / 0.1122 × 100%
%R&R = 21.9%
Acceptable (10-30%) but should be improved
Step 2: Calculate % Tolerance (%P/T)
%P/T = (6 × σR&R) / (USL - LSL) × 100%
%P/T = (6 × 0.0041) / (10.05 - 9.95) × 100%
%P/T = 0.0246 / 0.10 × 100%
%P/T = 24.6%
Acceptable (10-30%) but improvement recommended
Step 3: Calculate Number of Distinct Categories (NDC)
NDC = √2 × (σpart-to-part / σR&R)
NDC = 1.414 × (0.0182 / 0.0041)
NDC = 1.414 × 4.44
NDC = 6.3 → truncated to 6
Good (≥ 5) — excellent discrimination ability
Interpretation
Overall Assessment: The measurement system is marginally acceptable.
- %R&R = 21.9% (Yellow zone): the measurement system's standard deviation is about 22% of the observed standard deviation (about 4.8% of the observed variance). Should be improved for better process capability analysis.
- %P/T = 24.6% (Yellow zone): measurement variation consumes about 25% of the tolerance. May cause some accept/reject errors.
- NDC = 6 (Green zone): good discrimination power. Can distinguish 6 different quality levels.
Recommendation: Use the gage but prioritize improvements. Investigate main sources of variation (repeatability vs reproducibility) to target improvements.
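The three steps of the shaft diameter example can be reproduced in a few lines. This sketch uses only the values given in the scenario. It also checks that the two variance components recombine to the stated observed standard deviation. Truncating NDC to an integer is an assumed convention.

```python
import math

# Values from the shaft diameter example (mm).
sigma_observed = 0.0187
sigma_rr = 0.0041
sigma_part = 0.0182
lsl, usl = 9.95, 10.05

# Consistency check: variances add, so the two components should
# recombine to the observed standard deviation.
recombined = math.sqrt(sigma_rr**2 + sigma_part**2)

pct_rr = 100 * (6 * sigma_rr) / (6 * sigma_observed)
pct_pt = 100 * (6 * sigma_rr) / (usl - lsl)
ndc = int(math.sqrt(2) * sigma_part / sigma_rr)

print(f"recombined sigma = {recombined:.4f} mm")  # 0.0187 mm
print(f"%R&R = {pct_rr:.1f}%")                    # 21.9%
print(f"%P/T = {pct_pt:.1f}%")                    # 24.6%
print(f"NDC  = {ndc}")                            # 6
```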
Attribute Gage R&R Method
For attribute data (pass/fail, good/bad, OK/KO decisions), use the Attribute Agreement Analysis method.
Study Design - Recommended Protocol
- 3 operators evaluate the same 50 parts
- 3 repetitions — each operator evaluates each part 3 times (blind — don't see previous evaluations)
- Reference standard: Expert panel determines true value (good/bad) for each part
- Balanced sample: Ideally 50% good and 50% bad parts in the sample
- Total evaluations: 50 parts × 3 operators × 3 repetitions = 450 evaluations
Attribute Gage R&R Metrics
Without Considering Reference (Precision):
- % Repeatability (Operator 1, 2, 3): How consistent is each operator with themselves?
- % Reproducibility (All Operators): Do all operators agree on each part?
By Considering Reference (Accuracy):
- % Accuracy vs Reference (Operator 1, 2, 3): How often does each operator match the reference?
- % Accuracy All Operators vs Reference: Overall accuracy across all operators
- % Concordant Appraisals: (# measures equal to reference / total measures) × 100%
- % OK rated KO: False reject rate (Producer risk)
- % KO rated OK: False accept rate (Customer risk)
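The metrics above can be computed directly from the raw ratings. The following is a minimal sketch with several assumptions. The data layout (`ratings[operator][part]` is a list of that operator's repeated ratings) and the function name are hypothetical. Accuracy is counted per evaluation. The false reject and false accept rates use the number of evaluations of reference-OK and reference-KO parts as their denominators. The reference must contain at least one OK and one KO part.

```python
def attribute_agreement(ratings, reference):
    """Compute agreement metrics from 'OK'/'KO' ratings.

    ratings: {operator: [[rating per repetition] per part]}
    reference: [true rating per part]
    """
    operators = list(ratings)
    n_parts = len(reference)

    repeatability, accuracy = {}, {}
    for op in operators:
        # Repeatability: parts where the operator gave the same rating every time.
        consistent = sum(len(set(reps)) == 1 for reps in ratings[op])
        repeatability[op] = 100 * consistent / n_parts
        # Accuracy: individual evaluations that match the reference.
        pairs = [(r, reference[i]) for i, reps in enumerate(ratings[op]) for r in reps]
        accuracy[op] = 100 * sum(r == ref for r, ref in pairs) / len(pairs)

    # Reproducibility: parts where every rating from every operator is identical.
    agreed = sum(
        len({r for op in operators for r in ratings[op][i]}) == 1
        for i in range(n_parts)
    )

    all_pairs = [
        (r, reference[i])
        for op in operators
        for i, reps in enumerate(ratings[op])
        for r in reps
    ]
    on_ok_parts = [r for r, ref in all_pairs if ref == "OK"]
    on_ko_parts = [r for r, ref in all_pairs if ref == "KO"]
    return {
        "repeatability": repeatability,
        "reproducibility": 100 * agreed / n_parts,
        "accuracy": accuracy,
        "concordant": 100 * sum(r == ref for r, ref in all_pairs) / len(all_pairs),
        "false_reject": 100 * on_ok_parts.count("KO") / len(on_ok_parts),
        "false_accept": 100 * on_ko_parts.count("OK") / len(on_ko_parts),
    }


# Hypothetical mini-study: 4 parts, 2 operators, 2 repetitions.
reference = ["OK", "OK", "KO", "KO"]
ratings = {
    "A": [["OK", "OK"], ["OK", "KO"], ["KO", "KO"], ["KO", "KO"]],
    "B": [["OK", "OK"], ["OK", "OK"], ["OK", "OK"], ["KO", "KO"]],
}
print(attribute_agreement(ratings, reference))
```

In this toy data, operator B is perfectly repeatable (100%) but only 75% accurate, which shows why precision and accuracy are reported separately.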
Acceptance Criteria
| Metric | Good | Acceptable | Not Acceptable |
|---|---|---|---|
| Repeatability (each operator) | ≥ 90% | 80% - 89% | < 80% |
| Reproducibility (all operators) | ≥ 90% | 80% - 89% | < 80% |
| Accuracy vs Reference (each operator) | ≥ 90% | 80% - 89% | < 80% |
| Concordant Appraisals | ≥ 90% | 80% - 89% | < 80% |
| % OK rated KO (False Reject) | ≤ 1% | > 1% - 5% | > 5% |
| % KO rated OK (False Accept) | ≤ 1% | > 1% - 5% | > 5% |
Critical: Ideally, ALL agreement percentages should be ≥ 90% and both error rates ≤ 1%. Results in the 80-89% range are only conditionally acceptable. If ANY agreement metric is below 80% or either error rate exceeds 5%, the measurement system is NOT acceptable. Retrain operators, clarify inspection criteria, or redesign the inspection process.
Attribute Gage R&R Example
Scenario: Visual inspection of welds (Good/Bad)
- Study design: 30 parts (15 Good, 15 Bad per reference), 3 operators, 2 repetitions
- Total evaluations: 30 × 3 × 2 = 180 evaluations
- Reference: Expert welding inspector determined true quality
Results
Repeatability (Operator consistency with self):
- Operator A: 27/30 parts consistent = 90.0% (Good)
- Operator B: 24/30 parts consistent = 80.0% (Acceptable)
- Operator C: 28/30 parts consistent = 93.3% (Good)
Accuracy vs Reference:
- Operator A: 54/60 evaluations match reference = 90.0% (Good)
- Operator B: 46/60 evaluations match reference = 76.7% (Not Acceptable)
- Operator C: 55/60 evaluations match reference = 91.7% (Good)
Error Rates:
- % OK rated KO (False Reject) = 1.1% (Acceptable, just above the 1% limit for Good)
- % KO rated OK (False Accept) = 6.7% (Not Acceptable)
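The results above can be rated against the acceptance table mechanically. This is a sketch in which the function names are hypothetical. Agreement values between 89% and 90% are treated as Acceptable, and the overall verdict is assumed to be the worst individual rating. The counts and error rates are the ones reported in this example.

```python
ORDER = ["Good", "Acceptable", "Not Acceptable"]


def rate_agreement(percent):
    """Rate an agreement percentage (repeatability, accuracy)."""
    if percent >= 90:
        return "Good"
    if percent >= 80:
        return "Acceptable"
    return "Not Acceptable"


def rate_error(percent):
    """Rate a false reject or false accept percentage."""
    if percent <= 1:
        return "Good"
    if percent <= 5:
        return "Acceptable"
    return "Not Acceptable"


ratings = {
    "Repeatability A": rate_agreement(100 * 27 / 30),
    "Repeatability B": rate_agreement(100 * 24 / 30),
    "Repeatability C": rate_agreement(100 * 28 / 30),
    "Accuracy A": rate_agreement(100 * 54 / 60),
    "Accuracy B": rate_agreement(100 * 46 / 60),
    "Accuracy C": rate_agreement(100 * 55 / 60),
    "False reject": rate_error(1.1),
    "False accept": rate_error(6.7),
}

for metric, rating in ratings.items():
    print(f"{metric}: {rating}")

overall = max(ratings.values(), key=ORDER.index)
print(f"Overall: {overall}")  # Overall: Not Acceptable
```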
Interpretation
MEASUREMENT SYSTEM NOT ACCEPTABLE
Critical Issues:
- Operator B's accuracy (76.7%) is below 80% threshold — needs retraining
- False accept rate (6.7%) is too high — bad welds are being shipped to customers!
- Operator B's repeatability (80%) is marginally acceptable but concerning
Action Required: STOP production inspection. Retrain Operator B on weld defect identification. Clarify inspection criteria with visual aids. Rerun Gage R&R after training.
How to Improve a Failing Gage R&R
For High Repeatability Variation (Equipment Variation)
- Improve equipment: Calibrate, maintain, or replace worn gages
- Increase resolution: Use more precise measurement device (1/10th rule)
- Improve fixturing: Better clamping, positioning, or alignment
- Control environment: Temperature, humidity, vibration
- Redesign test method: Simplify measurement procedure
For High Reproducibility Variation (Operator Variation)
- Train operators: Standardized training on measurement procedure
- Clarify procedures: Detailed work instructions with photos
- Reduce subjectivity: Use objective criteria, limit judgment calls
- Automate reading: Digital displays vs analog scales
- Poka-yoke (error-proofing): Fixtures that force correct measurement position
For Attribute Measurement Systems
- Operational definitions: Crystal-clear defect definitions with visual examples
- Limit samples: Reference samples showing borderline cases
- Reduce categories: If using 3+ categories (Good/Marginal/Bad), consider reducing to 2
- Better lighting/environment: Optimize inspection conditions
- Regular recertification: Periodic operator requalification
Automated Gage R&R with AI
DMAIC Suite™ automates Gage R&R calculations, ANOVA analysis, and AI-powered interpretation for both continuous and attribute MSA.
Automated Calculations
- One-click %R&R, %Tolerance, NDC calculation
- Automatic ANOVA with variance components
- Attribute agreement analysis (repeatability, reproducibility, accuracy)
- Visual charts: X-bar, Range, by Operator
AI Interpretation
- AI explains if measurement system is acceptable
- Identifies main contributors (equipment vs operators)
- Suggests specific improvement actions
- Flags operators needing retraining