Six Sigma Green Belt

SIX SIGMA GREEN BELT TRAINING
Indian Statistical Institute, New Delhi - 110016.
SANJAY KUMAR, LEAD-AUDITOR



Page 1: Six Sigma Green Belt





Page 2: Six Sigma Green Belt

Six Sigma Green Belt Training

Quality: The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs.

Two Aspects of Quality

1. The External Aspect ⇓ Meaning fitness for use.

2. The Internal Aspect ⇓ Meaning compliance with specifications.

“Quality then was to satisfy customer needs; now it is in fact to delight customers.”

External Aspects (Customer’s Voice)


DESS, BENCH Marking, Tolerance Design ⇓

Internal Aspects ⇒ Specifications ⇓

Compliance with Specifications

Quality Gurus: Deming, Juran and Shewhart

We are in business to earn profit: today, tomorrow and for all time to come, in an ethical and socially useful way.

Equation then: Cost + Profit = Price
Equation now: Profit = Price - Cost

Reduction in cost is essential for survival


Page 3: Six Sigma Green Belt

Bill Smith, Father of Six Sigma

Smith introduced his statistical approach aimed at increasing profitability by reducing defects. His approach was, “If you want to improve something, involve the people who are doing the job.” He always wanted to make it simple so people would use it.

The origin of Six Sigma can be traced to the 1970s, when Motorola, faced with serious quality-related problems, embarked on an ambitious journey to achieve “zero defects” in its products. This project was named “Six Sigma” by Mikel Harry, then a senior staff engineer with Motorola's Government Electronics Group.

Six Sigma is a highly disciplined approach used to reduce process variation to the extent that the level of defects is drastically reduced to less than 3.4 per million process, product or service opportunities. This is termed 3.4 defects per million opportunities (3.4 DPMO, i.e. $3.4 \times 10^{-6}$ defects per opportunity).

Sigma (σ) is the Greek letter used in statistics to describe the variability of a process; it denotes the standard deviation. Most of us are familiar with the normal distribution and its properties:

99.73% of the area lies within µ ± 3σ
95.45% of the area lies within µ ± 2σ
68.26% of the area lies within µ ± σ
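The areas quoted for the normal distribution can be checked numerically. The sketch below is only an illustration: the function name `coverage_within` is made up for this example, and it relies on the standard identity that the area within k standard deviations of the mean equals erf(k/√2).

```python
import math

def coverage_within(k: float) -> float:
    """Area of a normal distribution lying within mean ± k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints the percentage of the area within ±1σ, ±2σ and ±3σ
    print(f"within ±{k}σ: {coverage_within(k) * 100:.2f}%")
```

Small differences in the last digit against printed tables come from rounding.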

PPM (Parts Per Million): how many out of a million (1,000,000 = $10^6$)
Percentage (%): how many out of 100
0.01% = (0.01 / 100) × 1,000,000 = 100 PPM

SIX SIGMA PROCESS CAPABILITY

Sigma      Defects per million opportunities
6 Sigma    3.4 (World Class)
5 Sigma    230
4 Sigma    6,200 (Average)
3 Sigma    67,000 (Non-competitive)
2 Sigma    310,000
1 Sigma    700,000

Sigma Quality Level: The sigma quality level can be approximately determined using the Schmidt and Launsby (1997) equation:

$$\text{Sigma quality level} = 0.8406 + \sqrt{29.37 - 2.221 \times \ln(\text{ppm})}$$

This is called the Sigma Scale.

Six Sigma

• A top-driven, disciplined, step-by-step approach (DMAIC) for continual improvement of quality, for the benefit of all concerned.

• A system of practices to improve processes by eliminating defects.

• A disciplined, data-driven approach and methodology for eliminating defects in any process.
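As a rough check of the sigma scale and the PPM conversion given earlier on this page, the sketch below evaluates the Schmidt and Launsby approximation, sigma = 0.8406 + sqrt(29.37 - 2.221 × ln(ppm)). The function names are illustrative only, and the square-root form of the equation is assumed.

```python
import math

def sigma_quality_level(ppm: float) -> float:
    """Approximate sigma level from defects per million (Schmidt and Launsby, 1997).

    The square root is only real for ppm below roughly 550,000.
    """
    return 0.8406 + math.sqrt(29.37 - 2.221 * math.log(ppm))

def percent_to_ppm(percent: float) -> float:
    """Convert a percentage (out of 100) to parts per million."""
    return percent / 100 * 1_000_000

for ppm in (3.4, 230, 6_200, 67_000):
    print(f"{ppm:>8} ppm -> sigma level {sigma_quality_level(ppm):.2f}")
```

The approximation reproduces the table values closely for 3 to 6 sigma; it is not meant for very high defect rates.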

Page 4: Six Sigma Green Belt

What is Six Sigma

Six Sigma means several things. It is a statistical measurement: it tells us how good our products, services and processes really are. The Six Sigma method allows us to draw comparisons with other similar or dissimilar products, services and processes, and helps us in benchmarking and planning for improvement. A Six Sigma process is best-in-class; a four-sigma process, on the other hand, is average. In this sense, the sigma scale of measure provides us with a “goodness micrometer” for gauging the adequacy of our products, services and processes.

Six Sigma: Problem-by-Problem Approach.

Critical Business Issue → Critical Process → Critical To Quality Characteristics → Defining the Problem

Terminologies in Six Sigma

Customer: Anybody who is the recipient of a product or service is called a customer. The customer may be external or internal.

Voice of Customer: An organization going in for Six Sigma must listen to the customer. Customer requirements may be expressed in everyday language or as specifications. Hence customer requirements have to be translated into criteria to be incorporated in the development of a process leading to a product or service.

Critical to Satisfaction (CTS): The aspects critical to the satisfaction of the customer, which give the customer sufficient confidence in the party. For example: critical bugs will be fixed within a stipulated time; medical productivity, in terms of number of transactions per unit time, is at least 0.90; call quality rating is at least 0.85. The other measures are cost (CTC) and delivery (CTD).

CTQ Tree: A tool that aids in translating customer language into quantified requirements for products or services. It helps in translating broad customer requirements into specifics and ensures that all aspects of customer needs are identified.

Critical to Quality (CTQ): A parametric representation of the voice of the customer. Usually the external customer specifies the product / service CTQ. For example, in a call center application the maximum waiting time for a response is 60 seconds.

What is Critical To Quality Characteristics (CTQ):

• The requirements on the output of the process and the measures of a critical process issue are called CTQs.

• CTQs have to be derived from customer requirements, risks, economics, regulations and process / product FMEAs.

Page 5: Six Sigma Green Belt

Quality: The totality of features and characteristics of a product or service that satisfy the customer's stated and implied needs (ISO definition).

Quality in Six Sigma: A state in which value entitlement is realized for the customer as well as for the provider in every aspect of the business relationship, covering the entire supply chain. It is a win-win approach for all.

Cost of Poor Quality: The cost of poor quality is defined as those costs associated with the non-achievement of product or service quality as defined by the requirements established by the organization and its contracts with customers and society.

Cost of Poor Quality categories and elements: There are four categories: prevention, appraisal, internal failure and external failure. Each category contains elements and sub-elements.

Prevention cost: Prevention is defined as the experience gained from the identification and elimination of specific causes of failure cost, used to prevent the recurrence of the same or similar failures in other products and services. Examples of prevention costs are planning and training.

Appraisal cost: The appraisal cost is the cost of assuring that the product or service is acceptable as delivered to customers. Examples are inspection and testing.

Internal failure cost: Internal failure costs include basically all costs required to evaluate, dispose of, and either correct or replace nonconforming products or services prior to delivery to the customer, and also to correct or replace an incorrect or incomplete product or service description. Examples are re-design of modules, reworking of effort estimates, loss of productivity, etc.

External failure cost: External failure costs include all costs incurred due to nonconforming or suspected nonconforming products or services after delivery to the customer. Examples are delayed submission of developed modules, customer dissatisfaction, etc.

All these costs are components of the cost of poor quality (COPQ). It is the hidden cost of failing to meet customer requirements.

Process: A process is a sequence of activities which results in a product or service.

Key process input variable (KPIV): An input variable which influences the output of a process, e.g. time and temperature are key input variables for a heat treatment process.

Key process output variable (KPOV): An output variable which influences the performance of a Critical to Quality (CTQ) characteristic.

Defect: A feature in a product / service that causes dissatisfaction to a customer is called a defect: ANYTHING THAT DISSATISFIES YOUR CUSTOMER.

Process capability: Process capability is defined as the ability of your process to satisfy customer requirements. A process is said to be not capable if it fails to meet customer requirements.

Note:

I. Lower DPU increases customer satisfaction and decreases warranty cost.
II. Lower DPU reduces COPQ and decreases manufacturing cost per unit.
III. Higher process capability indices increase the Six Sigma rating and reduce DPU.

Page 6: Six Sigma Green Belt

Unit: It may be a product or process, a line of software, a transaction, etc. A “unit” may be as diverse as a:

• Piece of equipment • Line of software • Order • Technical manual • Medical claim • Wire transfer • Hour of labour • Billable dollar • Customer contact

Opportunity: A unit may have more than one type of defect; each is an opportunity. A watchcase may have pits, burrs, etc. In a letter of credit (L.C.), the name, address, shipping instructions, currency, etc. are different opportunities for a defect.

Metric: Metric is a representative indicator of performance of a process, product or services.

I. If we do not measure, we do not know our status, so we cannot improve.
II. Defects per unit (DPU): Total number of defects in a sample divided by the total number of units in the sample.
III. Defects per opportunity (DPO):

$$\text{DPO} = \frac{\text{DPU}}{\text{Number of opportunities per unit}}$$

IV. Defects per million opportunities (DPMO):

$$\text{DPMO} = \frac{\text{DPU} \times 10^{6}}{\text{Number of opportunities per unit}}$$

V. Throughput yield: Output divided by input.
VI. Rolled throughput yield: Rolled throughput yield is the product of the yields of all sub-processes.

0.95 ⇒ 0.95 ⇒ 0.95 ⇒ 0.95

If there are four processes and each process has a 95% yield, the rolled throughput yield (RTY) = $(0.95)^4$ = 0.81.

Other examples:

i. Let us assume that a part goes through ten operations. At each stage 99% of the parts are good and 1% are rejected, so we get $(0.99)^{10}$ = 90.44% good parts at the end of the tenth stage.

ii. If we start with a batch of 1000 parts, we get 904 good parts and scrap or rework 96 parts; the RTY of the process is 90.44%.
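The rolled throughput yield examples above can be reproduced with a few lines of code. This is a minimal sketch; the function name is illustrative and the steps are assumed to be independent, as in the text.

```python
from math import prod

def rolled_throughput_yield(step_yields):
    """Product of the first-pass yields of all steps (steps assumed independent)."""
    return prod(step_yields)

four_steps = rolled_throughput_yield([0.95] * 4)   # four processes at 95% each
ten_steps = rolled_throughput_yield([0.99] * 10)   # ten operations at 99% each
good_parts = int(1000 * ten_steps)                 # whole good parts from a batch of 1000

print(f"4 steps: {four_steps:.4f}, 10 steps: {ten_steps:.4f}, good parts: {good_parts}")
```

Because the yields multiply, even a small loss at each step compounds quickly over many steps.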

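Calculation of DPU, DPO, DPMO, Yield & Sigma level.

Defects (D) = 34, Units (U) = 750, Opportunities per unit (O) = 10

1. DPU = D / U = 34 / 750 = 0.045
2. DPO = D / (U × O) = 34 / (750 × 10) = 0.0045
3. Yield = $e^{-\text{DPU}}$ = $2.7183^{-0.045}$ = 0.956 = 95.6%
4. DPMO = DPO × $10^6$ = 4500
5. Sigma Level = 2.611 (the normal Z value for a DPO of 0.0045; with the conventional 1.5σ shift this corresponds to about 4.1 sigma)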
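The worked calculation can be sketched as below, using unrounded intermediate values, so the results differ slightly from the rounded figures in the text. The variable names are illustrative; the sigma level is taken as the standard normal Z value for a tail area equal to DPO, and the 1.5σ shift is shown only as the usual convention, not as something stated in this handout.

```python
import math
from statistics import NormalDist

defects, units, opps_per_unit = 34, 750, 10

dpu = defects / units                          # defects per unit
dpo = defects / (units * opps_per_unit)        # defects per opportunity
dpmo = dpo * 1_000_000                         # defects per million opportunities
yield_poisson = math.exp(-dpu)                 # Yield = e^(-DPU)
z_long_term = NormalDist().inv_cdf(1 - dpo)    # Z value with DPO in the upper tail
z_shifted = z_long_term + 1.5                  # assumed conventional 1.5-sigma shift

print(f"DPU={dpu:.4f} DPO={dpo:.5f} DPMO={dpmo:.0f} "
      f"Yield={yield_poisson:.3f} Z={z_long_term:.2f} shifted={z_shifted:.2f}")
```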

Page 7: Six Sigma Green Belt

Technical terminology of Six Sigma Management

CTQ: A CTQ is a measure or proxy of what is important to a customer. Examples of CTQs are:

I. The mean and range of the waiting times in a physician's office for four patients selected each day at 10.00 am, 2.00 pm and 4.00 pm.
II. The percentage of errors in ATM transactions for a bank's customers per month.
III. The number of car accidents per month on a particular stretch of highway.

Six Sigma projects are designed to improve CTQs.

Unit: A unit is the item (e.g. product or component, service or service step, or time period) to be studied with a Six Sigma project.

Defective: A non-conforming unit is a defective unit.

Defect: A defect is a non-conformance on one of many possible quality characteristics of a unit that causes customer dissatisfaction.

Defect Opportunity: A defect opportunity is each circumstance in which a CTQ can fail to be met. There may be many opportunities for defects within a defined unit. For example, a service has four component parts. If each component part contains three opportunities for a defect, then the service has 12 defect opportunities in which a CTQ can fail to be met.

Defects per unit (DPU): Defects per unit refers to the average of all the defects for a given number of units, that is, the total number of defects for n units divided by n, the number of units. If you are producing a 50-page document, the unit is a page. If there are 150 spelling errors, DPU is 150/50 = 3.

Defects per Opportunity (DPO): Defects per opportunity refers to the total number of defects divided by the total number of defect opportunities. DPO = DPU / (number of opportunities per unit).

Defects Per Million Opportunities (DPMO): DPMO equals DPO multiplied by one million.

Yield: Yield is the number of units within specification divided by the total number of units. If 25 units are served to customers and 20 are good, then the yield is 20/25 = 0.80.

Rolled Throughput Yield (RTY): Rolled throughput yield is the product of the yields from each step in a process. RTY is the probability of a unit passing through each of k independent steps of a process the first time without incurring one or more defects at each of the k steps.

$$\text{RTY} = Y_1 \times Y_2 \times \dots \times Y_k$$

where k = number of steps in a process, or the number of component parts or steps in a product or service. The yield Y for each step or component must be calculated to compute the RTY. For those steps in which the number of opportunities is equal to the number of units, Y = 1 - DPU. Where a unit can carry more than one defect, the Poisson approximation $Y = e^{-\text{DPU}}$ is used.

For example, if a process has three independent steps and the yield from the first step (Y1) is 99.7%, the yield from the second step (Y2) is 99.5% and the yield from the third step (Y3) is 89.7%, then the RTY is 88.98% (0.997 × 0.995 × 0.897).

Page 8: Six Sigma Green Belt

KANO MODEL: Kano surveys embrace a set of market research tools used for three purposes:

• To improve existing products, services or processes, or to create less expensive versions of existing products, services or processes; these are called Level A surveys.

• To create major new features for existing products, services or processes; these are called Level B surveys.

• To invent and innovate entirely new products, services or processes; these are called Level C surveys.

KANO CATEGORIES: There are six Kano category classifications for cognitive images.

• One-Dimensional (O): User satisfaction is proportional to the performance of the feature: the less the performance, the less the user satisfaction, and the more the performance, the more the user satisfaction.

• Must-Be (M): User satisfaction is not proportional to the performance of the feature: low performance creates dissatisfaction with the feature, but high performance creates only feelings of indifference to the feature.

• Attractive (A): Again, user satisfaction is not proportional to the performance of the feature. However, in this case, low levels of performance create feelings of indifference to the feature, but high levels of performance create feelings of delight.

• Reverse (R): The researcher's a priori judgment about the user's view of the feature is the opposite of the user's view.

• Indifferent (I): The user is indifferent to the presence or absence of the feature.

• Questionable (Q): There is a contradiction in the user's responses about the feature.

[Figure: Kano feature categories of quality. Vertical axis from Customer Dissatisfied to Customer Satisfied; horizontal axis from Product Dysfunctional to Product Fully Functional. Curves: Attractive (exciting quality), One-Dimensional (expected quality) and Must-Be (basic quality); competitive pressure is also marked.]













Page 9: Six Sigma Green Belt

The Six Sigma Methodology: The Six Sigma methodology uses a modified form of the Shewhart PDCA (Plan-Do-Check-Act) cycle, or Deming's PDSA (Plan-Do-Study-Act) cycle, which is called DMAIC (Define-Measure-Analyze-Improve-Control). Variation is reduced as the problem passes through the funnel of the Six Sigma methodology, from all possible X's to a few X's. This is sometimes called the breakthrough strategy.

Six Sigma Approach: A five-phase approach called DMAIC is followed:

D: Define the project's purpose and scope, and get background on the process and customer.
M: Measure: focus the improvement effort by gathering information on the current situation.
A: Analyze: identify the root causes and confirm them with data.
I: Improve: develop, try out and implement solutions that address the root causes.
C: Control: evaluate the solutions and maintain the gains by setting up controls, standardizing and documenting work methods and processes, and anticipating future improvements.

Define phase:

A. Identify project CTQs.
B. Develop team charter.
C. Define process map.

1. Choose critical business and process issue.
2. Understand the voice of the customers.
3. Define the process and CTQs.
4. Define the team and training needs.
5. Define scope and opportunities of the project.
6. Develop the charter.
7. Map the process.






[Figure: DMAIC funnel with tool labels: Process map, C&E, MSA, Cpk; FMEA, Multi-vari; Design of; Control: SPC, fail-safing, Control Plan.]


Page 10: Six Sigma Green Belt

Measure Phase:

A. Select CTQs (Customer, Product, Process).
B. Establish and validate measurement system.
C. Establish process capabilities.

1. Select the key product.
2. Create product tree.
3. Define performance variables and measurement process.
4. Determine data type and create check sheets.
5. Create detailed process map.
6. Select and measure performance variables; carry out MSA.

Analysis Phase:

A. Benchmarking and goal setting.
B. Gap analysis and root cause analysis.
C. Identify sources of variation.

1. Establish performance capabilities.
2. Benchmark performance metrics.
3. Discover best-in-class performance.
4. Conduct gap analysis.
5. Identify success factors.
6. Define performance goal.

Improve Phase:

A. Select and diagnose the performance variable.
B. Establish the optimum solution.
C. Establish the tolerance on X's.

1. Create possible solutions for root causes.
2. Select solution: reduction of process variation.
3. Propose and confirm causal variables.
4. Create and implement plans.
5. Verify performance improvement and evaluate benefits.

Control Phase:

A. Select the variable for establishing controls.
B. Establish control system.
C. Evaluate the control system.

1. Summarize and communicate results.
2. Define, validate, implement and monitor the control system.
3. Fix ownership.
4. Recommend future plan.
5. Train teams.
6. Monitor performance metrics.

Page 11: Six Sigma Green Belt

Statistical methods in Six Sigma:
• Planning and collection of data.
• Presenting data.
• Summarization of data.
• Analysis of data.
• Drawing valid inferences from data, which are usually subject to variation.

What is statistical thinking?

Statistical thinking is a philosophy of learning and action based on the following fundamental principles:
• All work occurs in a system of interconnected processes.
• Variation exists in all processes.
• Understanding and reducing variation are keys to success.

Deming once said, “If I had to reduce my message for management to just a few words, I'd say it all had to do with reducing variation.”

Relationship between statistical thinking and statistical methods:

Benefits of statistical thinking:

• Provides a theory and methodology for improvement.
• Helps identify where improvement is needed.
• Provides a general approach to take.
• Suggests tools to use.

A complete improvement approach includes all elements of statistical thinking.

Process ⇒ Variation ⇒ Data

[Figure: Expanding world of statistics (the way we think). Levels: Problem Solving, Product / Process Improvement, Organizational Improvement. Chain: Process → Variation → Data → Statistical Tools.]



Page 12: Six Sigma Green Belt

Use of statistical thinking depends on level of activity and job responsibility:

Strategic: where we're headed (Executives)
Managerial: processes to guide us (Managers)
Operational: where the work gets done (Workers)

Examples of operational processes

• Manufacturing • Order Entry • Delivery • Distribution • Billing • Collection • Service

Examples of statistical thinking at the operational level

• Work processes are mapped and documented.
• Key measurements are identified.
  - Time plots are displayed.
• Process management and improvement utilize
  - Knowledge of variation, and
  - Data.
• Improvement activities focus on the process, not on blaming employees.

Examples of Managerial processes:

• Employee Selection • Training and Development • Performance Management • Recognition and Reward • Budgeting • Setting objectives and goals • Project Management • Communication • Management Reporting • Planning





Page 13: Six Sigma Green Belt

Examples of statistical thinking at the managerial level

• Managers use meeting management techniques.
• Standardized project management systems are in place.
• Both project process and results are reviewed.
• Process variation is considered when setting goals.
• Measurement is viewed as a process.
• The number of suppliers is reduced.
• A variety of communication media are used.

Examples of Strategic Processes

• Strategic plan development
• Acquisitions
• Corporate budget development
• Communications: internal and external
• Succession planning and deployment
• Organizational improvement

Examples of Statistical Thinking at the Strategic Level

• Executives use a systems approach.
• Core processes have been flowcharted.
• Strategic direction is defined and deployed.
• Measurement systems are in place.
• Employee, customer and benchmarking studies are used to drive improvement.
• Experimentation is encouraged.

Robustness in Management

• Develop strategies that are insensitive to economic trends and cycles.
• Design a project system that is insensitive to
  o Personnel changes
  o Changes in project scope
  o Variations in business conditions.
• Respond to differing employee needs.
• Adopt flexible work hours.
• Enable personnel to adapt to changing business needs.
• Ensure meeting effectiveness is not dependent on facilities, equipment or participants.

Page 14: Six Sigma Green Belt

Understanding Human Behaviour
• Different people have different methods and styles of working, learning and thinking.
• Different people take in, process and communicate information in different ways.
• People vary; they are different
  - Day to day
  - Person to person
  - Group to group
  - Organization to organization

Three ways to reduce variation and improve quality:
• Control the process: eliminate special cause variation.
• Improve the system: reduce common cause variation.
• Anticipate variation: design robust processes and products.

Together these lead to quality improvement.

Process Robustness Analysis
• Identify those uncontrolled factors that affect process performance
  o Weather
  o Customer use of products
  o Employee knowledge, skills, experience and work habits
  o Age of equipment
• Design the process to be insensitive to the uncontrollable variation in these factors.

Page 15: Six Sigma Green Belt

Population: The collection of all elements under consideration, about which we are trying to draw conclusions. Population elements may be:

• Objects • Entities • Units • People ... etc.

Generally each element has one or more characteristics (attributes) of interest. When a particular characteristic is measured we obtain a value, which varies from case to case; hence each characteristic is termed a variable. Recording the value of a variable for each case amounts to collecting data.

Sample: A subset of the elements selected from a population with a view to drawing inferences about the population characteristics.

• A sample is part of a population.
• The objective of statistics is to draw conclusions about the population using sample data.

[Figure: Population, with Sample shown as a portion or subset of the population.]

Sample data should be

• Relevant • Representative • Adequate • Reliable

Advantages of sampling
• Sampling is less costly (cost effectiveness).
• Total enumeration may not be free from errors either (inspection fatigue).
• Sampling inspection may have relatively less inspection error, and the sampling error can be estimated.
• When inspection is destructive, sampling is the only way.

Types of Sample

Random Sample: Each member of the population has an equal chance of being selected.

Simple Random Sample: All samples of the same size are equally likely.

• Assign a number to each member of the population. Random numbers can be generated by a random number table, a software program or a calculator.

• Data from members of the population that correspond to these numbers become members of the sample.
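The two steps above (number the population, then pick the members whose numbers are drawn) can be sketched with Python's standard `random` module standing in for the random number table. The population of 100 parts, the seed and the sample size of 10 are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical population: 100 members, numbered 1..100
population = [f"part-{i:03d}" for i in range(1, 101)]

rng = random.Random(42)  # fixed seed so the draw is repeatable

# Step 1: draw 10 distinct random numbers between 1 and 100
numbers = rng.sample(range(1, len(population) + 1), k=10)

# Step 2: members that correspond to these numbers form the sample
sample = [population[n - 1] for n in numbers]

print(sorted(numbers))
print(sample)
```

Sampling without replacement, as here, guarantees that no member appears twice in the sample.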


Page 16: Six Sigma Green Belt

Simple Random Sample:
• Each population element has an equal chance of being selected.
• Selecting one subject does not affect the selection of others.
• May use a random number table or lottery.

Stratified Random Samples: Divide the population into groups (strata, or layers) and select a random sample from each group. Strata could be, for example, raw materials, vendors or processes.

Cluster Samples: Divide the population into individual units or groups and randomly select one or more units. The sample consists of all members from the selected unit(s).

Systematic Samples: Choose a starting value at random, and then choose sample members at regular intervals. We say we choose every Kth member; for example, with K = 5, every 5th member of the population is selected.

Convenience Sample: Choose readily available members of the population for your sample.

Statistical Methods

• Descriptive statistics
  - Collecting and describing data.

• Inferential statistics
  - Drawing conclusions and / or making decisions concerning a population based only on sample data.

Descriptive statistics

• Collect data, e.g. survey
• Present data, e.g. tables and graphs
• Characterize data, e.g. sample mean

Inferential statistics (Conclusion)

• Estimation, e.g. estimate the population mean weight using the sample mean weight
• Hypothesis testing (Assumption), e.g. test a claim about the value of the population mean weight

Drawing conclusions and / or making decisions concerning a population based on sample results.
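Since every conclusion about a population rests on how the sample was drawn, the stratified, systematic and cluster schemes described on this page can be sketched as follows. The population of 60 items from three vendors, the seed and all function names are hypothetical and serve only as an illustration.

```python
import random

rng = random.Random(7)  # fixed seed for a repeatable illustration

# Hypothetical population: 60 items, 20 from each of three vendors (the strata)
population = [(vendor, i) for vendor in ("A", "B", "C") for i in range(20)]

def stratified_sample(items, per_stratum):
    """Random sample of equal size from each stratum (first tuple field)."""
    strata = {}
    for item in items:
        strata.setdefault(item[0], []).append(item)
    return [x for group in strata.values() for x in rng.sample(group, per_stratum)]

def systematic_sample(items, k):
    """Random start among the first k members, then every k-th member."""
    start = rng.randrange(k)
    return items[start::k]

def cluster_sample(items, n_clusters):
    """Randomly pick whole clusters (here: vendors) and keep all their members."""
    clusters = sorted({item[0] for item in items})
    chosen = set(rng.sample(clusters, n_clusters))
    return [item for item in items if item[0] in chosen]

strat = stratified_sample(population, per_stratum=3)
syst = systematic_sample(population, k=5)
clus = cluster_sample(population, n_clusters=1)
print(len(strat), len(syst), len(clus))
```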


Page 17: Six Sigma Green Belt


Statistical Studies:

Enumerative Study

• Involves decision making about a population.
• The frame is a listing of all population units. Example: names in a telephone book.
• Example: political poll.

Analytical Study

• Involves action on a process.
• Improves future performance.
• No identifiable universe or frame, e.g. a production process.

Types of Data

By source:
• Primary (data collection): observation, experimentation, survey
• Secondary (data compilation): print or electronic

By nature:
• Categorical (Qualitative)
• Numerical (Quantitative): Discrete or Continuous

Page 18: Six Sigma Green Belt

Data summarization methods:
• Graphical methods.
• Tabular summarization.
• Numerical indices.

Graphical Methods: Graphic displays provide better insight than is often possible with words or numbers.

Contingency table

• Shows the number of observations jointly in two categorical variables, e.g. a gender variable and a major variable.
• May include row, column or total %.
• Helps find relationships.
• Used widely in marketing.

1. Residence: C C O O C C O O C O
   Gender:    M F F M M M F M M F
Where C = on-campus, O = off-campus, M = male, F = female

Residence     Male      Female    Total
On-campus     4 (80)    1 (20)    5 (100)
Off-campus    2 (40)    3 (60)    5 (100)
Total         6 (60)    4 (40)    10 (100)

(Figures in parentheses are row percentages.)
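The residence-by-gender table above can be rebuilt from the raw observations by counting pairs. This is a small sketch using only the standard library; the variable names are illustrative.

```python
from collections import Counter

residence = "C C O O C C O O C O".split()
gender = "M F F M M M F M M F".split()

# Count each (residence, gender) pair across the ten observations
counts = Counter(zip(residence, gender))

for res in ("C", "O"):
    row_total = sum(counts[(res, g)] for g in ("M", "F"))
    cells = [
        f"{g}: {counts[(res, g)]} ({100 * counts[(res, g)] / row_total:.0f}%)"
        for g in ("M", "F")
    ]
    print(res, *cells, f"total: {row_total}")
```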

2. You are a marketing research analyst for Visa. You want to analyze data on credit card users' annual income (in thousands of US $).

Income: 12 20 32 45 72 46 18 55
Use:    Y  N  N  Y  Y  Y  N  Y
(Income categories: under US $25,000; $25,000 & over)
Use categories: Y = use credit cards, N = don't use

Income          No        Yes       Total
Under $25 K     2 (67)    1 (33)    3 (100)
$25 K & over    1 (20)    4 (80)    5 (100)
Total           3 (38)    5 (62)    8 (100)

Graphical Tools

• Bar Chart • Pie Chart • Histogram • Frequency Curve • Scatter Diagram • Control Charts • Box Plots

Page 19: Six Sigma Green Belt

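Bar Chart:

[Figure: bar chart with categories Acct., Econ. and Mgmt. on the horizontal axis. Bar length shows frequency, bars have equal width, and the frequency axis starts at a zero point.]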

Pie Chart:

• Shows the breakdown of a total quantity into categories.
• Useful for showing relative differences.
• Angle size = 360° × percent, e.g. 360° × 10% = 36°


Example: You are an analyst for IRI. You want to show the market shares held by Windows program manufacturers in 1992. Construct a bar graph and a pie chart to describe the data.

Mfg.           Mkt. Share (%)
Lotus          15
Microsoft      60
WordPerfect    10
Others         15

Dot plot:

1. Condenses data by grouping the same values together.
2. Each numerical value is located by a dot on a horizontal axis.
3. Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41.

[Figure: dot plot of the data on a horizontal axis marked 20, 25, 30, 35, 40, 45.]

Stem-and-leaf display:

1. Divide each observation into a stem value and a leaf value.
   - The stem value defines the class.
   - The leaf values define the frequency.
2. Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41

2 | 144677
3 | 028
4 | 1
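The display above can be generated mechanically. The following sketch assumes two-digit data, so that the tens digit is the stem and the units digit is the leaf; the variable names are illustrative.

```python
from collections import defaultdict

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]

stems = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)   # tens digit is the stem, units digit the leaf
    stems[stem].append(str(leaf))

lines = [f"{stem} | {''.join(leaves)}" for stem, leaves in sorted(stems.items())]
print("\n".join(lines))
```

The number of leaves on each stem gives the class frequency, so the display doubles as a sideways histogram.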


Page 20: Six Sigma Green Belt

Histogram: It is a bar chart of a frequency distribution. It highlights the center and the amount of variation in the sample of data. The simplicity of construction and interpretation of the histogram makes it an effective tool in the elementary analysis of data. Many problems in quality control have been solved with this one elementary tool alone.

[Figure: a typical histogram, with LSL, USL, tolerance and capability marked.]

The histogram describes the variation in the process. It is used to

1. Solve problems.
2. Determine the process capability.
3. Compare with specifications.
4. Suggest the shape of the population, and
5. Indicate discrepancies in data, such as gaps.

A frequency curve uses a smooth curve rather than the rectangular shapes associated with the histogram. A smooth curve represents a population frequency distribution, whereas the histogram represents a sample frequency distribution.

A measure of central tendency of a distribution is a numerical value that describes the central position of the data, or how the data tend to build up in the center. There are three measures in common use: 1. Mean. 2. Median. 3. Mode.





Page 21: Six Sigma Green Belt

Numerical Indices: Data can be summarized using

• Measures of central tendency.
• Measures of dispersion.

Measure of central tendency: A value which is representative of the set of data, as most of the data is centered around this value. An important measure of central tendency is the mean (arithmetic mean).

Mean: The mean is the sum of the observations divided by the number of observations.

• The most common measure of central tendency.
• Affected by extreme values (outliers).

Ungrouped data:

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

where $\bar{X}$ = average and n = number of observed values.

Grouped data:

X:          X1  X2  ...  Xk
Frequency:  f1  f2  ...  fk

$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{n}$$

where n = sum of the frequencies, $f_i$ = frequency in a cell or frequency of an observed value, $X_i$ = cell midpoint or an observed value, and k = number of cells or number of observed values.

[Figure: two number-line plots; the first has mean = 5 and the second, whose scale extends to 14, has mean = 6, showing how an extreme value pulls the mean.]

Temp. °C (X)    No. of days (f)    Xf
25              2                  50
26              3                  78
27              4                  108
28              3                  84
29              1                  29
30              2                  60
Total           15                 409

Average Temp ($\bar{X}$) = 409/15 = 27.27
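The grouped-mean calculation can be checked with a short sketch. It recomputes Σ(f × X) directly from the temperatures and day counts listed in the table (25 to 30 °C with 2, 3, 4, 3, 1 and 2 days), which gives 409 over 15 days; the variable names are illustrative.

```python
from statistics import mean

# Grouped data from the temperature table: value -> frequency (number of days)
days_at_temp = {25: 2, 26: 3, 27: 4, 28: 3, 29: 1, 30: 2}

n = sum(days_at_temp.values())                        # sum of the frequencies
sum_xf = sum(x * f for x, f in days_at_temp.items())  # sum of X times f
grouped_mean = sum_xf / n

# Expanding to raw observations gives the same answer via the ungrouped formula
raw = [x for x, f in days_at_temp.items() for _ in range(f)]

print(n, sum_xf, round(grouped_mean, 2), round(mean(raw), 2))
```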


Page 22: Six Sigma Green Belt

Median (M): The median is defined as the value that divides a series of ordered observations so that the number of items above it is equal to the number below it.

• Robust measure of central tendency.
• Not affected by extreme values.
• In an ordered array, the median is the “middle” number.

Ungrouped data:
I. If n is odd, the median is the middle number, at position (n+1)/2.
II. If n is even, the median is the average of the two middle numbers, at positions n/2 and n/2 + 1.

1. Arrange all values in order of size, from smallest to largest.
2. If the number of values (n) is odd, the median is the center value in the ordered list. The location of the median is obtained by counting (n+1)/2 observations from the bottom of the list.

Consider the data set 490, 400, 450, 420 and 430. To find the median, we first arrange the data from the smallest to the largest value: 400, 420, 430, 450, 490. The median is in position (n+1)/2 = (5+1)/2 = 3, so the median is 430.

3. If the number of observations is even, the median M is the average of the two center observations in the ordered list.

e.g. for 70, 75, 77, 82, 88, 100, 105, 108 the median is the average of the 4th and 5th values, i.e. (82 + 88)/2 = 85.

The median has several advantages over the mean. The most important is that extreme values do not affect the median as strongly as they do the mean; that is, the mean is much more sensitive to outlier values than the median.

Grouped data:

$$M = L_m + \left(\frac{\frac{n}{2} - Cf_m}{f_m}\right) i$$

Where M = median. Lm = lower boundary of the cell containing the median. n = total number of observations. Cfm = cumulative frequency of all cells below Lm. fm = frequency of the median cell. i = cell interval.

The median of grouped data is not used too frequently.
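The odd and even rules for ungrouped data can be sketched as follows. The function name `median_ungrouped` is illustrative; the two data sets are the ones used in the examples above.

```python
def median_ungrouped(values):
    """Median of ungrouped data using the odd/even position rules."""
    ordered = sorted(values)  # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")
    mid = n // 2
    if n % 2 == 1:
        # odd n: the value at 1-based position (n + 1) / 2
        return ordered[mid]
    # even n: average of the values at 1-based positions n/2 and n/2 + 1
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median_ungrouped([490, 400, 450, 420, 430])
even_median = median_ungrouped([70, 75, 77, 82, 88, 100, 105, 108])
print(odd_median, even_median)
```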



Mode: The mode of a set of numbers is the value that occurs with the greatest frequency.

• A measure of central tendency.
• Value that occurs most often.
• Not affected by extreme values.
• Used for either numerical or categorical data.
• There may be no mode.
• There may be several modes.

[Figure: two number-line plots, one with Mode = 9 and one with no mode.]

The empirical relationship among the mean, median and mode is

Mean − Mode = 3 (Mean − Median)

Percentile: The pth percentile of the data is the value such that p percent of the observations fall at or below it. The median is the 50th percentile, the first quartile is the 25th percentile and the third quartile is the 75th percentile.

Example: You are a financial analyst for a bank. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11. Describe the stock prices in terms of central tendency.

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \dots + X_8}{8} = \frac{17+16+21+18+13+16+12+11}{8} = 15.5$$

Median (M)
Raw data:  17  16  21  18  13  16  12  11
Ordered:   11  12  13  16  16  17  18  21
Position:   1   2   3   4   5   6   7   8
Position point: n/2 and n/2 + 1, i.e. positions 4 and 5.
Median (M) = (16 + 16)/2 = 16

Mode = 16

Midrange = (X smallest + X largest)/2 = (11 + 21)/2 = 16

Q1 position = 1·(n+1)/4 = 1·(8+1)/4 = 2.25
Q1 = 12.3

Q3 position = 3·(n+1)/4 = 3·(8+1)/4 = 6.75 ≈ 7
Q3 = 18
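As a cross-check on the stock price example, the mean, median, mode and midrange can be computed with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative.

```python
import statistics

# Closing stock prices from the example
prices = [17, 16, 21, 18, 13, 16, 12, 11]

mean_price = statistics.mean(prices)        # arithmetic mean
median_price = statistics.median(prices)    # average of the 4th and 5th ordered values
mode_price = statistics.mode(prices)        # most frequent value
midrange = (min(prices) + max(prices)) / 2  # (smallest + largest) / 2

print(mean_price, median_price, mode_price, midrange)
```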


Dispersion: Variation is a fact of nature, and of industrial life too. No two items produced by the same process are exactly the same. Tests done on the same samples may vary from chemist to chemist or from laboratory to laboratory. This is true whether the test equipment involved is automatic or manually operated. Variation can arise from a lack of complete homogeneity of the chemicals used in the test, from variation in test environment conditions, or from differences in the skill of the chemists. Variation in the test results adds to the uncertainty of decisions, and hence it is important to measure and control variation.

Measure of variation: In summarizing data, the variability in the values is often an important feature of interest. The major measures of dispersion are the range, the interquartile range, the variance and the standard deviation.

[Figure: measures of variation: range, interquartile range, variance (population and sample) and standard deviation (population and sample).]

Range (R): The range is the difference between the largest and smallest values in a data set. That is,

Range (R) = X largest − X smallest

• Measure of variation.
• Difference between the largest and the smallest observations.
• Ignores the way in which the data are distributed.
• Used for small samples.

Example: for the values 7, 8, 9, 10, 11, 12, Range = 12 − 7 = 5.


Standard deviation and Variance: The most commonly used measure of dispersion is the standard deviation. The standard deviation is a numerical value, in the units of the observed values, that measures the spreading tendency of the data. A large standard deviation shows greater variability of the data than a small standard deviation does.

Standard Deviation
• Most important measure of variation.
• Shows variation about the mean.
• Has the same units as the original data.
• Takes into account all the values in the set of data.

Population standard deviation: It is denoted by the Greek symbol σ and is given by the root mean squared deviation from the mean µ. Suppose the test result values are X1, X2, X3, ..., XN. Then

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

Where σ = population standard deviation. Xi = observed value. N = number of observed values. µ = population mean.

Sample standard deviation (S): If the sample result values are X1, X2, X3, ..., Xn, it is given by

Ungrouped data:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$

Grouped data:

$$S = \sqrt{\frac{n\sum_{i=1}^{h} f_i X_i^2 - \left(\sum_{i=1}^{h} f_i X_i\right)^2}{n(n-1)}}$$

Variance:

Population variance (σ²):

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Sample variance (S²):

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$




Standard deviation of the sample test values:

Xi         Xi − X̄     (Xi − X̄)²
15         −5          25
18         −2           4
20          0           0
21          1           1
26          6          36
∑ = 100    ∑ = 0       ∑ = 66

$$\bar{X} = \frac{100}{5} = 20, \qquad S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n-1}} = \sqrt{\frac{66}{4}} = 4.062$$

Sample standard deviation (S) = 4.062 and sample variance (S²) = 66/4 = 16.5.

Some facts about the standard deviation formula

The above table is used to explain the standard deviation concept.

• The first column (Xi) gives five observed values, and from these values the average X̄ = 20 is obtained.
• The second column (Xi − X̄) is the deviation of the individual observed values from the average. The sum of the deviations is 0, which is always the case, so it cannot serve as a measure of dispersion.
• However, if the deviations are squared, they will all be positive or zero, and their sum will be greater than zero.
• The average of the squared deviations can be found by dividing by n; however, for theoretical reasons we divide by n − 1, which gives an answer that has the units squared. This result is not acceptable as a measure of the dispersion but is valuable as a measure of variability for advanced statistics. It is called the variance and is given the symbol S².
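The columns of the table can be reproduced step by step. The sketch below uses only the five observed values from the table; the variable names are illustrative.

```python
import math

observed = [15, 18, 20, 21, 26]  # Xi column of the table

n = len(observed)
mean = sum(observed) / n                      # X-bar
deviations = [x - mean for x in observed]     # Xi - X-bar column
squared = [d ** 2 for d in deviations]        # (Xi - X-bar)^2 column

sample_variance = sum(squared) / (n - 1)      # divide by n - 1, not n
sample_std = math.sqrt(sample_variance)       # back to the original units

print(mean, sum(deviations), sum(squared), sample_variance, round(sample_std, 3))
```

Taking the square root of the variance returns the measure to the units of the original data, which is why the standard deviation rather than the variance is quoted as the measure of dispersion.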

Coefficient of variation: The standard deviation is an absolute measure of dispersion that expresses variation in the same units as the original data. It cannot be the sole basis for comparing two distributions, especially if the data are measured on different scales or if the larger mean has the larger variation. In such cases we use the coefficient of variation. It is a relative measure of variation: it relates the standard deviation to the mean and expresses the standard deviation as a percentage of the mean.

$$CV = \frac{\text{Standard deviation } (\sigma)}{\text{Mean } (\mu)} \times 100$$

Example: Laboratory 1 can complete on average 40 analyses per day with a standard deviation of 5, whereas laboratory 2 can complete 160 analyses per day with a standard deviation of 15. Which laboratory shows more consistency?

Lab 1: Coefficient of variation = 5/40 × 100 = 12.5%
Lab 2: Coefficient of variation = 15/160 × 100 = 9.4%

Laboratory 2 has less relative variation and is therefore more consistent.
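The laboratory comparison can be sketched in a few lines. The helper name `coefficient_of_variation` is an assumption made for illustration; the inputs are the means and standard deviations stated in the example (40 and 5 for laboratory 1, 160 and 15 for laboratory 2).

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return std_dev / mean * 100


cv_lab1 = coefficient_of_variation(5, 40)
cv_lab2 = coefficient_of_variation(15, 160)

# The laboratory with the smaller CV has less relative variation
more_consistent = "Lab 1" if cv_lab1 < cv_lab2 else "Lab 2"
print(cv_lab1, cv_lab2, more_consistent)
```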



Example: You are a financial analyst for a bank. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11. Describe the volatility of the stock prices.

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \dots + X_8}{8} = 15.5$$

$$S^2 = \frac{(17-15.5)^2 + (16-15.5)^2 + \dots + (11-15.5)^2}{8-1} = 11.14$$

S = √11.14 = 3.34

Coefficient of variation (CV) = (S/X̄) × 100 = (3.34/15.5) × 100 = 21.5%

Quartile: Quartiles divide the data into four equal parts. Each part contains 25% of the values. Q1 is called the first or lower quartile, Q3 is called the third or upper quartile, and Q2 is the median.

Interquartile range (IQR): It is the difference between the third and the first quartiles of a set of values. That is,

IQR = Q3 − Q1

The interquartile range is a simple measure of spread that gives the range covered by the middle half of the data. It reflects the variability of the middle 50 per cent of the data. The quartiles and the IQR are unaffected by extreme values.

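Calculation of quartile:

• Arrange the data in increasing order and locate the median.
• The first quartile is the median of the observations below the location of the median.
• The third quartile is the median of the observations above the median of the data.

[Figure: ordered data from the minimum value to the maximum value, divided into quarters by the first quartile (Q1), second quartile (Q2) and third quartile (Q3); the interquartile range spans Q1 to Q3, with ¼ of the values below Q1 and ¼ above Q3.]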


Example: The data below give the daily emission of sulphur oxide of an industrial plant:

15.8, 26.4, 17.3, 11.2, 23.9, 24.8, 16.2, 12.8, 22.7, 28.8, 7.2, 13.5, 18.1, 17.9, 23.5

Determine the quartiles and the interquartile range.

Arrange the data in increasing order, i.e.

7.2, 11.2, 12.8, 13.5, 15.8, 16.2, 17.3, 17.9, 18.1, 22.7, 23.5, 23.9, 24.8, 26.4, 28.8

Q2 = Median = 17.9, Q1 = 13.5 and Q3 = 23.9
Interquartile range (IQR) = Q3 − Q1 = 23.9 − 13.5 = 10.4

Box and whisker plot: Graphical display of data using the 5-number summary.

[Figure: box and whisker plot marking X smallest, Q1, Median, Q3 and X largest.]

Relationship among the measures of central tendency.

[Figure: symmetrical, right-skewed and left-skewed distributions with the positions of the mean, median and mode marked.]

Differences among the mean, median and mode are shown in the figure above. When the distribution is symmetrical, the values for the mean, median and mode are identical; when the distribution is skewed, the values are different.

The mean is the most commonly used measure of central tendency. It is used when the distribution is symmetrical. The median becomes an effective measure of central tendency when the distribution is skewed to the right or to the left. It is used when an exact midpoint of a distribution is desired. When a distribution has extreme values, the mean will be adversely affected while the median will remain unchanged. The mode is used when a quick and approximate measure of central tendency is desired.
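The quartile procedure (median first, then the median of each half) can be sketched for the sulphur oxide data from the example above. The function name `quartiles_by_halves` is illustrative, and excluding the median itself from both halves when n is odd is the convention assumed here; other quartile conventions can give slightly different values.

```python
import statistics


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3): Q2 is the median, Q1 and Q3 are the medians
    of the observations below and above the median's location."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # for odd n the median itself is left out of both halves
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))


emissions = [15.8, 26.4, 17.3, 11.2, 23.9, 24.8, 16.2, 12.8,
             22.7, 28.8, 7.2, 13.5, 18.1, 17.9, 23.5]

q1, q2, q3 = quartiles_by_halves(emissions)
iqr = q3 - q1
print(q1, q2, q3, round(iqr, 1))
```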



THE NORMAL CURVE: A population curve or distribution is developed from a frequency histogram. As the sample size of a histogram gets larger and larger and the cell interval gets smaller and smaller, the histogram takes on the appearance of a smooth polygon or curve representing the population. One such curve is called the normal curve or Gaussian distribution. The normal curve is a symmetrical, unimodal, bell-shaped distribution with the mean, median and mode having the same value.

[Figure: standardized normal curve f(Z) plotted against Z from −3 to 3.]

All normal distributions of continuous variables can be converted to the standardized normal distribution by using the standardized normal value Z:

$$Z = \frac{X_i - \mu}{\sigma}$$

The formula for the standardized normal curve is:

$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-Z^2/2} = 0.3989\, e^{-Z^2/2}$$

where π = 3.14159, e = 2.71828 and Z = (Xi − µ)/σ.

Properties of the normal distribution
1. Mean, median and mode are identical.
2. It is a bell-shaped curve.
3. It is symmetric about the mean.
4. The curve extends from −∞ to +∞.
5. The curve represents a population of infinite size. It is defined by two parameters, i.e. the mean and the standard deviation.



Relationship to the Mean and Standard Deviation: As we have seen from the formula for the standardized normal curve, there is a definite relationship among the mean, the standard deviation and the normal curve.

[Figure: three normal curves plotted against X with the same mean and σ = 1.5, σ = 3.0 and σ = 4.5.]

The figure above shows three normal curves with the same mean but different standard deviations. The larger the standard deviation, the flatter the curve (the data are widely dispersed); the smaller the standard deviation, the more peaked the curve (the data are narrowly dispersed). If the standard deviation is zero, all values are identical to the mean and there is no curve.

A relationship exists between the standard deviation and the area under the normal curve, as shown in the figure.

Limits      % Area covered
µ ± 1σ      68.26%
µ ± 2σ      95.46%
µ ± 3σ      99.73%
µ ± ∞       100%

[Figure: normal curve with −3σ, −2σ, −1σ, µ, 1σ, 2σ and 3σ marked, showing 68.26%, 95.46% and 99.73% of the area.]

Application:

1. The main application is that 99.73% of the area is covered between the −3σ and +3σ limits.
2. It is the basis for control charts.
3. It is possible to find the percentage of the data that are less than a particular value, greater than a particular value, or between two specified limits.



Sigma Level: Calculate the normal value Z, where

$$Z = \frac{X_i - \mu}{\sigma}$$

The Z value indicates how many sigma (σ) units the X value is from the mean (µ). For example, if the USL for a process is 16, and the process average and standard deviation are calculated as 10.0 and 2.0 respectively, then the Z value corresponding to the upper specification is

Z = (16 − 10)/2 = 3.0

Using the normal tables, a Z value of 3 corresponds to a probability of 0.99865, meaning that 99.865% of the process distribution is less than the X value that is three sigma units above the mean. That implies that 1 − 0.99865 = 0.00135, or 0.135%, of the process exceeds this X value, i.e. 0.00135 × 10⁶ = 1350 DPMO. Adding the conventional 1.5σ shift, the sigma level is 3.0 + 1.5 = 4.5.

[Figure: normal curve of the measured process with Z = ±3 marked; 99.865% of the area lies below Z = +3 and 0.135% above it.]
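The Z value and DPMO calculation above can be sketched with `statistics.NormalDist` from the Python standard library in place of printed normal tables. The function name `z_value` is illustrative, and the 1.5σ shift added at the end is the usual Six Sigma convention, treated here as an assumption.

```python
from statistics import NormalDist


def z_value(x, mean, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sigma


# Values from the example: USL = 16, process average = 10.0, sigma = 2.0
z = z_value(16.0, 10.0, 2.0)

p_below = NormalDist().cdf(z)      # fraction of the process below the USL
p_above = 1.0 - p_below            # fraction exceeding the USL
dpmo = p_above * 1_000_000         # defects per million opportunities

sigma_level = z + 1.5              # assumption: conventional 1.5 sigma shift

print(z, round(p_below, 5), round(dpmo), sigma_level)
```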


Statistical process control: A collection of strategies, techniques and actions taken by an organization to ensure it is producing a quality product or providing a quality service.

• A methodology for monitoring a process to identify special causes of variation and signal the need to take corrective action when appropriate.
• SPC relies on control charts.
• Establish a state of statistical control.
• Monitor a process and signal when it goes out of control.
• Determine process capability.

Sources of variation: There is variation in all parts produced by a manufacturing process. Chance variation is random in nature and cannot be entirely eliminated. Assignable variation is not random in nature and can be reduced or eliminated by investigating the problem and finding the cause.

Types of variation:

Variation              Cause                    Process
Normal or chance       Common or natural        In control
Unusual or abnormal    Special or assignable    Out of control

Causes of variation in quality: In a manufacturing process the quality of any product will vary from unit to unit due to various causes:
1. Chance causes.
2. Assignable causes.

Chance cause: A cause of variation that is small in magnitude and difficult to identify; also called a random or common cause.

Assignable cause: A cause of variation that is large in magnitude and easily identified; also called a special cause.

[Figure: control chart plotted by subgroup. Points between the LCL and UCL, around the centerline X̄, show natural variation (chance causes present; management system); points above the UCL or below the LCL show unnatural variation (assignable causes present; operations).]







Quality terminology: Quality Assurance refers to the entire system of policies, procedures and guidelines established by an organization to achieve and maintain quality. The objective of Quality Engineering is to include quality in the design of products and processes and to identify potential quality problems prior to production.

• Quality control consists of making a series of inspections and measurements to determine whether quality standards are being met.
• The goal of SPC is to determine whether the process can be continued or whether it should be adjusted to achieve a desired quality level.
• If the variation in the quality of the production output is due to assignable causes, the process should be adjusted or corrected as soon as possible.
• If the variation in output is due to common causes, which the manager cannot control, the process does not need to be adjusted.
• SPC procedures are based on hypothesis-testing methodology.
• The null hypothesis H0 is formulated in terms of the production process being in control.
• The alternative hypothesis H1 is formulated in terms of the process being out of control.
• As with other hypothesis-testing procedures, both a Type I error (adjusting an in-control process) and a Type II error (allowing an out-of-control process to continue) are possible.
• SPC uses graphical displays known as control charts to monitor a production process.
• Control charts provide a basis for deciding whether the variation in the output is due to common causes (in control) or assignable causes (out of control).

SPC applied to services

• The nature of a defect is different in services.
• A service defect is a failure to meet the customer’s requirements.
• Monitor times and customer satisfaction.

Service quality examples:

Hospitals: Timeliness, responsiveness, accuracy of lab tests.
Grocery stores: Checkout time, stocking, cleanliness.
Airlines: Luggage handling, waiting times, courtesy.
Fast food restaurants: Waiting times, food quality, cleanliness, employee courtesy.
Catalog-order companies: Order accuracy, operator knowledge and courtesy, packaging, delivery time, phone order waiting time.
Insurance companies: Billing accuracy, timeliness of claims processing, agent availability and response time.



Process charts: Tools for monitoring process variation. The figure on the following slide shows a process control chart. It has an upper limit, a centerline and a lower limit.

[Figure: control chart (Shewhart control chart, 3σ) with the Upper Control Limit (UCL), Centerline (CL) and Lower Control Limit (LCL). Each point represents data from a sample; the points are plotted sequentially.]

Variables: A variable is a continuous measurement such as weight, height or volume.

Attribute: An attribute is the result of a binomial process that results in an either-or situation.

Variables and attributes each have their own most common types of chart.

Central requirements for properly using process charts:

• You must understand the generic process for implementing process charts. • You must know how to interpret process charts. • You need to know when different process charts are used. • You need to know how to compute limits for the different types of process charts.

Understanding process variation:

• Random variation is centered around a mean and occurs with a consistent amount of dispersion.

• This type of variation cannot be controlled. Hence, we refer to it as “uncontrolled variation”.

• The statistical tools discussed in this talk are not designed to detect random variation.

• Non-random or “special causes” variation results from some event. The event may be a shift in a process mean or some unexpected occurrence.



Process stability: Means that the variation we observe in the process is random variation. To determine process stability we use process charts.

Sampling Methods: To ensure that processes are stable, data are gathered in samples.

Random samples: Randomization is useful because it ensures independence among observations. To randomize means to sample in such a way that every piece of product has an equal chance of being selected for inspection.

Systematic samples: Systematic samples have some of the benefits of random samples without the difficulty of randomizing.

Sampling by rational subgroup: A rational subgroup is a group of data that is logically homogeneous; variation within the data can provide a yardstick for setting limits on the standard variation between subgroups.

A generalized procedure for developing process charts

• Identify critical operations in the process where inspection might be needed. These are operations in which, if the operation is performed improperly, the product will be negatively affected.

• Identify critical product characteristics. These are the attributes of the product that will result in either good or poor function of the product.
• Determine whether the critical product characteristic is a variable or an attribute.
• Select the appropriate process control chart from among the many types of control charts. This decision process and the types of chart available are discussed later.
• Establish the control limits and use the chart to continually improve.
• Update the limits when changes have been made to the process.



X-bar and R Charts: The X-bar chart is a process chart used to monitor the average of the characteristic being measured. To set up an X-bar chart, select samples from the process for the characteristic being measured. Then form the samples into rational subgroups. Next, find the average value of each sample by dividing the sum of the measurements by the sample size, and plot the value on the process control X-bar chart.

The R chart is used to monitor the variability or dispersion of the process. It is used in conjunction with the X-bar chart when the process characteristic is a variable. To develop an R chart, collect samples from the process and organize them into subgroups, usually of three to six items. Next, compute the range R by taking the difference between the high value and the low value in the subgroup. Then plot the R values on the R chart.

I. Control charts for variables:
- X-bar charts track process means.
- Range charts track process variation.

X-bar chart control limits:

$$UCL_{\bar{X}} = \bar{\bar{X}} + A_2\bar{R}, \qquad LCL_{\bar{X}} = \bar{\bar{X}} - A_2\bar{R}, \qquad \text{where } \bar{\bar{X}} = \frac{\sum_{i=1}^{k} \bar{X}_i}{k}$$

R chart control limits:

$$UCL_R = D_4\bar{R}, \qquad LCL_R = D_3\bar{R}, \qquad \text{where } \bar{R} = \frac{\sum_{i=1}^{k} R_i}{k}$$

Here k is the number of subgroups, and A2, D3 and D4 are tabulated constants that depend on the subgroup size.

[Figure: X-bar chart and R chart, each plotted by sample with UCL and LCL lines.]

II. Control charts for attributes:
- We now shift to charts for attributes. These charts deal with binomial and Poisson processes that are not measurements.
- We will now be thinking in terms of defects and defectives rather than diameters or widths.
- A defect is an irregularity or problem with a larger unit.
- A defective is a unit that, as a whole, is not acceptable or does not meet specifications.







p-Charts for proportion defective:
- The p-chart is a process chart that is used to graph the proportion of items in a sample that are defective (nonconforming to specifications).
- p-charts are effectively used to determine when there has been a shift in the proportion defective for a particular product or service.
- Typical applications of the p-chart include things like late deliveries, incomplete orders and clerical errors on written forms.

p-chart:

$$UCL_p = \bar{p} + Z\sigma_p, \qquad LCL_p = \bar{p} - Z\sigma_p, \qquad \sigma_p = \sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$

Where $\bar{p}$ = average proportion defective in the samples. n = sample size. Z = 3.

$$p = \frac{d}{n}, \qquad \bar{p} = \frac{\text{Total defectives}}{\text{Total sample observations}} = \frac{\sum d}{\sum n}$$

$$UCL_p = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}, \qquad LCL_p = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$

[Figure: p-chart plotted by sample number with UCL, CL and LCL lines.]

np-Charts:
- The np-chart is a graph of the number of defectives (or nonconforming units) in a subgroup. The np-chart requires that the sample size of each subgroup be the same each time a sample is drawn.
- When subgroup sizes are equal, either the p-chart or the np-chart can be used. They are essentially the same chart.
- Some people find the np-chart easier to use because it reflects integer numbers rather than proportions. The uses for the np-chart are essentially the same as the uses for the p-chart.

$$CL = n\bar{p}, \qquad UCL_{np} = n\bar{p} + 3\sqrt{n\bar{p}(1-\bar{p})}, \qquad LCL_{np} = n\bar{p} - 3\sqrt{n\bar{p}(1-\bar{p})}$$

$$n\bar{p} = \frac{\sum np}{N}, \qquad \bar{p} = \frac{n\bar{p}}{n}$$

where N is the number of subgroups.








c-Charts:
- The c-chart is a graph of the number of defects (nonconformities) per unit. The units must be of the same sample space; this includes size, height, length, volume and so on. This means that the “area of opportunity” for finding defects must be the same for each unit. Several individual units can be grouped as if they are one unit of a larger size.
- Like other process charts, the c-chart is used to detect nonrandom events in the life of a production process. Typical applications of the c-chart include the number of flaws in an auto finish, the number of flaws in a standard typed letter, and the number of incorrect responses on a standardized test.

$$\bar{c} = \frac{\text{Total no. of defects}}{\text{Total no. of samples}}, \qquad \sigma_c = \sqrt{\bar{c}}$$

$$UCL_c = \bar{c} + Z\sigma_c = \bar{c} + 3\sqrt{\bar{c}}, \qquad LCL_c = \bar{c} - Z\sigma_c = \bar{c} - 3\sqrt{\bar{c}}$$

[Figure: c-chart of the number of defects plotted by sample number with UCL, CL and LCL lines.]

u-charts:
- The u-chart is a graph of the average number of defects per unit. This is contrasted with the c-chart, which shows the actual number of defects per standardized unit.
- The u-chart allows the units sampled to be of different sizes, areas, heights and so on, and allows for different numbers of units in each sample space. The uses for the u-chart are the same as for the c-chart.

s-chart: The s (standard deviation) chart is used in place of the R chart when a more sensitive chart is desired. These charts are commonly used in semiconductor production, where process dispersion is watched very closely.







Example X-Bar R chart

# Calculate sample means, sample ranges, mean of means, and mean of ranges.

Sample   Obs.1    Obs.2    Obs.3    Obs.4    Obs.5    Avg.     Range
1        10.68    10.689   10.776   10.798   10.714   10.732   0.116
2        10.79    10.86    10.601   10.745   10.779   10.755   0.259
3        10.78    10.667   10.838   10.785   10.723   10.759   0.171
4        10.59    10.727   10.812   10.775   10.73    10.727   0.221
5        10.69    10.708   10.79    10.758   10.671   10.724   0.119
6        10.75    10.714   10.738   10.719   10.606   10.705   0.143
7        10.79    10.713   10.689   10.877   10.603   10.735   0.274
8        10.74    10.779   10.11    10.737   10.75    10.624   0.669
9        10.77    10.773   10.641   10.644   10.725   10.710   0.132
10       10.72    10.671   10.708   10.85    10.712   10.732   0.179
11       10.79    10.821   10.764   10.658   10.708   10.748   0.153
12       10.62    10.802   10.818   10.872   10.727   10.768   0.250
13       10.66    10.822   10.893   10.544   10.75    10.733   0.349
14       10.81    10.749   10.859   10.801   10.701   10.783   0.158
15       10.66    10.681   10.644   10.747   10.728   10.692   0.103
Averages                                              10.728   0.2204

X-bar chart control limits:

$$UCL_{\bar{X}} = \bar{\bar{X}} + A_2\bar{R} = 10.728 + 0.58(0.2204) = 10.856$$
$$LCL_{\bar{X}} = \bar{\bar{X}} - A_2\bar{R} = 10.728 - 0.58(0.2204) = 10.601$$

R chart control limits:

$$UCL_R = D_4\bar{R} = (2.11)(0.2204) = 0.46504$$
$$LCL_R = D_3\bar{R} = (0)(0.2204) = 0$$

# You’re the manager of a 500-room hotel. You want to analyze the time it takes to deliver room service food orders to rooms. For 7 days, you collect data on 5 deliveries per day. Is the process in control?

Day   Delivery time                          Mean    Range
1     7.30    4.20    6.10    3.45    5.55   5.32    3.85
2     4.60    8.70    7.60    4.43    7.62   6.59    4.27
3     5.98    2.92    6.20    4.20    5.10   4.88    3.28
4     7.20    5.10    5.19    6.80    4.21   5.70    2.99
5     4.00    4.50    5.50    1.89    4.46   4.07    3.61
6     10.10   8.10    6.50    5.06    6.94   7.34    5.04
7     6.77    5.08    5.90    6.90    9.30   6.79    4.22
Average                                      5.813   3.894
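One way to answer the hotel question is to compute the X-bar and R chart limits directly from the delivery times. The sketch below is not part of the original example: the function name `xbar_r_limits` is illustrative, and the constants A2 = 0.58, D3 = 0 and D4 = 2.11 are the subgroup-size-5 values used in the preceding X-bar R example, assumed to apply here because each day also has 5 observations.

```python
def xbar_r_limits(subgroups, a2, d3, d4):
    """Return subgroup means and ranges plus (LCL, CL, UCL) for both charts."""
    means = [sum(s) / len(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    grand_mean = sum(means) / len(means)   # mean of the subgroup means
    r_bar = sum(ranges) / len(ranges)      # mean of the subgroup ranges
    return {
        "means": means,
        "ranges": ranges,
        "xbar": (grand_mean - a2 * r_bar, grand_mean, grand_mean + a2 * r_bar),
        "r": (d3 * r_bar, r_bar, d4 * r_bar),
    }


# Delivery times for 7 days, 5 deliveries per day (from the table above)
deliveries = [
    [7.30, 4.20, 6.10, 3.45, 5.55],
    [4.60, 8.70, 7.60, 4.43, 7.62],
    [5.98, 2.92, 6.20, 4.20, 5.10],
    [7.20, 5.10, 5.19, 6.80, 4.21],
    [4.00, 4.50, 5.50, 1.89, 4.46],
    [10.10, 8.10, 6.50, 5.06, 6.94],
    [6.77, 5.08, 5.90, 6.90, 9.30],
]

# Assumed constants for subgroups of size 5
chart = xbar_r_limits(deliveries, a2=0.58, d3=0.0, d4=2.11)
x_lcl, x_cl, x_ucl = chart["xbar"]
r_lcl, r_cl, r_ucl = chart["r"]

# The process is judged in control if every point lies within its limits
in_control = (all(x_lcl <= m <= x_ucl for m in chart["means"])
              and all(r_lcl <= r <= r_ucl for r in chart["ranges"]))
print(round(x_lcl, 2), round(x_ucl, 2), round(r_ucl, 2), in_control)
```

Under these assumed constants every daily mean and range falls inside its limits, so this check suggests the delivery process is in control.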


# A manufacturer of chair wheels wishes to maintain the quality of the manufacturing process. Every 15 minutes, for a five-hour period, a wheel is selected and the diameter measured. Given are the diameters (in mm) of the wheels.

Hour   Diameter (mm)       Mean    Range
1      23   24   26   28   25.3    5
2      26   24   30   27   26.8    6
3      24   32   26   27   27.3    8
4      24   28   31   26   27.3    7
5      25   24   25   27   25.3    3
Average                    26.35   5.8

X-bar chart control limits:
UCLX = 26.35 + 0.729(5.8) = 30.58
LCLX = 26.35 − 0.729(5.8) = 22.12

R chart control limits:
UCLR = (2.282)(5.8) = 13.24
LCLR = (0)(5.8) = 0

# A restaurant is interested in detecting changes in the number of minutes from a party’s sitting down to getting the bill.

Sample   Quality variable   Mean    Range
1        23   28   21       24.0    7
2        33   29   30       30.7    4
3        25   27   25       25.7    2
4        28   30   29       29.0    2
5        29   28   28       28.3    1
6        23   24   28       25.0    5
Average                     27.1    3.5

X-bar chart control limits:
UCLX = 27.1 + 1.02(3.5) = 30.67
LCLX = 27.1 − 1.02(3.5) = 23.53

R chart control limits:
UCLR = (2.575)(3.5) = 9.0125
LCLR = (0)(3.5) = 0



Example p-chart

# 20 samples of 100 pairs of jeans

Sample   Defective   Proportion defective
1        6           0.06
2        0           0.00
3        4           0.04
...
20       18          0.18
Total    200

$$\bar{p} = \frac{\text{Total defectives}}{\text{Total sample observations}} = \frac{200}{20(100)} = 0.10$$

$$UCL_p = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.10 + 3\sqrt{\frac{0.10(1-0.10)}{100}} = 0.190$$

$$LCL_p = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.10 - 3\sqrt{\frac{0.10(1-0.10)}{100}} = 0.010$$

# A manufacturer of running shoes wants to establish control limits for the percent defective. Ten samples of 400 shoes revealed that the mean percent defective was 8.0%. Where should the manufacturer set the control limits?

$$UCL_p = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.08 + 3\sqrt{\frac{0.08(1-0.08)}{400}} = 0.121$$

$$LCL_p = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.08 - 3\sqrt{\frac{0.08(1-0.08)}{400}} = 0.039$$

# A restaurant is interested in detecting changes in the percentage of parties leaving less than a 10% tip.

Sample   Result of inspection   p
1        2 no, 38 yes           0.05
2        1 no, 39 yes           0.025
3        0 no, 40 yes           0.0
4        4 no, 36 yes           0.10
5        3 no, 37 yes           0.075
6        2 no, 38 yes           0.05

$$\bar{p} = \frac{12}{6(40)} = 0.05, \qquad \sigma_p = \sqrt{\frac{0.05 \times 0.95}{40}} = 0.034$$
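The p-chart calculations in these examples follow one pattern, which can be sketched as a small function. The name `p_chart_limits` is illustrative, and clamping the limits to the range 0 to 1 is an assumption added here because a proportion cannot fall outside that range. The restaurant data (the "no" counts out of 40 parties per sample) are used as input.

```python
import math


def p_chart_limits(p_bar, n, z=3.0):
    """Return (LCL, UCL) for a p-chart with average proportion p_bar
    and sample size n, using z-sigma limits."""
    sigma_p = math.sqrt(p_bar * (1.0 - p_bar) / n)
    lcl = max(0.0, p_bar - z * sigma_p)  # a proportion cannot be negative
    ucl = min(1.0, p_bar + z * sigma_p)  # nor exceed 1
    return lcl, ucl


# Restaurant example: number of "no" results in each sample of 40 parties
no_counts = [2, 1, 0, 4, 3, 2]
parties_per_sample = 40

p_bar = sum(no_counts) / (len(no_counts) * parties_per_sample)
lcl, ucl = p_chart_limits(p_bar, parties_per_sample)
print(p_bar, lcl, round(ucl, 3))
```

Calling `p_chart_limits(0.10, 100)` and `p_chart_limits(0.08, 400)` gives the limits for the jeans and running shoe examples in the same way.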



Example c-chart

# Count of defects in 15 rolls of denim fabric

Sample   Defects
1        12
2        8
3        16
...
15       15
Total    190

$$\bar{c} = \frac{\text{Total no. of defects}}{\text{Total no. of samples}} = \frac{190}{15} = 12.67, \qquad \sigma_c = \sqrt{\bar{c}}$$

$$UCL_c = \bar{c} + Z\sigma_c = \bar{c} + 3\sqrt{\bar{c}} = 12.67 + 3\sqrt{12.67} = 23.35$$

$$LCL_c = \bar{c} - Z\sigma_c = \bar{c} - 3\sqrt{\bar{c}} = 12.67 - 3\sqrt{12.67} = 1.99$$

# A manufacturer of computer circuit boards tested 10 boards after they were manufactured. The numbers of defects obtained per circuit board were 5, 3, 4, 0, 2, 2, 1, 4, 3 and 2. Construct the appropriate control limits.

$$\bar{c} = \frac{26}{10} = 2.6, \qquad \sigma_c = \sqrt{\bar{c}} = \sqrt{2.6}$$

$$UCL_c = \bar{c} + 3\sqrt{\bar{c}} = 2.6 + 3\sqrt{2.6} = 7.44$$

$$LCL_c = \bar{c} - 3\sqrt{\bar{c}} = 2.6 - 3\sqrt{2.6} = -2.24$$

Since a count of defects cannot be negative, the lower control limit is set to 0.

# A restaurant is interested in detecting changes in the number of parties per day that are larger than 6 people.

Day   No.
1     4
2     2
3     5
4     3
5     4
6     5

$$\bar{c} = \frac{23}{6} = 3.83, \qquad UCL_c = 3.83 + 3\sqrt{3.83} = 9.70, \qquad LCL_c = 3.83 - 3\sqrt{3.83} = -2.04$$

The lower limit is negative, so LCLc is set to 0.
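The c-chart limits can be sketched in the same style. The function name `c_chart_limits` is illustrative, and flooring the lower limit at zero is the convention assumed here, since a count of defects cannot be negative. The input is the circuit board data from the example.

```python
import math


def c_chart_limits(defect_counts, z=3.0):
    """Return (LCL, c_bar, UCL) for a c-chart of defects per unit."""
    c_bar = sum(defect_counts) / len(defect_counts)  # process average
    sigma_c = math.sqrt(c_bar)                       # standard deviation of a count
    lcl = max(0.0, c_bar - z * sigma_c)              # floor a negative limit at zero
    ucl = c_bar + z * sigma_c
    return lcl, c_bar, ucl


# Defects found on each of the 10 circuit boards
board_defects = [5, 3, 4, 0, 2, 2, 1, 4, 3, 2]

lcl, c_bar, ucl = c_chart_limits(board_defects)
out_of_control = [c for c in board_defects if not lcl <= c <= ucl]
print(lcl, c_bar, round(ucl, 2), out_of_control)
```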



Process capability:

Control limits: The limits on a control chart used to evaluate the variations in quality from subgroup to subgroup (not to be confused with specification limits).

Tolerance: The permissible variation in the size of a quality characteristic. The difference between the specifications is called the tolerance.

Process capability: The spread of the process. It is equal to six standard deviations when the process is in a state of statistical control.

Procedure for process capability:

1. Take 25 subgroups of size 5, for a total of 125 measurements.
2. Calculate the range, R, for each subgroup.
3. Calculate the average range, $\bar{R} = \sum R / 25$.
4. Calculate the estimate of the population standard deviation, $\sigma = \bar{R}/d_2$.
5. The process capability will equal 6σ.

Process capability ratio:

$$C_p = \frac{\text{Tolerance range}}{\text{Process range}} = \frac{\text{Upper specification} - \text{Lower specification}}{6\sigma}$$

Where Cp = capability index. 6σ = process capability.

Case-I. The capability index is 1.00: the tolerance just equals the process spread.

[Figure: process spread of 6σ exactly filling the 6σ interval between LSL and USL.]

$$C_p = \frac{USL - LSL}{6\sigma} = \frac{6\sigma}{6\sigma} = 1.00$$



Case-II. The capability index is greater than 1.00, which is desirable.

[Figure: process spread of 6σ inside an 8σ interval between LSL and USL.]

$$C_p = \frac{USL - LSL}{6\sigma} = \frac{8\sigma}{6\sigma} = 1.33$$

Case-III. The capability index is less than 1.00, which is undesirable.

[Figure: process spread of 6σ wider than a 4σ interval between LSL and USL.]

$$C_p = \frac{USL - LSL}{6\sigma} = \frac{4\sigma}{6\sigma} = 0.67$$

Process capability index:

$$C_{PK} = \min\left\{\frac{USL - \bar{\bar{X}}}{3\sigma},\ \frac{\bar{\bar{X}} - LSL}{3\sigma}\right\}$$

Interpretation of index values:

Case-I. If CPK = 1, the natural control limits and the customer specification are exactly equal. The process is just capable.
Case-II. If CPK > 1, the process is highly capable of meeting the customer specification.
Case-III. If CPK < 1, the process is not capable.

Note: If the process is not under control, then CPK has no meaning.
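The capability indices can be sketched as below. The function names are illustrative, the lower CPK term is taken as the mean minus the lower specification limit, and the numbers in the usage lines (USL = 16, LSL = 4, σ = 2, means of 10 and 11, and the five subgroup ranges) are hypothetical values chosen only to show a centred and an off-centre process.

```python
def sigma_from_ranges(ranges, d2):
    """Estimate the process standard deviation as R-bar / d2."""
    r_bar = sum(ranges) / len(ranges)
    return r_bar / d2


def cp(usl, lsl, sigma):
    """Capability index: tolerance divided by the 6-sigma process spread."""
    return (usl - lsl) / (6.0 * sigma)


def cpk(usl, lsl, mean, sigma):
    """Capability index that also accounts for how well the process is centred."""
    upper = (usl - mean) / (3.0 * sigma)
    lower = (mean - lsl) / (3.0 * sigma)
    return min(upper, lower)


# Hypothetical illustration values (not taken from the source)
cp_value = cp(usl=16.0, lsl=4.0, sigma=2.0)
cpk_centred = cpk(usl=16.0, lsl=4.0, mean=10.0, sigma=2.0)
cpk_shifted = cpk(usl=16.0, lsl=4.0, mean=11.0, sigma=2.0)
sigma_hat = sigma_from_ranges([4.0, 5.0, 6.0, 5.0, 3.26], d2=2.326)
print(cp_value, cpk_centred, round(cpk_shifted, 3), round(sigma_hat, 3))
```

With the same spread, moving the mean off centre leaves Cp unchanged but lowers CPK, which is why both indices are usually reported together.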



Calculation of process capability (CPK):

1. Take a lot of 25 pcs.
2. Measure the dimension of all the pcs, say X1, X2, X3, ..., X25.
3. Divide the measurements into samples of 5 pcs.
4. Find the range of each sample, i.e.
   X1, X2, ..., X5 gives R1
   X6, X7, ..., X10 gives R2
   X11, X12, ..., X15 gives R3
   X16, X17, ..., X20 gives R4
   X21, X22, ..., X25 gives R5
5. Calculate $\bar{R}$ as per the following formula:

$$\bar{R} = \frac{R_1 + R_2 + R_3 + R_4 + R_5}{5}$$

6. Calculate $\sigma = \bar{R}/d_2$, where d2 = 2.326 for samples of 5 pcs.
7. Cp = Tolerance / 6σ
8. Upper CPK = (Upper specification limit − X̄) / 3σ
   Lower CPK = (X̄ − Lower specification limit) / 3σ
9. Process capability (CPK) = the lower of the upper and lower CPK.

Process capability for qualitative characteristics:

$$C_{PK} = \frac{u(1-p)}{3}$$

Where p is the estimated share of nonconforming units and u is the quantile function of the normal distribution. This formula typically produces the same value for CPK as a normally distributed characteristic with the same fraction of nonconforming units (single-sided).

Tools and Techniques:

I. Statistical Process Control (SPC): Seven tools ⇒ Pareto diagram, cause and effect diagram, check sheets, process flow diagram, scatter diagram, histogram and control charts, and stratification.
II. Failure Mode and Effect Analysis (FMEA)
III. Quality Function Deployment (QFD)
IV. Measurement System Analysis


Statistical process control (SPC): SPC comprises seven tools: Pareto diagram, cause and effect diagram, check sheets, process flow diagram, scatter diagram, histogram and control charts, and stratification.

1. Pareto Diagram: Vilfredo Pareto (1848-1923) conducted extensive studies of the distribution of wealth in Europe. He found that there were a few people with a lot of money and many people with little money. This unequal distribution of wealth became an integral part of economic theory. Dr. Joseph Juran recognized this concept as a universal that could be applied to many fields. He coined the phrases “vital few” and “useful many”.

A Pareto diagram ranks data classifications in descending order from left to right. Construction of a Pareto diagram is very simple. There are six steps:

• Determine the method of classifying the data: by problem, cause, type of nonconformity, and so forth.
• Decide if dollars (best), weighted frequency, or frequency is to be used to rank the characteristics.
• Collect data for an appropriate time interval.
• Summarize the data and rank order the categories from largest to smallest.
• Compute the cumulative percentage if it is to be used.
• Construct the diagram and find the vital few.

[Figure: Pareto diagram of types of field failures, with categories F, C, A, E, B, D and O ranked from left to right.]

Pareto diagrams are used to identify the most important problems. Usually, 80% of the total results from 20% of the items. The Pareto diagram is a powerful quality-improvement tool. It is applicable to problem identification and measurement of progress.
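The ranking and cumulative-percentage steps of a Pareto analysis can be sketched as follows. The category letters follow the field-failure figure, but the counts are hypothetical, as are the function names `pareto_table` and `vital_few` and the 80% cut-off used to pick the vital few.

```python
def pareto_table(counts):
    """Rank categories from largest to smallest and add cumulative percent."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, cumulative = [], 0
    for category, count in ranked:
        cumulative += count
        rows.append((category, count, 100.0 * cumulative / total))
    return rows


def vital_few(rows, threshold=80.0):
    """Return the leading categories that together reach the threshold."""
    selected = []
    for category, _count, cumulative_pct in rows:
        selected.append(category)
        if cumulative_pct >= threshold:
            break
    return selected


# Hypothetical field-failure counts (illustrative only).
failures = {"A": 15, "B": 5, "C": 25, "D": 3, "E": 10, "F": 40, "O": 2}
rows = pareto_table(failures)
for category, count, cumulative_pct in rows:
    print(f"{category}: {count:3d}  cumulative {cumulative_pct:5.1f}%")
print("Vital few:", vital_few(rows))
```

With these assumed counts, three of the seven categories account for 80% of the failures, which is the pattern the diagram is meant to expose.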








2. Cause and effect diagram (Why-Why analysis): A cause and effect (C&E) diagram is a picture composed of lines and symbols designed to represent a meaningful relationship between an effect and its causes. It was developed by Dr. Kaoru Ishikawa in 1943 and is also called an Ishikawa diagram. C&E diagrams are used to investigate either a “bad” effect, in order to take action to correct its causes, or a “good” effect, in order to learn the causes responsible. The figure shows the C&E diagram with the effect on the right and the causes on the left. The effect is the quality characteristic that needs improvement. Causes are usually broken down into the major causes of man, machine, material, measurement, work method and environment. Management and maintenance are also sometimes used as major causes. Each major cause is further subdivided into numerous minor causes. For example, under work methods, we might have training, knowledge, ability, physical characteristics, and so forth. C&E diagrams are also called “fishbone diagrams” because of the shape of the complete structure.

The first step in the construction of a C&E diagram is for the project team to identify the effect or quality problem. It is placed on the right side of a large piece of paper by the team leader. Next, the major causes are identified and placed on the diagram. Determining all the minor causes requires brainstorming by the project team. Brainstorming is an idea-generating technique that is well suited to the C&E diagram. It uses the creative thinking capacity of the team.

[Figure: cause and effect diagram. Causes (man, machine, material, environment, work methods, measurement) on the left; effect (quality characteristic) on the right.]


3. Check sheets: The main purpose of a check sheet is to ensure that the data are collected carefully and accurately by operating personnel for process control and problem solving. Data should be presented in such a form that they can be quickly and easily used and analyzed.

Check sheet for paint nonconformities. The figure shows a check sheet for paint nonconformities for bicycles.

Product: XYZ                  Date: Jan. 21
Stage: Final inspection       Id: Paint
Number inspected: xxx         Inspector / operator: ABC

Nonconformity type      Total
Blister                 21
Light spray             38
Drips                   29
Overspray               11
Splatter                08
Run                     47
Others                  12
Total                   159
Number nonconforming    113

Check sheet for swimming pool (one column per day: Mon, Tue, Wed, Thu, Fri, Sat, Sun).

Hot Tub
• Chemical test (add if needed): pH/Chlorine
• Temperature
• Add water (if needed)
• Clean deck around hot tub
Sample entries: 7.4, 810°

Pool
• Chemical test (add if needed)
• Add water (if needed)
• Check temperature
• Vacuum pool (if needed)
• Filter backwash (20 lb.)
• Lint filter
• Sweep and hose off deck
Sample entries: 7.6, 300, 780, √ √ √

General Cleaning
• Vacuum carpets
• Vacuum and sweep building B
• Clean tables
• Sweep and mop wooden deck
• Clean outside deck, bring in chairs
• Take out trash
• Empty building B trash cans
• Wash windows
Sample entries: √ √ √ √ √ √ √

Bathrooms
• Scrub sinks, toilets and showers
• Sweep and mop floors
• Empty trash and check lockers
• Cover hot tub (at end of the night)
• Check pool filter, be sure it is on
Sample entries: √ √ √ √ √

D = daily, A = as needed. List any and all deviations from this work schedule on the reverse side, date it and initial it.


4. Process flow diagram: It is a schematic diagram that shows the flow of the product or service as it moves through the various processing operations. The diagram makes it easy to visualize the entire system, identify potential trouble spots, and locate control activities. Many standard symbols are used by engineers and scientists. The common symbols and their significance are given below:

An ellipse: the start or the end of the process.
A rectangle: a step or a task in the process.
A diamond: a decision point.
An arrow: the direction of flow from one activity to the next one in a sequence.

The diagram shows who is the next customer in the process, thereby increasing the understanding of the process. Flow diagrams are best constructed by a team, because it is rare for one individual to understand the entire process. Improvements to the process can be accomplished by eliminating steps, combining steps, or making frequently occurring steps more efficient.


[Figure: process flow diagram for recruitment of a supervisor. Boxes: sort applications and short-list for interview; first interview to select the best five candidates; second interview to select; final interview and medical check-up; decision “Candidate approved?” (Yes / No); negotiate terms and prepare offer letter; make offer after receiving approval; call the next candidate.]


5. Scatter Diagram: A scatter diagram is a tool to study the cause and effect relationship between two variables. The figure shows the relationship between automotive speed and gas mileage: as speed increases, gas mileage decreases. Automotive speed is plotted on the x-axis and is the independent variable. The independent variable is usually controllable. Gas mileage is on the y-axis and is the dependent, or response, variable. The relationship or correlation between the two variables can be evaluated.

The figure shows different patterns and their interpretation. At (a), we have a positive correlation between the two variables because as x increases, y increases. At (b), there is a negative correlation between the two variables because as x increases, y decreases. At (c), there is no correlation, and this pattern is sometimes referred to as a shotgun pattern. At (d), there may or may not be a relationship between the two variables. There appears to be a negative relationship between x and y, but it is not too strong. Further statistical analysis is needed to evaluate this pattern. At (e), we have stratified the data to represent different causes for the same effect. One cause is plotted with a small solid circle, and the other cause is plotted with a solid triangle. When the data are separated, we see that there is a strong correlation. At (f), we have a curvilinear relationship rather than a linear one.



FAILURE MODE AND EFFECT ANALYSIS (FMEA): Failure mode and effect analysis (FMEA) is an analytical technique (a paper test) that combines the technology and experience of people in identifying foreseeable failure modes of a product, service, or process and planning for their elimination. FMEA can be explained as a group of activities intended to:

• Recognize and evaluate the potential failure of a product, service, or process and its effects.
• Identify actions that could eliminate or reduce the chance of the potential failure occurring.
• Document the process.

FMEA is a “before-the-event” action requiring a team effort, so that changes in design and production can be made most easily and inexpensively. There are two types of FMEA: design FMEA and process FMEA.

FMEA Principle. The FMEA is a formal and systematic method to analyze and eliminate potential failure causes in the design and manufacturing phase. FMEA should be applied as early as possible in the design process and definitely before starting the manufacturing process. The FMEA is a relatively simple multi-step process consisting of the following tasks:

1. List all reasonably possible failures, deficiencies, omissions, unintended influences, etc. systematically.
2. Evaluate their effects and potential impact on the product, process or customers.
3. Classify the severity or importance of the effect.
4. Identify causes of the potential failures, etc.
5. Estimate the probability of occurrence of the failure, etc.
6. Perform an evaluation of the product specification and/or process monitoring with regard to failure detection and avoidance.
7. Evaluate the probability of the failure detection.
8. Calculate the risk priority figure (RPF).
9. Based on these results, define sound proposals relative to design, manufacturing and/or inspection and testing.
10. Assign the responsibilities, targets and completion dates for those changes.
11. Re-iterate the FMEA based on the changes and calculate the new risk priority figure.

Risk Priority Figure (RPF): The result of the FMEA is the risk priority figure. It is calculated from three factors according to the following formula:

RPF = Severity of Failure × Probability of Failure Occurrence × Probability of Failure Detection

The value points of those three factors are contained in the following tables.




RATING  DEGREE OF SEVERITY
1. Customer will not notice the adverse effect or it is insignificant.
2. Customer will probably experience slight annoyance.
3. Customer will experience annoyance due to the slight degradation of performance.
4. Customer dissatisfaction due to reduced performance.
5. Customer is made uncomfortable or their productivity is reduced by the continued degradation of the effect.
6. Warranty repair or significant manufacturing or assembly complaint.
7. High degree of customer dissatisfaction due to component failure without complete loss of function. Productivity impacted by scrap or rework levels.
8. Very high degree of dissatisfaction due to the loss of function without a negative impact on safety or governmental regulation.
9. Customer endangered due to the adverse effect on safe system performance with warning before failure, or violation of governmental regulation.
10. Customer endangered due to the adverse effect on safe system performance without warning before failure, or violation of governmental regulation.

NUMERICAL RANKING (likelihood of occurrence)
1. 1 in 10^6 (CPK > 1.67)
2. 1 in 20,000 (CPK = 1.33)
3. 1 in 5,000 (CPK ~ 1.00)
4. 1 in 2,000 (CPK < 1.00)
5. 1 in 500
6. 1 in 100
7. 1 in 50
8. 1 in 20
9. 1 in 10
10. 1 in 2



DETECTION RATINGS, KNOWN CAPABILITY:

Numeric ranking   Occurrence likelihood        Detection certainty
1                 1 in 10^6 (CPK > 1.67)       100%
2                 1 in 20,000 (CPK = 1.33)     99%
3                 1 in 5,000 (CPK ~ 1.00)      95%
4                 1 in 2,000 (CPK < 1.00)      90%
5                 1 in 500                     85%
6                 1 in 100                     80%
7                 1 in 50                      70%
8                 1 in 20                      60%
9                 1 in 10                      50%
10                1 in 2                       < 50%

1. Sure that the potential failure will be found or prevented before reaching the next customer.
2. Almost certain that the potential failure will be found or prevented before reaching the next customer.
3. Low likelihood that the potential failure will reach the next customer undetected.
4. Controls may detect or prevent the potential failure from reaching the next customer.
5. Moderate likelihood that the potential failure will reach the next customer.
6. Controls are unlikely to detect or prevent the potential failure from reaching the next customer.
7. Poor likelihood that the potential failure will be detected or prevented before reaching the next customer.
8. Very poor likelihood that the potential failure will be detected or prevented before reaching the next customer.
9. Current controls probably will not even detect the potential failure.
10. Absolute certainty that the current controls will not detect the potential failure.
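With severity, occurrence and detection each rated on the 1 to 10 scales above, the risk priority figure is a simple product that can be used to rank failure modes. The sketch below assumes integer ratings; the function name `risk_priority_figure` and the three failure modes with their ratings are hypothetical examples, not data from this training material.

```python
def risk_priority_figure(severity, occurrence, detection):
    """RPF = severity x occurrence x detection, each rated 1 to 10."""
    ratings = {"severity": severity,
               "occurrence": occurrence,
               "detection": detection}
    for name, rating in ratings.items():
        if not 1 <= rating <= 10:
            raise ValueError(f"{name} rating must be between 1 and 10")
    return severity * occurrence * detection


# Hypothetical failure modes: (severity, occurrence, detection).
failure_modes = {
    "seal leaks": (8, 3, 5),
    "label misprinted": (2, 6, 2),
    "bolt under-torqued": (9, 2, 7),
}

# Highest RPF first, so the team knows where to act first.
ranked = sorted(failure_modes,
                key=lambda mode: risk_priority_figure(*failure_modes[mode]),
                reverse=True)
for mode in ranked:
    print(mode, risk_priority_figure(*failure_modes[mode]))
```

After corrective actions are defined (step 11 of the FMEA tasks), the same function can be re-run with the revised ratings to obtain the new risk priority figure.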

Quality function deployment (QFD): QFD is a system that identifies and sets the priorities for product, service and process improvement opportunities that lead to increased customer satisfaction. It ensures the accurate deployment of the “voice of the customer” throughout the organization, from product planning to field service. The QFD process answers the following questions:

1. What do customers want?
2. Are all wants equally important?
3. Will delivering perceived needs yield a competitive advantage?
4. How can we change the product, service or process?
5. How does an engineering decision affect customer perception?
6. How does an engineering change affect other technical descriptors?
7. What is the relationship to parts development, process planning and production planning?

QFD reduces start-up costs, reduces engineering design changes and, most important, leads to increased customer satisfaction.


Measurement system analysis (MSA): SPC requires accurate and precise data; however, all data have measurement errors. Thus, an observed value has two components:

Observed value = True value + Measurement error

Variation likewise arises from both the process and the measurement, thus

Total variation = Product variation + Measurement variation

Measurement variation is divided into repeatability and reproducibility.
Repeatability: due to equipment variation.
Reproducibility: due to appraiser (inspector) variation.
Together they are called gage repeatability and reproducibility (GR&R).

Data Collection: The number of parts, appraisers, or trials can vary, but 10 parts, two or three appraisers, and two or three trials are considered optimum.

Calculations: While the order of taking measurements is random, the calculations are performed by part and appraiser. Calculations are as follows.
1. The average and range are calculated for each part by each appraiser.
2. The values in step 1 are averaged to obtain $\bar{R}_a, \bar{R}_b, \bar{R}_c, \bar{\bar{X}}_a, \bar{\bar{X}}_b, \bar{\bar{X}}_c$.
3. The values in step 2 are used to obtain $\bar{\bar{R}}$ and $\bar{\bar{X}}_{Diff}$, where $\bar{\bar{X}}_{Diff} = \bar{\bar{X}}_{Max} - \bar{\bar{X}}_{Min}$.
4. The UCL and LCL for the range are determined:
   $UCL_R = D_4\bar{\bar{R}}$, $LCL_R = D_3\bar{\bar{R}}$
   where D3 and D4 are obtained from the table for subgroup sizes of 2 or 3. Any range value that is out of control should be discarded and the above calculations repeated where appropriate, or the readings should be retaken for that appraiser and part and the above calculations repeated where appropriate.
5. Determine $\bar{X}$ for each part, and from this information calculate the range of the part averages:
   $R_p = \bar{X}_{Max} - \bar{X}_{Min}$

Analysis of Results
1. Repeatability
   $EV = k_1\bar{\bar{R}}$
   where EV = equipment variation (repeatability), and k1 = 4.56 for 2 trials and 3.05 for 3 trials.



2. Reproducibility
   $AV = \sqrt{(k_2\bar{\bar{X}}_{Diff})^2 - EV^2/(nr)}$
   where AV = appraiser variation (reproducibility), k2 = 3.65 for 2 appraisers and 2.70 for 3 appraisers, n = number of parts, r = number of trials. If a negative value occurs under the square root sign, the AV value defaults to zero.
3. Repeatability and reproducibility
   $R\&R = \sqrt{EV^2 + AV^2}$
   where R&R = repeatability and reproducibility.
4. Part variation
   $PV = jR_p$
   where PV = part variation, Rp = range of the part averages, and j depends on the number of parts.

Parts   2     3     4     5     6     7     8     9     10
j       3.65  2.70  2.30  2.08  1.93  1.82  1.74  1.67  1.62

5. Total variation
   $TV = \sqrt{(R\&R)^2 + PV^2}$
   where TV = total variation.

The percent of total variation is calculated using the equations below.
%EV = 100 (EV/TV)
%AV = 100 (AV/TV)
%R&R = 100 (R&R/TV)
%PV = 100 (PV/TV)

Evaluation
If repeatability is large compared to reproducibility, the reasons may be:
1. The gage needs maintenance.
2. The gage should be designed to be more rigid.
3. The clamping or location for gauging needs to be improved.
4. There is excessive within-part variation.

If reproducibility is large compared to repeatability, the reasons may be:
1. The operator needs to be better trained in how to use and read the gage.
2. Calibrations on the gage are not legible.
3. A fixture may be needed to help the operator use the gage consistently.

Guidelines for acceptance of GR&R (%R&R) are:
Under 10% error: gage system is satisfactory.
10% to 30% error: may be acceptable based upon importance of application, cost of gage, cost of repairs, etc.
Over 30% error: gage system is not satisfactory. Identify the causes and take corrective action.
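The repeatability and reproducibility calculation above can be sketched in Python. The constants k1, k2 and j are the ones quoted in the text, and the readings are the five-part, two-appraiser, three-trial data from the worked example that appears later in this section. The function name `gage_rr` and the layout of the `readings` dictionary are illustrative assumptions. The code carries full precision, whereas the hand calculation rounds intermediate values, so the percentages differ slightly from the hand-worked ones.

```python
from math import sqrt
from statistics import mean

# Constants quoted in the text.
K1 = {2: 4.56, 3: 3.05}   # indexed by number of trials
K2 = {2: 3.65, 3: 2.70}   # indexed by number of appraisers
J = {2: 3.65, 3: 2.70, 4: 2.30, 5: 2.08, 6: 1.93,
     7: 1.82, 8: 1.74, 9: 1.67, 10: 1.62}   # indexed by number of parts


def gage_rr(data):
    """data maps appraiser -> list of parts -> list of trial readings."""
    appraisers = list(data)
    n_parts = len(data[appraisers[0]])
    n_trials = len(data[appraisers[0]][0])

    # Average reading and average range for each appraiser.
    app_means = [mean(mean(part) for part in data[a]) for a in appraisers]
    app_ranges = [mean(max(part) - min(part) for part in data[a])
                  for a in appraisers]
    r_bar = mean(app_ranges)
    x_diff = max(app_means) - min(app_means)

    ev = K1[n_trials] * r_bar
    av_squared = ((K2[len(appraisers)] * x_diff) ** 2
                  - ev ** 2 / (n_parts * n_trials))
    av = sqrt(av_squared) if av_squared > 0 else 0.0  # negative -> zero
    rr = sqrt(ev ** 2 + av ** 2)

    # Part averages across all appraisers give the part range Rp.
    part_means = [mean(mean(data[a][p]) for a in appraisers)
                  for p in range(n_parts)]
    pv = J[n_parts] * (max(part_means) - min(part_means))
    tv = sqrt(rr ** 2 + pv ** 2)

    return {"EV": ev, "AV": av, "RR": rr, "PV": pv, "TV": tv,
            "pct_EV": 100 * ev / tv, "pct_AV": 100 * av / tv,
            "pct_RR": 100 * rr / tv, "pct_PV": 100 * pv / tv}


readings = {
    "A": [[0.34, 0.42, 0.38], [0.50, 0.56, 0.48], [0.42, 0.46, 0.40],
          [0.44, 0.48, 0.38], [0.26, 0.30, 0.28]],
    "B": [[0.28, 0.32, 0.24], [0.54, 0.48, 0.44], [0.38, 0.42, 0.34],
          [0.46, 0.44, 0.40], [0.30, 0.28, 0.36]],
}
result = gage_rr(readings)
print({name: round(value, 2) for name, value in result.items()})
```

For this data set %R&R comes out well above 30%, so the gage system falls in the "not satisfactory" band of the acceptance guidelines.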



Example: A log of length specification 7.0 ± 2.5 is cut from bigger logs. Data collected are as follows:

          A-Inspector                          B-Inspector
Sample    T1    T2    T3    Avg.   Range       T1    T2    T3    Avg.   Range
1         7.3   7.2   7.2   7.23   0.1         7.0   6.9   7.2   7.03   0.3
2         6.8   6.9   7.1   6.93   0.3         7.1   7.1   6.9   7.03   0.2
3         7.2   7.2   7.0   7.13   0.2         7.0   7.1   7.0   7.03   0.1
4         7.1   7.3   7.1   7.17   0.2         7.0   7.0   7.1   7.03   0.1
5         6.8   6.9   7.1   6.93   0.3         6.7   6.9   6.9   6.83   0.2

$\bar{\bar{X}}_A = 7.08$, $\bar{\bar{X}}_B = 6.99$, $\bar{\bar{X}}_{Diff} = \bar{\bar{X}}_{Max} - \bar{\bar{X}}_{Min} = 0.09$
$\bar{R}_A = 0.22$, $\bar{R}_B = 0.18$, $\bar{\bar{R}} = 0.20$, $UCL_R = D_4\bar{\bar{R}} = 0.51$

Equipment variation: $EV = k_1\bar{\bar{R}} = 3.05 \times 0.20 = 0.61$
Appraiser variation: $AV = \sqrt{(k_2\bar{\bar{X}}_{Diff})^2 - EV^2/(nr)} = \sqrt{(3.65 \times 0.09)^2 - 0.61^2/(5 \times 3)} = 0.29$
where n = number of parts (samples) and r = number of trials.

Trials      2      3
k1          4.56   3.05

Observers   2      3
k2          3.65   2.70

Total R&R $= \sqrt{EV^2 + AV^2} = \sqrt{(0.61)^2 + (0.29)^2} = 0.68$

EV% = EV × 100 / Total tolerance = 0.61 × 100 / 5 = 12.2%
AV% = AV × 100 / Total tolerance = 0.29 × 100 / 5 = 5.8%
R&R% = R&R × 100 / Total tolerance = 0.68 × 100 / 5 = 13.6%

Equipment variation is more than 10%. Thus, on the basis of R&R, we calibrate or replace the gage.



Linearity:

          Reference value
Trial     2.00    4.00    6.00
1         2.10    4.00    5.7
2         2.5     4.1     5.6
3         2.8     4.1     7.2
4         3.0     4.1     7.8
5         1.8     4.5     6.2
6         1.9     3.8     6.2
7         3.2     3.8     6.5
8         1.7     4.0     5.2
9         2.5     3.5     5.5
10        1.5     3.7     6.0
Range     1.7     1.0     2.6

To fit a straight line Y = A + BX, the normal equations are:
$\sum Y = nA + B\sum X$
$\sum XY = A\sum X + B\sum X^2$

5.3 = 10A + 12B 23 = 0.4 & A = 0.05 B = 0.05 = bias. A = 0.4 = linearity Bias X Yi = A + BXi + Error 10

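Minimizing the sum of squared errors $L = \sum_{i=1}^{n}(Y_i - A - BX_i)^2$ with respect to A and B gives the normal equations:

$\dfrac{\partial L}{\partial A} = -2\sum_{i=1}^{n}(Y_i - A - BX_i) = 0 \;\Rightarrow\; \sum Y_i = nA + B\sum X_i$

$\dfrac{\partial L}{\partial B} = -2\sum_{i=1}^{n}(Y_i - A - BX_i)X_i = 0 \;\Rightarrow\; \sum X_iY_i = A\sum X_i + B\sum X_i^2$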
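The two normal equations form a 2 × 2 linear system in A and B, which can be solved directly. The sketch below is a minimal illustration: the function name `fit_line` and the four data points are hypothetical, chosen to lie exactly on Y = 1 + 2X so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Solve the normal equations for Y = A + B*X.

    sum(Y)  = n*A      + B*sum(X)
    sum(XY) = A*sum(X) + B*sum(X^2)
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    determinant = n * sum_xx - sum_x ** 2
    if determinant == 0:
        raise ValueError("all X values are identical; slope is undefined")

    slope = (n * sum_xy - sum_x * sum_y) / determinant   # B
    intercept = (sum_y - slope * sum_x) / n              # A
    return intercept, slope


# Hypothetical points lying exactly on Y = 1 + 2X.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"A = {a:.3f}, B = {b:.3f}")
```

In a linearity study the same routine would be applied with the reference value as X and the measurement bias as Y, so that the intercept and slope describe how the bias changes across the measuring range.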



Reading   Jan     Feb     Mar     Apr     May
1         7.0     9.0     10.0    11.0    9.0
2         8.5     10.0    10.5    11.5    6.5
3         9.0     11.0    9.5     8.2     6.2
4         6.5     10.5    10.5    7.5     9.2
5         5.0     10.4    11.0    6.9     8.5
6         8.7     9.5     11.5    6.5     8.2
7         10.0    9.8     10.5    6.0     7.0
8         10.5    10.2    10.2    7.0     7.0
9         8.0     10.1    10.5    7.5     7.0
10        7.5     10.0    10.7    7.0     6.2
Avg.      8.07    10.05   10.49   7.91    7.48
Range     5.5     2.0     2.0     5.50    2.8
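The monthly averages and ranges in the last two rows of this table are the inputs to the X-bar and R control limits. The sketch below uses those summary rows as printed, with the chart constants for subgroups of ten (A2 = 0.308, D3 = 0.223, D4 = 1.777) that the calculation in this section also uses. The function name `control_limits` and the out-of-control check are illustrative assumptions.

```python
from statistics import mean

# Chart constants for subgroups of 10 readings.
A2, D3, D4 = 0.308, 0.223, 1.777

months = ["Jan", "Feb", "Mar", "Apr", "May"]
averages = [8.07, 10.05, 10.49, 7.91, 7.48]   # "Avg." row of the table
ranges = [5.5, 2.0, 2.0, 5.5, 2.8]            # "Range" row of the table


def control_limits(averages, ranges, a2=A2, d3=D3, d4=D4):
    """Return (LCL, UCL) pairs for the X-bar chart and the R chart."""
    x_bar_bar = mean(averages)   # grand average
    r_bar = mean(ranges)         # average range
    return {
        "x_chart": (x_bar_bar - a2 * r_bar, x_bar_bar + a2 * r_bar),
        "r_chart": (d3 * r_bar, d4 * r_bar),
    }


limits = control_limits(averages, ranges)
lcl_x, ucl_x = limits["x_chart"]

# Months whose average falls outside the X-bar limits.
out_of_control = [month for month, avg in zip(months, averages)
                  if not lcl_x <= avg <= ucl_x]
print(limits)
print("Averages outside the limits:", out_of_control)
```

Several monthly averages fall outside the X-bar limits while the ranges stay inside theirs, which points to shifts in the process setting rather than a change in spread.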


$\bar{R} = 17.8/5 = 3.56$
$UCL_R = D_4\bar{R} = 1.777 \times 3.56 = 6.32612$
$LCL_R = D_3\bar{R} = 0.223 \times 3.56 = 0.79388$
$\bar{\bar{X}} = 8.8$
$UCL_{\bar{X}} = \bar{\bar{X}} + A_2\bar{R} = 8.8 + 0.308 \times 3.56 = 9.89648$
$LCL_{\bar{X}} = \bar{\bar{X}} - A_2\bar{R} = 8.8 - 0.308 \times 3.56 = 7.70352$

These calculations show that the data are not stable with respect to the setting of the process, that is, the setting of the machine.

Part Number     1      2      3      4      5
Appraiser A
Trial 1         0.34   0.50   0.42   0.44   0.26
Trial 2         0.42   0.56   0.46   0.48   0.30
Trial 3         0.38   0.48   0.40   0.38   0.28
X-bar           0.38   0.51   0.43   0.43   0.28
R               0.08   0.08   0.06   0.10   0.04
Appraiser B
Trial 1         0.28   0.54   0.38   0.46   0.30
Trial 2         0.32   0.48   0.42   0.44   0.28
Trial 3         0.24   0.44   0.34   0.40   0.36
X-bar           0.28   0.49   0.38   0.43   0.31
R               0.08   0.10   0.08   0.06   0.08

$\bar{\bar{X}}_a = (0.38 + 0.51 + 0.43 + 0.43 + 0.28)/5 = 0.41$
$\bar{\bar{X}}_b = (0.28 + 0.49 + 0.38 + 0.43 + 0.31)/5 = 0.38$
$\bar{R}_a = (0.08 + 0.08 + 0.06 + 0.10 + 0.04)/5 = 0.07$
$\bar{R}_b = (0.08 + 0.10 + 0.08 + 0.06 + 0.08)/5 = 0.08$
$\bar{\bar{X}}_{Diff} = 0.41 - 0.38 = 0.03$
$\bar{\bar{R}} = (0.07 + 0.08)/2 = 0.08$
$UCL_R = 2.574 \times 0.08 = 0.21$, $LCL_R = 0$



None of the range values are out of control.

$\bar{X}_1 = (0.38 + 0.28)/2 = 0.33$
$\bar{X}_2 = (0.51 + 0.49)/2 = 0.50$
$\bar{X}_3 = (0.43 + 0.38)/2 = 0.41$
$\bar{X}_4 = (0.43 + 0.43)/2 = 0.43$
$\bar{X}_5 = (0.28 + 0.31)/2 = 0.30$
$R_p = 0.50 - 0.30 = 0.20$
$EV = 3.05 \times 0.08 = 0.24$
$AV = \sqrt{(3.65 \times 0.03)^2 - 0.24^2/(5 \times 3)} = 0.09$
$R\&R = \sqrt{(0.24)^2 + (0.09)^2} = 0.26$
$PV = 2.08 \times 0.20 = 0.42$
$TV = \sqrt{(0.26)^2 + (0.42)^2} = 0.49$
%EV = 49%, %AV = 18%, %R&R = 53%, %PV = 86%

The gage system is not satisfactory. The equipment variation (repeatability) is quite large in relation to the appraiser variation (reproducibility).

Regression analysis

Relationship among variables. In scientific research and industrial problem solving, a situation is often encountered in which a number of variables are involved, with possible interactions or relationships among themselves. Regression analysis is a statistical technique for investigating and modeling the functional relationship among these variables in such situations. As an example, consider the family income and age at marriage of the girl. One may be interested to find out whether they are related and, if so, what the form of the relationship is. The relationship may be expressed in the form of an equation or model connecting one of the variables, known as the response or dependent variable, with one or more other variables, known as explanatory, predictor or independent variables.

Applications of regression analysis are numerous and occur in almost every field, including engineering, quality control, physical sciences, economics, management, life and biological sciences, social sciences, etc.

The simplest case of regression analysis is the one where there are only two variables, one dependent variable and one independent variable, and the relationship between them is linear. This is known as simple linear regression. When there is more than one independent variable and the relationship considered is linear, we have what is known as multiple regression. When the relationship is not linear we may have to consider a nonlinear model such as a polynomial regression model, a multiplicative model, etc.

Regression analysis may be carried out for various purposes, such as to summarize or describe data in a multiple-variable set, to determine the levels of the process parameters which optimize the yield or any other response of interest, and for prediction and estimation.

Steps in Regression Analysis. Regression analysis includes the following steps:
1. Statement of the problem.
2. Selection of potentially relevant variables.
3. Data collection.
4. Graphical representation of the data (scatter plot).
5. Model specification.
6. Choice of fitting method.
7. Model fitting and calculation of indices such as the correlation coefficient.
8. Model validation and criticism.
9. Using the chosen model(s) for the solution of the posed problem.

The variables can be either quantitative or qualitative. Examples of quantitative variables are measurable variables like hardness, tensile strength, height, age at birth of the first child, etc. Examples of qualitative variables are good / bad, defective / non-defective, religion, sex, region, etc.

Graphical representation of the data: If there is only one predictor variable, then the data can be plotted as a scatter diagram to get an idea about the type of relationship, especially about the linearity of the relationship. This kind of graphical representation of the data will help to form ideas about the appropriate model to be chosen.

Hardness (X) and tensile strength (Y) of 20 specimens of annealed steel.

S.No.   X     Y         S.No.   X     Y
1       144   70.00     11      163   81.10
2       171   85.15     12      150   71.10
3       164   83.50     13      175   85.40
4       155   72.90     14      166   78.84
5       180   85.00     15      158   80.80
6       167   77.25     16      168   80.60
7       165   83.60     17      160   79.85
8       169   82.25     18      188   93.15
9       150   76.35     19      171   79.60
10      155   76.20     20      179   81.65



The scatter plot can indicate that:
1. There is a linear relationship between X and Y, where Y increases with X.
2. There is a linear relationship between X and Y, where Y decreases with X.
3. There is no relationship between X and Y.
4. X and Y are related but the relationship between them is non-linear.

A regression equation containing only one predictor variable is called a simple regression equation, whereas if there is more than one predictor variable the equation is known as a multiple regression equation. Often the actual relationship may be non-linear over the wider range of the predictor variables, but it can be considered linear in the range of the predictor variables we are interested in.

Method of fitting: After the model has been defined and the data have been collected, the next task is to estimate the parameters of the model, which is known as parameter estimation or model fitting. The most commonly used method of estimation is the least squares method. Others are the maximum likelihood method, the ridge method and the principal component method.

Simple Linear Regression: In simple linear regression we have only one independent variable and one dependent variable. Let these be denoted by X and Y respectively. Further, the relationship is assumed to be linear. Thus, the relationship can be expressed as a linear equation of the form

$y = a + bx + \varepsilon$

where a and b are unknown constants and ε is a random error component. The parameter a is the intercept of the regression line and b is the slope of the line. The parameters a and b are usually called regression coefficients. The errors are assumed to have mean zero and unknown variance σ². Additionally, we usually assume that the errors are uncorrelated. This means that the value of one error does not depend on the value of any other error. It is convenient to view the regressor X as controlled by the data analyst and measured with negligible error, while the response Y is a random variable. That is, there is a probability distribution (usually normal) for Y at each possible value of X.

Correlation coefficient: The correlation coefficient, denoted by r (or $r_{xy}$), measures the degree of linear association between two variables. It is calculated as:

$r = \dfrac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$

where

$S_{xy} = \sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})$ = (n − 1) times the covariance between X and Y

$S_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2$ = (n − 1) times the variance of X

$S_{yy} = \sum_{i=1}^{n}(y_i - \bar{y})^2$ = (n − 1) times the variance of Y
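The correlation coefficient can be computed directly from the three sums defined above. In the sketch below the function name `correlation` is an illustrative choice, and the speed and mileage figures are hypothetical values in the spirit of the scatter diagram example (mileage falling as speed rises), not data from this material.

```python
from math import sqrt
from statistics import mean


def correlation(xs, ys):
    """r = Sxy / sqrt(Sxx * Syy), using sums of deviations from the means."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data: automotive speed versus gas mileage.
speed = [30, 40, 50, 60, 70]
mileage = [38, 35, 32, 28, 24]
r = correlation(speed, mileage)
print(f"r = {r:.4f}")
```

A value of r close to −1, as here, corresponds to the strong negative pattern of the scatter diagram; values near zero correspond to the shotgun pattern.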



Fitting the best line: Least squares

For fitting the best line through the points (x1, y1), (x2, y2), …, (xn, yn), the least squares method is adopted, in which the sum of the squared deviations of the points from the fitted line is minimized. That is,

Minimise $S = \sum_{i=1}^{n}\varepsilon_i^2 = \sum_{i=1}^{n}(y_i - a - bx_i)^2$

To minimize the above, we differentiate with respect to a and b and equate to 0, obtaining two equations known as the normal equations. On solving these two equations, the values of a and b are obtained as:

$a = \bar{y} - b\bar{x}$ and $b = S_{xy}/S_{xx}$

where $S_{xy}$ and $S_{xx}$ are as defined earlier. The equation so established, known as the regression of y on x, can be used for predicting y for given values of x. However, this equation cannot be used for predicting x for a given value of y. We can use the same data to fit a regression of x on y, which can be used for prediction of x for given values of y. When r = ±1, the regression of y on x can also be used for prediction of x for given values of y.

Example: For the data given in the above table, find the least squares estimates of the regression parameters a and b.

We have $\bar{x} = 165.90$, $\bar{y} = 80.2125$, $S_{yy} = 554.4744$, $S_{xx} = 2717.0197$, $S_{xy} = 1054.2041$.

$b = \dfrac{S_{xy}}{S_{xx}} = \dfrac{1054.2041}{2717.0197} = 0.388$

$a = \bar{y} - b\bar{x} = 80.2125 - 0.388 \times 165.9 = 15.84$

Multiple Regression: There are situations when one dependent variable may be related to more than one independent variable. In such cases, we try to develop a model or equation relating the dependent variable to the independent variables. Such regression models are known as multiple regression models. If y is the dependent variable and x1, x2 and x3 are the independent variables, then the linear regression equation fitted may be of the form

$y = a + b_1x_1 + b_2x_2 + b_3x_3 + e$