Tuesday, October 13, 2009

Six Sigma FAQs....

Q. Would appreciate it if you could explain the key differences between the Response Optimizer and the Contour Plot, and cite examples where each works well in isolation.....


Ans 1 : Both tools - the Contour Plot and the Response Optimizer - are used to visualize and analyze the response surface.

While both are used extensively in experimentation to find the optimum inputs for a desired or maximum output, the key differences between the two are listed below:

1. A Response Optimizer helps identify the factor/input settings that optimize a single response or a set of responses, whereas a Contour Plot shows only two factors/inputs at a time, holding the other factors at fixed levels.

2. A Contour Plot helps you identify an optimum point on the response surface, whereas a Response Optimizer allows you to set a target for the desired output and then displays the optimum solution.

3. A Contour Plot doesn't give you the option of setting a target output with your own lower and upper limits, or of treating the output as something to be maximized or minimized - both of which are possible with a Response Optimizer.

4. Moreover, a Response Optimizer shows you local and global solutions, which is not an option in a Contour Plot. A 'local' solution is the best solution found from a given starting point, whereas the 'global' solution is the best of all the proposed local solutions. If you only want a good response quickly, a local solution works, but the global solution compares all the candidate solutions before returning the most optimum response and is therefore usually the one you want.

For example: in the hotel industry, suppose you wish to analyze the response/outcome 'Number of Customer Visits' with 'Ambience', 'Service' and 'Offers' as factors, and you want to identify the critical factors that impact the number of customer visits at the hotel. You should then do the following:

1. Run a DOE to shortlist the significant factors impacting the outcome, and work with the resulting equation to identify the best combination of inputs for the desired output.

2. Run a Contour Plot to identify the optimum combination of factors for best outcome.

3. Run a Response Optimizer to optimize the response, setting the limits defined by your organisation.

From the above you will be able to conclude which significant areas need more attention and focus in order to increase customer visits.


(Thanks to Tina Arora from LinkedIn)
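As an illustration of what a contour plot shows, here is a minimal sketch in Python (matplotlib assumed; the fitted coefficients for the hotel example are fabricated purely for illustration):

```python
# Contour plot of a hypothetical fitted two-factor response surface.
import numpy as np
import matplotlib.pyplot as plt

ambience, service = np.meshgrid(np.linspace(0, 10, 100),
                                np.linspace(0, 10, 100))
# Made-up fitted model for 'customer visits' (in practice, from a DOE):
visits = 5 + 2.0 * ambience + 1.5 * service - 0.15 * ambience * service

cs = plt.contourf(ambience, service, visits, levels=10, cmap="viridis")
plt.colorbar(cs, label="Predicted customer visits")
plt.xlabel("Ambience score")
plt.ylabel("Service score")
plt.title("Contour plot of the response surface")
plt.show()
```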


Ans 2 : To my understanding, they are two tools aimed at the same result. With a 3D contour plot you have three axes - two are inputs and one is the output. It is a visual way to show how the two inputs interact, and it gives you a region in which you can achieve the intended goal.

A response optimizer results in a single number that your process would vary around, given the inputs you have selected. I believe you can select more inputs with this tool than you can with a contour plot.

With the contour plot, you see variation over the continuum of possible inputs; with the response optimizer, you give it an intended target and it gives you the ideal inputs.


(Thanks to Daniel Rankin on LinkedIn)


Q. As significance is dependent on signal, noise and sample size, should I consider that, to work with higher significance levels, I will have to either improve the signal strength, increase the sample size, or reduce the noise?
A good amateur-level example covering all three (signal, sample size and noise) would be highly appreciated.


Ans : The significance level states how much error you are willing to accept in your analysis. The higher the confidence level (say, moving from 95% to 99% confidence, i.e. a smaller significance level), the wider the interval estimate for the population will be. This interval can be narrowed by reducing noise - hence 'strengthening' the signal (be cautious, though: natural or unknown noise cannot be reduced and has to be lived with) - or by increasing the sample size.

Let me give an example: suppose you want to know whether A is smarter than B in Math, and you do this by comparing their tests. A got an average of 7. B got 3, 10, 10, 7 and 5. Is B smarter than A? It is hard to say from this example, because the 'noise' in B's results is big - B has a wide range across his tests, which can happen for many reasons we could investigate. Since we have no idea which noise we can reduce, the only option left is to increase the sample size and hope that the next 5 data points lean more clearly toward >7 or <7. Otherwise we have to be willing to fail to reject H0 and conclude that there is no evidence that A is smarter than B. (Note the statement - that is as far as we can conclude; we can never conclude that A and B are the same.)

(Courtesy : Dax Ramadani from LinkedIn)
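A quick numeric check of this example (a minimal sketch assuming Python with scipy):

```python
# One-sample t-test of B's scores against A's average of 7,
# using the data from the example above.
from scipy import stats

b_scores = [3, 10, 10, 7, 5]
t_stat, p_value = stats.ttest_1samp(b_scores, popmean=7)
print(f"t = {t_stat:.2f}, p = {p_value:.2f}")
# B's mean is exactly 7 here, so t = 0 and p = 1:
# no evidence whatsoever of a difference.
```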


Ans : When you want to decide on buying a car, there are three important factors (CHEAP, FAST and GOOD). The problem in real life is that we can only choose two: if you want it Cheap and Fast it will not be Good; if you want it Fast and Good it will not be Cheap; and if you want it Cheap and Good it will not be Fast.
Taking a cue from this:

1. If you are running a small-sample test with minimum risk, the effect will have to be very large to be detectable.
2. On the other hand, if you are chasing a small effect using a small sample size, you are accepting a lot of risk in accepting or rejecting the hypothesis.
3. It follows from the above points that minimizing risk while also detecting small effects requires collecting and testing a large number of samples.

Depending on the process you are testing, you can decide which factor is the least important and go with the other two.

(Courtesy : Shiv Mahapatro from Benchmark SixSigma Fraternity)
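To put rough numbers on this trade-off, here is a minimal sketch (assuming Python with statsmodels; the effect sizes and risk levels are made up for illustration):

```python
# How the required sample size grows as the detectable effect shrinks
# or the false-alarm risk (alpha) drops, at a fixed power of 0.9.
from statsmodels.stats.power import TTestPower

analysis = TTestPower()
for effect in (1.0, 0.5, 0.2):        # effect size = signal / noise
    for alpha in (0.05, 0.01):        # risk of a false alarm
        n = analysis.solve_power(effect_size=effect, alpha=alpha, power=0.9)
        print(f"effect={effect}, alpha={alpha}: n = {n:.0f}")
```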

Ans : With theory we will always be asking ourselves which combination is best. In reality, what we desire is a low alpha and a low beta risk. This means we have to either increase the sample size or reduce the variation. The signal cannot be changed, as the customer is the one seeking the change. It is therefore a whole lot better to focus on reducing variation in the process than to try relaxing alpha and beta, which is practically unacceptable to the customer - unless you started with very low values of alpha and beta to give yourself some wiggle room.

So, based on what process you are running (which you haven't highlighted in your question), the easiest approach would be to try to get variation out. If you are talking about the hospitality industry, then we can look at the processes there and reduce - in some cases even eliminate - the variation, dropping your risk and sample size precipitously and achieving results rapidly.

(Courtesy : Shree Nanguneri from LinkedIn)


Q. What is the difference between "Develop the Ideal State Map" and "Develop the Future State Map" in Value Stream Mapping (VSM)?


Ans : Ideal state refers to perfection. For example, if the current Work in Process (WIP) = 1000, the ideal WIP = 0. However, we cannot get to the ideal state within a short period of time. So, people plan for an interim future state usually six months or one year from now. This is called the future state. For this example, maybe the future state could show that we want to reduce WIP from 1000 to 500.


After the end of one year, the future state becomes the new current state and we plan for another future state at that point with the goal of getting to the ideal state in the long run.


Ans : After completing the current state Value Stream Map, the team assembles, gets creative, and works out the ideal state map. Note the word "ideal" - it may not really be feasible to reach this state in the near future.
There would be several barriers; the ideal state assumes the absence of non-value-added activities - no waiting, no inspection, no rework, etc. In the real world this may not be feasible.


The future state map is, practically, what the flow should look like in the next few weeks or months. The gaps between here and the future state are really improvement opportunities; on overcoming those gaps there is a reasonable chance of reaching the "future state map".
(Courtesy: Jag from LinkedIn)


Ans : Ideal state map = Processes with zero waste.


Future state map = processes kaizened to eliminate all unnecessary waste while keeping necessary waste (e.g., transportation).
(Courtesy: Dinesh V from LinkedIn)


Q. How do I use regression analysis for optimization with at least three variables?


Ans : A regression model is nothing but a relationship between your input(s) and your output. It is primarily used when your input(s) and output are continuous. Typically, we build a linear model between the input(s) and the output.


If you have one input and one output, we use simple regression of the form Y = m*X + c, where X is your input and Y is your output.


For your question, if I understand it correctly, you have three inputs, X1, X2, X3, in which case, we would use multiple regression where the model would be:


Y = m1*X1 + m2*X2 + m3*X3 + c


Once you build a regression model and check that you have a decent model between your inputs and output, you can use it for prediction or optimization. Make sure you check the adjusted R^2 value and the appropriate p-values, and verify that the model assumptions are satisfied. One of the most important requirements is that X1, X2 and X3 should not be collinear.


Once you have a model, you can then use it for optimization by adding constraints on X1, X2 and X3 (if appropriate). This optimization can be done using Linear Programming (LP).
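As a hedged illustration of this fit-then-optimize workflow (a minimal sketch assuming Python with numpy and scipy; the data, coefficients and operating ranges are all fabricated):

```python
# Fit Y = m1*X1 + m2*X2 + m3*X3 + c by least squares,
# then maximize the fitted model over box constraints with LP.
import numpy as np
from scipy import optimize

rng = np.random.default_rng(0)

# Fabricated process data: 30 runs of three inputs X1, X2, X3.
X = rng.uniform(0, 10, size=(30, 3))
y = X @ np.array([2.0, -1.5, 0.5]) + 4.0 + rng.normal(0, 1.0, size=30)

A = np.column_stack([X, np.ones(len(X))])
coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
m1, m2, m3, c0 = coefs
print(f"Model: Y = {m1:.2f}*X1 + {m2:.2f}*X2 + {m3:.2f}*X3 + {c0:.2f}")

# linprog minimizes, so negate the slopes to maximize predicted Y;
# the intercept does not affect the optimal settings.
res = optimize.linprog(c=-np.array([m1, m2, m3]),
                       bounds=[(0, 10), (0, 10), (0, 10)])
print("Optimal settings X1, X2, X3:", res.x)
print("Predicted maximum Y:", -res.fun + c0)
```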


Courtesy : Suresh Jayaram @ Benchmark SixSigma


My Ans : The regression equation is nothing but an effort to predict the output when your X's (i.e., variables) take a certain set of values. You assign values to the variables and you get the predicted result as Y.


So, with the equation provided, if you have control over your Xs (variables), you can estimate your Ys.


Coming to correlation: it simply tells you whether two variables are positively or negatively correlated, i.e., related. E.g., in cricket, spin balls faced vs. runs scored for a player: if the correlation is negative, you may choose not to select the player for a spin-friendly pitch, and vice versa...
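A tiny illustrative check of such a correlation (a sketch with invented cricket numbers):

```python
# Correlation between spin balls faced and runs scored (made-up data).
import numpy as np

spin_balls_faced = np.array([12, 25, 8, 30, 18, 22])
runs_scored = np.array([15, 10, 20, 6, 12, 9])

r = np.corrcoef(spin_balls_faced, runs_scored)[0, 1]
print(f"Correlation: {r:.2f}")  # negative here: scores drop against spin
```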


Q. I would appreciate help in understanding the Sequential Test Method for process capability decisions. It is claimed that this method can save up to 50% on sampling. Examples welcome...


Ans : If you look at the confidence interval of Cp/Cpk (the process capability indices), you will find that the confidence interval is pretty wide when the sample size is small. For example, it is around +/- 0.4 when the sample size is 30: if you calculate the process capability index as 1.0, the true value could be as low as 0.6 or as high as 1.4.


If you use a smaller number of samples as recommended by the Sequential Test Method, say 15, then the confidence interval would be a lot wider. This means that the error in the analysis could be a lot higher: a process that is shown to be capable may in fact not be capable.


So, I would recommend using this method with caution.
Courtesy : Suresh Jayaram @ Benchmark SixSigma
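For a rough feel of how the interval widens as samples shrink, here is a sketch using Bissell's approximate confidence interval for Cpk (an approximation only; exact values, e.g. from Minitab, may differ somewhat):

```python
# Approximate 95% CI for Cpk (Bissell's formula), showing how the
# interval widens as the sample size n shrinks.
import math

def cpk_ci(cpk, n, z=1.96):
    half = z * math.sqrt(1.0 / (9.0 * n) + cpk**2 / (2.0 * (n - 1)))
    return cpk - half, cpk + half

for n in (15, 30, 100):
    lo, hi = cpk_ci(1.0, n)
    print(f"n={n:3d}: Cpk = 1.0, approx 95% CI ({lo:.2f}, {hi:.2f})")
```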


Q. Probability functions: What is the difference between the cumulative distribution function and the probability mass function? I also need help understanding the TRUE/FALSE option of the NORMDIST Excel function.

Example: For one given value (Value / Mean / Std. Dev / TRUE or FALSE), the outputs were .908 and .109. Help appreciated...



Ans : We need to differentiate between continuous and discrete variables.


Let's first look at the discrete case - for example, tossing a coin. The probability of getting a head is 0.5 and the probability of getting a tail is 0.5. The function that assigns these probabilities (0.5 to each outcome) is the probability mass function.


For continuous variables, the analogue of the probability mass function is the probability density function. However, the value of the probability density function does not equal a probability in the continuous case. In fact, the probability of getting exactly one particular value from a continuous distribution is always 0.
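This is exactly what the TRUE/FALSE flag in NORMDIST selects: TRUE returns the cumulative probability (the CDF) and FALSE returns the density (the PDF), which is presumably why the question's single set of inputs produced two different outputs. A minimal sketch with scipy (the x, mean and sd values are illustrative, not those behind the .908/.109 figures):

```python
# Excel's NORMDIST(x, mean, sd, cumulative) reproduced with scipy:
# cumulative=TRUE -> CDF (a probability); cumulative=FALSE -> PDF (a density).
from scipy.stats import norm

mean, sd = 20, 5
print(norm.cdf(21, loc=mean, scale=sd))  # like NORMDIST(21, 20, 5, TRUE)
print(norm.pdf(21, loc=mean, scale=sd))  # like NORMDIST(21, 20, 5, FALSE)

# As the next paragraph explains, probabilities for a continuous
# variable come from areas under the PDF:
print(norm.cdf(21, mean, sd) - norm.cdf(19, mean, sd))  # P(19 < X < 21)
```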


The area under the probability density function gives the probability in the continuous case. For example, if we have normally distributed data with mean = 20 and standard deviation = 5, then the probability of getting exactly 20 is P(X = 20) = 0. We can, however, calculate the probability of getting a value between 19 and 21, written as P(19 < X < 21).


Q. What does the Sum of Squares (SS) represent?


Ans : The sum of squares between groups (BSS) measures the variation between the group means.


The sum of squares within groups (WSS) measures the variation of values within each group.


Ans : In ANOVA we compare the variance of different data sets - for example, the variance in the performance of 3 different battery brands. SS is the sum of squared deviations from the average of a set of data. MS is SS divided by its degrees of freedom (df).


Q. What is MS and what does it represent?


Ans : Mean Squares Between = Sum of Squares Between / (number of groups - 1); Mean Squares Within = Sum of Squares Within / (total observations - number of groups).


Q. How do we get F and what does it mean?


Ans : It is simply the ratio of two variance estimates: F = (sample size * variance of the sample means) / error variance, or equivalently, Mean Squares Between groups / Mean Squares Within groups.


Q. How do we get the p-value?


Ans : The p-value in ANOVA is the right-tail probability of the F distribution at x = the F statistic, with numerator df = the between-groups df and denominator df = the within-groups df. To find the p-value in MS Excel, insert the function FDIST(F, df between groups, df within groups). In Minitab, follow this path: Calc >> Probability Distributions >> F.
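Tying SS, MS, F and the p-value together, here is a minimal one-way ANOVA worked by hand (a sketch assuming Python with numpy and scipy; the battery-brand lifetimes are fabricated):

```python
# One-way ANOVA from scratch: SS between/within, MS, F, p-value.
import numpy as np
from scipy import stats

groups = [np.array([21, 19, 23, 20]),   # brand A lifetimes (made up)
          np.array([25, 27, 24, 26]),   # brand B
          np.array([18, 17, 20, 19])]   # brand C

grand_mean = np.concatenate(groups).mean()
k = len(groups)                          # number of groups
N = sum(len(g) for g in groups)          # total observations

ss_between = sum(len(g) * (g.mean() - grand_mean)**2 for g in groups)
ss_within = sum(((g - g.mean())**2).sum() for g in groups)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
F = ms_between / ms_within
p = stats.f.sf(F, k - 1, N - k)          # right tail, like Excel's FDIST

print(f"SSB={ss_between:.1f} SSW={ss_within:.1f} F={F:.2f} p={p:.4f}")
print(stats.f_oneway(*groups))           # cross-check with the built-in
```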


Comparisons or Post Hocs


Q. Why do we use multiple comparisons?


Ans : ANOVA only indicates whether there are differences between one or more pairs of treatment means; it doesn't indicate which pairs are different. Multiple comparisons are used to identify significant differences between specific factor levels.


Q. What is the difference between Tukey and Hsu's MCB?


Ans : Tukey's method (or the Tukey-Kramer method) is used to compare all possible pairs of treatments while controlling the family error rate. Hsu's MCB compares the best treatment (the one with the lowest or highest mean) to all other treatments while controlling the family error rate.
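A minimal sketch of Tukey's comparisons (assuming Python with statsmodels; reusing the fabricated battery data from the ANOVA example above):

```python
# Tukey's pairwise comparisons at a 5% family error rate.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

values = np.array([21, 19, 23, 20, 25, 27, 24, 26, 18, 17, 20, 19])
brands = np.array(["A"] * 4 + ["B"] * 4 + ["C"] * 4)

print(pairwise_tukeyhsd(values, brands, alpha=0.05))
```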


Q. What is the difference between Tukey and Fisher comparisons?


Ans : Fisher's method is used to compare all possible pairs of treatments using a specified error rate for the individual tests; it doesn't control the family error rate. It is therefore advised to use Fisher's method with an appropriate Bonferroni-corrected alpha when only a subset of all possible comparisons is of interest.


Courtesy : Shantanu Kumar @ Benchmark SixSigma Fraternity