Statistical Inference

Presentation on theme: "Statistical Inference"— Presentation transcript:

1 Statistical Inference
Making decisions regarding the population based on a sample

2 Decision Types
Estimation: deciding on the value of an unknown parameter.
Hypothesis Testing: deciding whether a statement regarding an unknown parameter is true or false.
Prediction: deciding the future value of a random variable.
All decisions will be based on the values of statistics.

3 Estimation Definitions
An estimator of an unknown parameter is a sample statistic used to estimate that parameter. An estimate is the value of the estimator after the data are collected. The performance of an estimator is assessed by determining its sampling distribution and measuring its closeness to the parameter being estimated.

4 Examples of Estimators

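5 The Sample Proportion
Let p = population proportion of interest or binomial probability of success. Let $\hat{p} = x/n$ = sample proportion or proportion of successes (x successes in n trials). For large n the sampling distribution of $\hat{p}$ is approximately a normal distribution with
$$\mu_{\hat{p}} = p, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$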

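7 The Sample Mean
Let $x_1, x_2, x_3, \ldots, x_n$ denote a sample of size n from a normal distribution with mean $\mu$ and standard deviation $\sigma$. Let
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
The sampling distribution of $\bar{x}$ is a normal distribution with
$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$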

9 Confidence Intervals

10 Estimation by Confidence Intervals
Definition: A 100P% confidence interval for an unknown parameter is a pair of sample statistics ($t_1$ and $t_2$) having the following properties:
1. $P[t_1 < t_2] = 1$. That is, $t_1$ is always smaller than $t_2$.
2. P[the unknown parameter lies between $t_1$ and $t_2$] = P.
The statistics $t_1$ and $t_2$ are random variables. Property 2 states that the probability that the unknown parameter is bounded by the two statistics $t_1$ and $t_2$ is P.

11 Critical values for a distribution
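The upper $\alpha$ critical value for any distribution is the point $x_\alpha$ on the horizontal axis of the distribution such that $P[X > x_\alpha] = \alpha$; that is, the area under the distribution to the right of $x_\alpha$ is $\alpha$.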

12 Critical values for the standard Normal distribution
$P[Z > z_\alpha] = \alpha$

13 Critical values for the standard Normal distribution
$P[Z > z_\alpha] = \alpha$
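The critical values that the slides read from a table can also be computed. The following is a minimal sketch using Python's standard library; the function name `upper_critical_value` is an illustrative choice, not something taken from the slides.

```python
from statistics import NormalDist


def upper_critical_value(alpha: float) -> float:
    """Return z_alpha, the point with P[Z > z_alpha] = alpha for Z ~ N(0, 1)."""
    # The upper-tail area alpha corresponds to the (1 - alpha) quantile.
    return NormalDist().inv_cdf(1.0 - alpha)


if __name__ == "__main__":
    for alpha in (0.05, 0.025, 0.01, 0.005):
        print(f"alpha = {alpha:<6} z_alpha = {upper_critical_value(alpha):.3f}")
```

For a two-sided interval or test at level $\alpha$, the value needed is $z_{\alpha/2}$, so a 95% level uses `upper_critical_value(0.025)`.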

14 Confidence Intervals for a proportion p
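Let
$$t_1 = \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
and
$$t_2 = \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Then $t_1$ to $t_2$ is a $(1-\alpha)100\% = P\,100\%$ confidence interval for p.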

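15 Logic:
$$z = \frac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}}$$
has approximately a Standard Normal distribution. Then
$$P[-z_{\alpha/2} \le z \le z_{\alpha/2}] = 1 - \alpha$$
and
$$P\left[-z_{\alpha/2} \le \frac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}} \le z_{\alpha/2}\right] = 1 - \alpha$$
Hence
$$P\left[\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right] = 1 - \alpha$$
Thus $t_1$ to $t_2$ is a $(1-\alpha)100\% = P\,100\%$ confidence interval for p.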

16 Example
Suppose we are interested in determining the success rate of a new drug for reducing blood pressure. The new drug is given to n = 70 patients with abnormally high blood pressure. Of these patients, X = 63 were able to reduce the abnormally high level of blood pressure. The proportion of patients able to reduce the abnormally high level of blood pressure was
$$\hat{p} = \frac{X}{n} = \frac{63}{70} = 0.900$$
This is an estimate of p.

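17 If $P = 1 - \alpha = 0.95$ then $\alpha/2 = 0.025$ and $z_{\alpha/2} = 1.960$. (This comes from the Table.) Then
$$t_1 = \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.900 - 1.960\sqrt{\frac{(0.900)(0.100)}{70}} = 0.8297$$
and
$$t_2 = \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.900 + 1.960\sqrt{\frac{(0.900)(0.100)}{70}} = 0.9703$$
Thus a 95% confidence interval for p is 0.8297 to 0.9703.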
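The interval in the drug example can be checked numerically. This is a small sketch, assuming the large-sample normal approximation used in the slides; the helper name `proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Large-sample confidence interval (t1, t2) for a proportion p."""
    p_hat = x / n                                  # sample proportion
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # z_{alpha/2}
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    # Drug example from the slides: X = 63 successes out of n = 70 patients.
    t1, t2 = proportion_ci(63, 70)
    print(f"95% CI for p: {t1:.4f} to {t2:.4f}")
```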

18 What is the probability that p is between 0.8297 and 0.9703?
Is it 95%? Answer: p (unknown), 0.8297 and 0.9703 are numbers. Either p is between 0.8297 and 0.9703 or it is not. The 95% refers to the success of the confidence interval procedure prior to the collection of the data. After the data are collected, the procedure was either successful in capturing p or it was not.

20 Two Areas of Statistical Inference
Estimation Hypothesis Testing

22 Confidence Intervals
Estimation of a parameter by a range of values (an interval)

24 Confidence Interval for a Proportion
$100(1-\alpha)\%$ Confidence Interval for the population proportion:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Interpretation: For about $100(1-\alpha)\%$ of all randomly selected samples from the population, the confidence interval computed in this manner captures the population proportion.

25 Comment
The usual choices of $\alpha$ are 0.05 and 0.01. In this case the level of confidence, $100(1-\alpha)\%$, is 95% and 99% respectively. Also the tabled value $z_{\alpha/2}$ is $z_{0.025} = 1.960$ and $z_{0.005} = 2.576$ respectively.

29 Error Bound
For a $100(1-\alpha)\%$ confidence level, the approximate margin of error in a sample proportion is
$$B = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

30 Factors that Determine the Error Bound
1. The sample size, n. When the sample size increases, the margin of error decreases.
2. The sample proportion, $\hat{p}$. If the proportion is close to either 1 or 0, most individuals have the same trait or opinion, so there is little natural variability and the margin of error is smaller than if the proportion is near 0.5.
3. The "multiplier" $z_{\alpha/2}$. This is connected to the $100(1-\alpha)\%$ level of confidence of the Error Bound. The value of $z_{\alpha/2}$ for a 95% level of confidence is 1.96. This value is changed to change the level of confidence.

31 Determination of Sample Size
In almost all research situations the researcher is interested in the question: How large should the sample be?

32 Answer: Depends on how accurate you want the answer.
Accuracy is specified by:
1. The magnitude of the error bound
2. The level of confidence

33 Error Bound:
$$B = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$
If we have specified the level of confidence then the value of $z_{\alpha/2}$ will be known. If we have specified the magnitude of B, it will also be known. Solving for n we get:
$$n = \frac{z_{\alpha/2}^2\, p(1-p)}{B^2}$$

34 Summarizing: The sample size that will estimate p with an Error Bound B and level of confidence $P = 1 - \alpha$ is:
$$n = \frac{z_{\alpha/2}^2\, p^*(1-p^*)}{B^2}$$
where:
B is the desired Error Bound,
$z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution,
$p^*$ is some preliminary estimate of p.
If you do not have a preliminary estimate of p, use $p^* = 0.50$.

35 Reason
For $p^* = 0.50$, n will take on the largest value, because $p^*(1-p^*)$ is largest at $p^* = 0.50$. Thus using $p^* = 0.50$, n may be larger than required if p is not 0.50, but it will give the desired accuracy or better for all values of p.

36 Example
Suppose that I want to conduct a survey and want to estimate p = proportion of voters who favour a downtown location for a casino. I know that the approximate value of p is $p^* = 0.50$. This is also a good choice for $p^*$ if one has no preliminary estimate of its value. I want the survey to estimate p with an error bound B = 0.01 (1 percentage point). I want the level of confidence to be 95% (i.e. $\alpha = 0.05$ and $z_{\alpha/2} = z_{0.025} = 1.960$). Then
$$n = \frac{z_{\alpha/2}^2\, p^*(1-p^*)}{B^2} = \frac{(1.960)^2 (0.50)(0.50)}{(0.01)^2} = 9604$$
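The sample-size rule for a proportion is easy to script. The sketch below is illustrative only: the function name `sample_size_for_proportion` is an assumption, and it rounds up to the next whole observation.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(error_bound: float,
                               confidence: float = 0.95,
                               p_star: float = 0.50) -> int:
    """Smallest n with z_{alpha/2}^2 * p*(1 - p*) / B^2 <= n."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # z_{alpha/2}
    n = z ** 2 * p_star * (1.0 - p_star) / error_bound ** 2
    return ceil(n)                                 # round up to a whole sample


if __name__ == "__main__":
    # Casino survey example: B = 0.01, 95% confidence, p* = 0.50.
    print(sample_size_for_proportion(0.01))
```

Because $p^*(1-p^*)$ is largest at 0.50, any other preliminary estimate passed as `p_star` gives a smaller n for the same error bound.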

37 Confidence Intervals for the mean, $\mu$, of a Normal Population

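38 Confidence Intervals for the mean of a Normal Population, $\mu$
Let
$$t_1 = \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}}$$
and
$$t_2 = \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$$
Then $t_1$ to $t_2$ is a $(1-\alpha)100\% = P\,100\%$ confidence interval for $\mu$.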

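39 Logic:
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
has approximately a Standard Normal distribution. Then
$$P[-z_{\alpha/2} \le z \le z_{\alpha/2}] = 1 - \alpha$$
and
$$P\left[-z_{\alpha/2} \le \frac{\bar{x} - \mu}{s/\sqrt{n}} \le z_{\alpha/2}\right] = 1 - \alpha$$
Hence
$$P\left[\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right] = 1 - \alpha$$
Thus $t_1$ to $t_2$ is a $(1-\alpha)100\% = P\,100\%$ confidence interval for $\mu$.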

40 Example
Suppose we are interested in the average Bone Mass Density (BMD) for women aged 70-75. A sample of n = 100 women aged 70-75 is selected and BMD is measured for each individual in the sample. The average BMD for these individuals is: $\bar{x}$ = [value not preserved]. The standard deviation (s) of BMD for these individuals is: s = [value not preserved].

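41 If $P = 1 - \alpha = 0.95$ then $\alpha/2 = 0.025$ and $z_{\alpha/2} = 1.960$. Then
$$t_1 = \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}}$$
and
$$t_2 = \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} = 27.16$$
Thus a 95% confidence interval for $\mu$ is [lower limit not preserved] to 27.16.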
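The same calculation can be sketched in code. The summary statistics of the BMD example are not available in this transcript, so the numbers used below ($\bar{x}$ = 25.0, s = 5.0, n = 100) are purely hypothetical and are not the values from the slides; `mean_ci` is likewise an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(x_bar: float, s: float, n: int,
            confidence: float = 0.95) -> tuple[float, float]:
    """Normal-theory confidence interval (t1, t2) for a population mean."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # z_{alpha/2}
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


if __name__ == "__main__":
    # Hypothetical summary statistics, chosen only to exercise the formula.
    t1, t2 = mean_ci(x_bar=25.0, s=5.0, n=100)
    print(f"95% CI for the mean: {t1:.2f} to {t2:.2f}")
```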

42 Determination of Sample Size
Again a question to be asked: How large should the sample be?

43 Answer: Depends on how accurate you want the answer.
Accuracy is specified by:
1. The magnitude of the error bound
2. The level of confidence

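44 Error Bound:
$$B = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
If we have specified the level of confidence then the value of $z_{\alpha/2}$ will be known. If we have specified the magnitude of B, it will also be known. Solving for n we get:
$$n = \frac{z_{\alpha/2}^2\, \sigma^2}{B^2}$$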

45 Summarizing: The sample size that will estimate $\mu$ with an Error Bound B and level of confidence $P = 1 - \alpha$ is:
$$n = \frac{z_{\alpha/2}^2\, (s^*)^2}{B^2}$$
where:
B is the desired Error Bound,
$z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution,
$s^*$ is some preliminary estimate of $\sigma$.

46 Notes:
n increases as B, the desired Error Bound, decreases. A larger sample size is required for a higher level of accuracy.
n increases as the level of confidence, $(1-\alpha)$, increases, because $z_{\alpha/2}$ increases as $\alpha/2$ becomes closer to zero. A larger sample size is required for a higher level of confidence.
n increases as the standard deviation, $\sigma$, of the population increases. If the population is more variable then a larger sample size is required.

47 Summary: The sample size n depends on:
1. The desired level of accuracy
2. The desired level of confidence
3. The variability of the population

48 Example
Suppose that one is interested in estimating the average number of grams of fat ($\mu$) in one kilogram of lean beef hamburger. This will be estimated by randomly selecting one-kilogram samples, then measuring the fat content of each sample. Preliminary estimates indicate that $\mu$ and $\sigma$ are approximately 220 and 40 respectively. I want the study to estimate $\mu$ with an error bound of 5 and a level of confidence of 95% (i.e. $\alpha = 0.05$ and $z_{\alpha/2} = z_{0.025} = 1.960$).

49 Solution
$$n = \frac{z_{\alpha/2}^2\, (s^*)^2}{B^2} = \frac{(1.960)^2 (40)^2}{5^2} \approx 245.9$$
Hence n = 246 one-kilogram samples are required to estimate $\mu$ within B = 5 g with a 95% level of confidence.
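As a cross-check of the hamburger example, the sample-size rule for a mean can be written as a short function. This is a sketch under the slides' assumptions (normal population, preliminary estimate $s^*$ of $\sigma$); the name `sample_size_for_mean` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(error_bound: float, s_star: float,
                         confidence: float = 0.95) -> int:
    """Smallest n with z_{alpha/2}^2 * (s*)^2 / B^2 <= n."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # z_{alpha/2}
    return ceil((z * s_star / error_bound) ** 2)   # round up to a whole sample


if __name__ == "__main__":
    # Hamburger example: B = 5 g, preliminary s* = 40 g, 95% confidence.
    print(sample_size_for_mean(error_bound=5.0, s_star=40.0))
```

Halving the error bound multiplies the required n by about four, which is the first of the notes on how n depends on B.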

53 Comments
When you use a single statistic to estimate a parameter it is called a point estimator. The estimate is a single value. The accuracy of this estimate cannot be determined from this value. A better way to estimate is with a confidence interval. The width of this interval gives information on its accuracy.

55 Confidence Intervals Summary

56 Confidence Interval for a Proportion
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

57 Determination of Sample Size
The sample size that will estimate p with an Error Bound B and level of confidence $P = 1 - \alpha$ is:
$$n = \frac{z_{\alpha/2}^2\, p^*(1-p^*)}{B^2}$$
where B is the desired Error Bound, $z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution, and $p^*$ is some preliminary estimate of p.

58 Confidence Intervals for the mean of a Normal Population, $\mu$
$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$

59 Determination of Sample Size
The sample size that will estimate $\mu$ with an Error Bound B and level of confidence $P = 1 - \alpha$ is:
$$n = \frac{z_{\alpha/2}^2\, (s^*)^2}{B^2}$$
where B is the desired Error Bound, $z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution, and $s^*$ is some preliminary estimate of $\sigma$.

60 Hypothesis Testing
An important area of statistical inference

61 Definition Hypothesis (H)
A hypothesis is a statement about the parameters of the population. In hypothesis testing there are two hypotheses of interest:
1. The null hypothesis (H0)
2. The alternative hypothesis (HA)

62 Either the null hypothesis (H0) is true or the alternative hypothesis (HA) is true, but not both. We say that H0 and HA are mutually exclusive and exhaustive.

63 One has to make a decision
either to accept the null hypothesis (equivalent to rejecting HA) or to reject the null hypothesis (equivalent to accepting HA).

64 There are two possible errors that can be made.
1. Rejecting the null hypothesis when it is true (type I error).
2. Accepting the null hypothesis when it is false (type II error).

65 An analogy – a jury trial
The two possible decisions are Declare the accused innocent. Declare the accused guilty.

66 The null hypothesis (H0) – the accused is innocent
The alternative hypothesis (HA) – the accused is guilty

67 The two possible errors that can be made:
Declaring an innocent person guilty. (type I error) Declaring a guilty person innocent. (type II error) Note: in this case one type of error may be considered more serious

68 Decision Table showing types of Error
             H0 is True          H0 is False
Accept H0    Correct Decision    Type II Error
Reject H0    Type I Error        Correct Decision

69 To define a statistical Test we
1. Choose a statistic (called the test statistic).
2. Divide the range of possible values for the test statistic into two parts: the Acceptance Region and the Critical Region.

70 To perform a statistical Test we
1. Collect the data.
2. Compute the value of the test statistic.
3. Make the Decision: if the value of the test statistic is in the Acceptance Region we decide to accept H0. If the value of the test statistic is in the Critical Region we decide to reject H0.

71 Example We are interested in determining if a coin is fair. i.e. H0 : p = probability of tossing a head = ½. To test this we will toss the coin n = 10 times. The test statistic is x = the number of heads. This statistic will have a binomial distribution with p = ½ and n = 10 if the null hypothesis is true.

72 Sampling distribution of x when H0 is true

73 Note We would expect the test statistic x to be around 5 if H0 : p = ½ is true. Acceptance Region = {3, 4, 5, 6, 7}. Critical Region = {0, 1, 2, 8, 9, 10}. The reason for the choice of the Acceptance region: Contains the values that we would expect for x if the null hypothesis is true.

74 Definitions: For any statistical testing procedure define
$\alpha$ = P[rejecting the null hypothesis when it is true] = P[type I error]
$\beta$ = P[accepting the null hypothesis when it is false] = P[type II error]

75 In the last example
$\alpha$ = P[type I error] = p(0) + p(1) + p(2) + p(8) + p(9) + p(10) = 0.109, where p(x) are binomial probabilities with p = ½ and n = 10.
$\beta$ = P[type II error] = p(3) + p(4) + p(5) + p(6) + p(7), where p(x) are binomial probabilities with p (not equal to ½) and n = 10. Note: these will depend on the value of p.
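The two error probabilities for the coin example can be computed directly from binomial probabilities. The sketch below uses only the standard library; the function names are illustrative, and the values of p tried for $\beta$ are arbitrary choices rather than the entries of the slides' table.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability p(x) of x successes in n trials."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def type_i_error(acceptance: set[int], n: int, p0: float) -> float:
    """alpha: probability that x falls outside the acceptance region when p = p0."""
    return sum(binom_pmf(x, n, p0) for x in range(n + 1) if x not in acceptance)


def type_ii_error(acceptance: set[int], n: int, p: float) -> float:
    """beta: probability that x falls inside the acceptance region for a given p != p0."""
    return sum(binom_pmf(x, n, p) for x in acceptance)


if __name__ == "__main__":
    region = {3, 4, 5, 6, 7}
    print(f"alpha = {type_i_error(region, 10, 0.5):.3f}")
    for p in (0.6, 0.7, 0.8, 0.9):   # arbitrary alternatives
        print(f"p = {p}: beta = {type_ii_error(region, 10, p):.3f}")
```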

76 Table: Probability of a Type II error, $\beta$ vs. p
Note: the magnitude of $\beta$ increases as p gets closer to ½.

77 Comments:
You can control $\alpha$ = P[type I error] and $\beta$ = P[type II error] by widening or narrowing the acceptance region. Widening the acceptance region decreases $\alpha$ = P[type I error] but increases $\beta$ = P[type II error]. Narrowing the acceptance region increases $\alpha$ = P[type I error] but decreases $\beta$ = P[type II error].

78 Example – Widening the Acceptance Region
Suppose the Acceptance Region includes, in addition to its previous values, 2 and 8. Then
$\alpha$ = P[type I error] = p(0) + p(1) + p(9) + p(10) = 0.021, where again p(x) are binomial probabilities with p = ½ and n = 10.
$\beta$ = P[type II error] = p(2) + p(3) + p(4) + p(5) + p(6) + p(7) + p(8). Tabled values of $\beta$ are given on the next page.

79 Table: Probability of a Type II error, $\beta$ vs. p
Note: Compare these values with the previous definition of the Acceptance Region. They have increased.

80 Example – Narrowing the Acceptance Region
Suppose the original Acceptance Region excludes the values 3 and 7. That is, the Acceptance Region is {4, 5, 6}. Then
$\alpha$ = P[type I error] = p(0) + p(1) + p(2) + p(3) + p(7) + p(8) + p(9) + p(10) = 0.344.
$\beta$ = P[type II error] = p(4) + p(5) + p(6). Tabled values of $\beta$ are given on the next page.

81 Table: Probability of a Type II error, $\beta$ vs. p
Note: Compare these values with the original definition of the Acceptance Region. They have decreased.

82 Comparison of the three Acceptance Regions
Acceptance Region {2,3,4,5,6,7,8}: $\alpha$ = 0.021
Acceptance Region {3,4,5,6,7}: $\alpha$ = 0.109
Acceptance Region {4,5,6}: $\alpha$ = 0.344
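The trade-off between the two error probabilities across the three acceptance regions can be tabulated with a few lines of code. This is a sketch only; the alternative p = 0.7 used for $\beta$ is an arbitrary illustrative choice, and the helper names are hypothetical.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def error_probabilities(acceptance: set[int], n: int, p0: float,
                        p_alt: float) -> tuple[float, float]:
    """Return (alpha, beta) for an acceptance region, with beta evaluated at p_alt."""
    alpha = sum(binom_pmf(x, n, p0) for x in range(n + 1) if x not in acceptance)
    beta = sum(binom_pmf(x, n, p_alt) for x in acceptance)
    return alpha, beta


if __name__ == "__main__":
    regions = {
        "widened": set(range(2, 9)),     # {2,...,8}
        "original": set(range(3, 8)),    # {3,...,7}
        "narrowed": {4, 5, 6},
    }
    for name, region in regions.items():
        alpha, beta = error_probabilities(region, n=10, p0=0.5, p_alt=0.7)
        print(f"{name:<9} alpha = {alpha:.3f}  beta(p = 0.7) = {beta:.3f}")
```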

87 The Approach in Statistical Testing is:
Set up the Acceptance Region so that $\alpha$ is close to some predetermined value (the usual values are 0.05 or 0.01). The predetermined value of $\alpha$ (0.05 or 0.01) is called the significance level of the test. The significance level of the test is $\alpha$ = P[test makes a type I error].

88 Determining the Critical Region
The Critical Region should consist of values of the test statistic that indicate that HA is true (hence H0 should be rejected). The size of the Critical Region is determined so that the probability of making a type I error, $\alpha$, is at some predetermined level (usually 0.05 or 0.01). This value is called the significance level of the test. Significance level = P[test makes a type I error].

89 To find the Critical Region
1. Find the sampling distribution of the test statistic when H0 is true.
2. Locate the Critical Region in the tails (either left or right or both) of the sampling distribution of the test statistic when H0 is true.
Whether you locate the critical region in the left tail, the right tail or both tails depends on which values indicate HA is true. The tails chosen = values indicating HA.

90 The size of the Critical Region is chosen so that the area over the critical region and under the sampling distribution of the test statistic when H0 is true is the desired level of $\alpha$ = P[type I error].
[Figure: sampling distribution of the test statistic when H0 is true; the area over the Critical Region is $\alpha$.]

91 The z-test for Proportions
Testing the probability of success in a binomial experiment

92 Situation A success-failure experiment has been repeated n times
The probability of success p is unknown. We want to test H0: $p = p_0$ (some specified value of p) against HA: $p \ne p_0$.

93 The Data The success-failure experiment has been repeated n times
The number of successes x is observed, giving the sample proportion $\hat{p} = x/n$. If this proportion is close to $p_0$ the null hypothesis should be accepted; otherwise the null hypothesis should be rejected.

94 The Test Statistic
To decide to accept or reject the null hypothesis (H0) we will use the test statistic
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$
If H0 is true we should expect the test statistic z to be close to zero. If H0 is true we should expect the test statistic z to have a standard normal distribution. If HA is true we should expect the test statistic z to be different from zero.

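95 The sampling distribution of z when H0 is true: the Standard Normal distribution
[Figure: standard normal curve with the central region labelled Accept H0 and the two tails labelled Reject H0.]

96 The Acceptance region:
[Figure: standard normal curve with an area of $\alpha/2$ in each tail (Reject H0) and the Acceptance region between $-z_{\alpha/2}$ and $z_{\alpha/2}$ (Accept H0).]

97 Acceptance Region: Accept H0 if $-z_{\alpha/2} \le z \le z_{\alpha/2}$.
Critical Region: Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.
With this choice, P[type I error] = $\alpha$.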

98 Summary To Test for a binomial probability p
H0: $p = p_0$ (some specified value of p) against HA: $p \ne p_0$, we
1. Decide on $\alpha$ = P[Type I Error] = the significance level of the test (usual choices 0.05 or 0.01).

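99
2. Collect the data.
3. Compute the test statistic
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$
4. Make the Decision: Accept H0 if $-z_{\alpha/2} \le z \le z_{\alpha/2}$. Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.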

100 Example In the last provincial election the proportion of the voters who voted for the Liberal party was 0.08 (8 %) The party is interested in determining if that percentage has changed A sample of n = 800 voters are surveyed

101 We want to test H0: p = 0.08 (8%) against HA: $p \ne 0.08$ (8%).

102 Summary
Decide on $\alpha$ = P[Type I Error] = the significance level of the test. Choose $\alpha = 0.05$.
Collect the data. The number in the sample that support the Liberal party is x = 92.

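103 Compute the test statistic
$$\hat{p} = \frac{92}{800} = 0.115, \qquad z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.115 - 0.08}{\sqrt{(0.08)(0.92)/800}} \approx 3.65$$
Make the Decision: Accept H0 if $-1.960 \le z \le 1.960$. Reject H0 if $z < -1.960$ or $z > 1.960$.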

104 Since the test statistic is in the Critical region we decide to Reject H0.
Conclude that H0: p = 0.08 (8%) is false. There is a significant difference ($\alpha$ = 5%) between the proportion of voters supporting the Liberal party in this election and in the last election.
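The election example can be reproduced with a short script. This is a minimal sketch of the two-tailed z-test described in the slides; the function name and return convention are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_proportion_test(x: int, n: int, p0: float,
                               alpha: float = 0.05) -> tuple[float, float, bool]:
    """Test H0: p = p0 against HA: p != p0. Returns (z, z_{alpha/2}, reject H0?)."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1.0 - p0) / n)   # standard error uses p0 under H0
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z, z_crit, abs(z) > z_crit


if __name__ == "__main__":
    # Election example: x = 92 supporters among n = 800 voters, p0 = 0.08.
    z, z_crit, reject = two_tailed_proportion_test(92, 800, 0.08)
    decision = "reject H0" if reject else "accept H0"
    print(f"z = {z:.2f}, critical value = {z_crit:.3f}: {decision}")
```

Note that the standard error in the test uses the hypothesized $p_0$, whereas the confidence interval uses $\hat{p}$.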

105 The two-tailed z-test for Proportions
Testing the probability of success in a binomial experiment

106 Situation A success-failure experiment has been repeated n times
The probability of success p is unknown. We want to test H0: $p = p_0$ (some specified value of p) against HA: $p \ne p_0$.

107 The Test Statistic
To decide to accept or reject the null hypothesis (H0) we will use the test statistic
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$

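108 Acceptance Region: Accept H0 if $-z_{\alpha/2} \le z \le z_{\alpha/2}$.
Critical Region: Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.
With this choice, P[type I error] = $\alpha$.

109 The Acceptance region:
[Figure: standard normal curve with an area of $\alpha/2$ in each tail (Reject H0) and the Acceptance region between $-z_{\alpha/2}$ and $z_{\alpha/2}$ (Accept H0).]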

110 The one-tailed z-test
A success-failure experiment has been repeated n times. The probability of success p is unknown. We want to test H0: $p \le p_0$ (some specified value of p) against HA: $p > p_0$. The alternative hypothesis is in this case called a one-sided alternative.

111 The Test Statistic
To decide to accept or reject the null hypothesis (H0) we will use the test statistic
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$
If H0 is true we should expect the test statistic z to be close to zero or negative. If $p = p_0$ we should expect the test statistic z to have a standard normal distribution. If HA is true we should expect the test statistic z to be a positive number.

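112 The sampling distribution of z when $p = p_0$: the Standard Normal distribution
[Figure: standard normal curve with the upper tail labelled Reject H0 and the remainder labelled Accept H0.]

113 The Acceptance and Critical region:
[Figure: standard normal curve with an area of $\alpha$ in the upper tail beyond $z_\alpha$ (Reject H0) and the Acceptance region below $z_\alpha$ (Accept H0).]

114 Acceptance Region: Accept H0 if $z \le z_\alpha$.
Critical Region: Reject H0 if $z > z_\alpha$.
With this choice, P[type I error] = $\alpha$ when $p = p_0$. The Critical Region is called one-tailed.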

115 Example
A new surgical procedure is developed for correcting heart defects in infants before the age of one month. Previously the procedure was used on infants that were older than one month and the success rate was 91%. A study is conducted to determine if the success rate of the new procedure is greater than 91% (n = 200).

116 We want to test H0: $p \le 0.91$ against HA: $p > 0.91$.

117 Summary
Decide on $\alpha$ = P[Type I Error] = the significance level of the test. Choose $\alpha = 0.05$.
Collect the data. The number of successful operations in the sample of 200 cases is x = 187.

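118 Compute the test statistic
$$\hat{p} = \frac{187}{200} = 0.935, \qquad z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.935 - 0.91}{\sqrt{(0.91)(0.09)/200}} \approx 1.24$$
Make the Decision: Accept H0 if $z \le z_{0.05} = 1.645$. Reject H0 if $z > 1.645$.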

119 Since the test statistic is in the Acceptance region we decide to Accept H0.
Conclude that H0: $p \le 0.91$ is true. More precisely, H0 can't be rejected. There is no significant ($\alpha$ = 5%) increase in the success rate of the new procedure over the older procedure.
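The one-tailed version differs from the two-tailed test only in the critical value ($z_\alpha$ instead of $z_{\alpha/2}$) and in looking at the upper tail alone. A small sketch for the surgical example follows; the function name is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def upper_tailed_proportion_test(x: int, n: int, p0: float,
                                 alpha: float = 0.05) -> tuple[float, float, bool]:
    """Test H0: p <= p0 against HA: p > p0. Returns (z, z_alpha, reject H0?)."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1.0 - p0) / n)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)     # all of alpha in the upper tail
    return z, z_crit, z > z_crit


if __name__ == "__main__":
    # Surgical example: x = 187 successes in n = 200 operations, p0 = 0.91.
    z, z_crit, reject = upper_tailed_proportion_test(187, 200, 0.91)
    decision = "reject H0" if reject else "accept H0"
    print(f"z = {z:.2f}, critical value = {z_crit:.3f}: {decision}")
```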

120 Comments
When the decision is made to accept H0, one should not conclude that we have proven H0. This is because when setting up the test we have not controlled $\beta$ = P[type II error] = P[accepting H0 when H0 is FALSE]. Whenever H0 is accepted there is a possibility that a type II error has been made.

121 In the last example
The conclusion that there is no significant ($\alpha$ = 5%) increase in the success rate of the new procedure over the older procedure should be interpreted as: we have been unable to prove that the new procedure is better than the old procedure.

122 Some other comments: When does one use a two-tailed test? When does one use a one tailed test? Answer: This depends on the alternative hypothesis HA. Critical Region = values that indicate HA Thus if only the upper tail indicates HA, the test is one tailed. If both tails indicate HA, the test is two tailed.

123 Also: The alternative hypothesis HA usually corresponds to the research hypothesis (the hypothesis that the researcher is trying to prove). Examples:
The new procedure is better.
The drug is effective in reducing levels of cholesterol.
There has been a change in political opinion from the time the survey was taken till the present time (time of current survey).

124 The z-test for the Mean of a Normal Population
Let $\mu$ denote the mean of a normal population. We want to test a hypothesis about $\mu$.

125 Situation
A sample of n observations is collected from a Normal distribution. The mean of the Normal distribution, $\mu$, is unknown. We want to test H0: $\mu = \mu_0$ (some specified value of $\mu$) against HA: $\mu \ne \mu_0$.

126 The Data
Let $x_1, x_2, x_3, \ldots, x_n$ denote a sample from a normal population with mean $\mu$ and standard deviation $\sigma$. Let
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
We want to test if the mean, $\mu$, is equal to some given value $\mu_0$. If the sample mean is close to $\mu_0$ the null hypothesis should be accepted; otherwise the null hypothesis should be rejected.

127 The Test Statistic
To decide to accept or reject the null hypothesis (H0) we will use the test statistic
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
If H0 is true we should expect the test statistic z to be close to zero. If H0 is true we should expect the test statistic z to have a standard normal distribution. If HA is true we should expect the test statistic z to be different from zero.

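128 The sampling distribution of z when H0 is true: the Standard Normal distribution
[Figure: standard normal curve with the central region labelled Accept H0 and the two tails labelled Reject H0.]

129 The Acceptance region:
[Figure: standard normal curve with an area of $\alpha/2$ in each tail (Reject H0) and the Acceptance region between $-z_{\alpha/2}$ and $z_{\alpha/2}$ (Accept H0).]

130 Acceptance Region: Accept H0 if $-z_{\alpha/2} \le z \le z_{\alpha/2}$.
Critical Region: Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.
With this choice, P[type I error] = $\alpha$.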

131 Summary To Test for mean m, of a normal population
H0: $\mu = \mu_0$ (some specified value of $\mu$) against HA: $\mu \ne \mu_0$, we
1. Decide on $\alpha$ = P[Type I Error] = the significance level of the test (usual choices 0.05 or 0.01).

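132
2. Collect the data.
3. Compute the test statistic
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
4. Make the Decision: Accept H0 if $-z_{\alpha/2} \le z \le z_{\alpha/2}$. Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.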

133 Example
A manufacturer of glucosamine capsules claims that each capsule contains on average 500 mg of glucosamine. To test this claim n = 40 capsules were selected and the amount of glucosamine (X) was measured in each capsule. Summary statistics: [values not preserved]

134 We want to test:
H0: $\mu = 500$ (the manufacturer's claim is correct)
against
HA: $\mu \ne 500$ (the manufacturer's claim is not correct)

135 The Test Statistic
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = -2.75$$

136 The Critical Region and Acceptance Region
Using $\alpha = 0.05$, $z_{\alpha/2} = z_{0.025} = 1.960$.
We accept H0 if $-1.960 \le z \le 1.960$ and reject H0 if $z < -1.960$ or $z > 1.960$.

137 The Decision
Since z = -2.75 < -1.960, we reject H0.
Conclude: the manufacturer's claim is incorrect.
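The z-test for a mean can be sketched the same way. The summary statistics of the glucosamine example are not available in this transcript, so the values below ($\bar{x}$ = 497.0, s = 10.0) are hypothetical and will not reproduce the slide's z = -2.75; only n = 40 and $\mu_0$ = 500 come from the example. The function name is also an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_mean_test(x_bar: float, s: float, n: int, mu0: float,
                         alpha: float = 0.05) -> tuple[float, float, bool]:
    """Test H0: mu = mu0 against HA: mu != mu0. Returns (z, z_{alpha/2}, reject H0?)."""
    z = (x_bar - mu0) / (s / sqrt(n))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z, z_crit, abs(z) > z_crit


if __name__ == "__main__":
    # Hypothetical summary statistics (not the values from the slides).
    z, z_crit, reject = two_tailed_mean_test(x_bar=497.0, s=10.0, n=40, mu0=500.0)
    decision = "reject H0" if reject else "accept H0"
    print(f"z = {z:.2f}, critical value = {z_crit:.3f}: {decision}")
```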

138 Hypothesis Testing
A review of the concepts

139 In hypotheses testing there are two hypotheses
1. The Null Hypothesis (H0)
2. The Alternative Hypothesis (HA)
The alternative hypothesis is usually the research hypothesis: the hypothesis that the researcher is trying to prove. The null hypothesis is the hypothesis that the research hypothesis is not true.

140 A statistical Test is defined by
1. Choosing a statistic (called the test statistic).
2. Dividing the range of possible values for the test statistic into two parts: the Acceptance Region and the Critical Region.

141 To perform a statistical Test we
1. Collect the data.
2. Compute the value of the test statistic.
3. Make the Decision: if the value of the test statistic is in the Acceptance Region we decide to accept H0. If the value of the test statistic is in the Critical Region we decide to reject H0.

142 You can compare a statistical test to a meter
[Figure: a meter whose needle shows the value of the test statistic, with the Acceptance Region in the middle and a Critical Region at each end.]
The Critical Region is the red zone of the meter.

143 [Figure: the value of the test statistic lies in the Acceptance Region: Accept H0.]

144 [Figure: the value of the test statistic lies in a Critical Region: Reject H0.]

145 Sometimes the critical region is located on one side. These tests are called one-tailed tests.
[Figure: meter with the Acceptance Region on one side and the Critical Region on the other.]

146 Whether you use a one tailed test or a two tailed test depends on:
The hypotheses being tested (H0 and HA). The test statistic.

147 If only large positive values of the test statistic indicate HA then the critical region should be located in the positive tail (one-tailed test).
If only large negative values of the test statistic indicate HA then the critical region should be located in the negative tail (one-tailed test).
If both large positive and large negative values of the test statistic indicate HA then the critical region should be located in both the positive and negative tails (two-tailed test).

148 Usually one-tailed tests are appropriate if HA is one-sided.
Two-tailed tests are appropriate if HA is two-sided. But this is not always the case.

149 Once the test statistic is determined, to set up the critical region we have to find the sampling distribution of the test statistic when H0 is true. This describes the behaviour of the test statistic when H0 is true.

150 We then locate the critical region in the tails of the sampling distribution of the test statistic when H0 is true. The size of the critical region is chosen so that the area over the critical region is $\alpha$ (an area of $\alpha/2$ in each tail for a two-tailed test).

151 This ensures that P[type I error] = P[rejecting H0 when true] = $\alpha$.

152 To find P[type II error] = P[accepting H0 when false] = $\beta$, we need to find the sampling distribution of the test statistic when H0 is false.
[Figure: sampling distributions of the test statistic when H0 is true and when H0 is false, with the areas $\alpha/2$, $\alpha/2$ and $\beta$ marked.]

153 The p-value approach to Hypothesis Testing

154 In hypothesis testing we need
1. A test statistic
2. A Critical and Acceptance region for the test statistic
The Critical Region is set up under the sampling distribution of the test statistic, with area = $\alpha$ (0.05 or 0.01) above the critical region. The critical region may be one-tailed or two-tailed.

155 The Critical region:
[Figure: standard normal curve with an area of $\alpha/2$ in each tail (Reject H0) and the central region labelled Accept H0.]

156 The test is carried out by
1. Computing the value of the test statistic.
2. Making the decision: reject H0 if the value is in the Critical region and accept H0 if the value is in the Acceptance region.

157 The value of the test statistic may be in the Acceptance region but close to being in the Critical region, or it may be in the Critical region but close to being in the Acceptance region. To measure this we compute the p-value.

158 Definition: Once the test statistic has been computed from the data, the p-value is defined to be:
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic]
"More extreme" means giving stronger evidence for rejecting H0.

159 Example: Suppose we are using the z-test for the mean $\mu$ of a normal population and $\alpha = 0.05$.
Thus the critical region is to reject H0 if Z < -1.960 or Z > 1.960. Suppose z = 2.3; then we reject H0.
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic] = P[z > 2.3] + P[z < -2.3] = 0.0107 + 0.0107 = 0.0214

160 [Graph: standard normal curve; the p-value is the total area in the two tails beyond -2.3 and 2.3.]

161 If the value of z = 1.2, then we accept H0.
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic] = P[z > 1.2] + P[z < -1.2] = 0.1151 + 0.1151 = 0.2302
There is a 23.02% chance that the test statistic is as or more extreme than 1.2. This is fairly high, hence 1.2 is not very extreme.
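The two p-values above can be computed from the standard normal distribution function rather than read from a table. This is a minimal sketch; `two_tailed_p_value` is an illustrative name, and the results may differ from the tabled figures in the fourth decimal place because the table rounds each tail area first.

```python
from statistics import NormalDist


def two_tailed_p_value(z: float) -> float:
    """P[Z > |z|] + P[Z < -|z|] for a standard normal Z."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return 2.0 * upper_tail        # the two tails are equal by symmetry


if __name__ == "__main__":
    for z in (2.3, 1.2):
        print(f"z = {z}: p-value = {two_tailed_p_value(z):.4f}")
```

For a one-tailed test only one of the two tail areas would be reported, so the factor of 2 is dropped.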

162 [Graph: standard normal curve; the p-value is the total area in the two tails beyond -1.2 and 1.2.]

163 Properties of the p -value
1. If the p-value is small (< 0.05 or < 0.01) H0 should be rejected.
2. The p-value measures the plausibility of H0.
3. If the test is two-tailed the p-value should be two-tailed.
4. If the test is one-tailed the p-value should be one-tailed.
5. It is customary to report p-values when reporting the results. This gives the reader some idea of the strength of the evidence for rejecting H0.

164 Summary
A common way to report statistical tests is to compute the p-value.
If the p-value is small (< 0.05 or < 0.01) then H0 is rejected.
If the p-value is extremely small this gives a strong indication that HA is true.
If the p-value is marginally above the threshold 0.05 then we cannot reject H0, but there would be a suspicion that H0 is false.

165 Next topic: Student’s t - test

