The purpose of this article is to illustrate a hypothesis-testing method for dependent samples that can be used to analyse how the relationship between two variables differs before and after a treatment. In other words, the method tests whether the treatment has significantly changed the relationship between the sample variables.
This question was posed by Youshi on March 17, 2018. Imagine we have a group of subjects with two variables, named Y and X. After we run a controlled experiment that introduces a treatment, we have two samples, named samples-before and samples-after. The relevant information can be laid out as follows.
Sample before treatment:

Dependent Variable | Independent Variable |
---|---|
Y_{b1} | X_{b1} |
Y_{b2} | X_{b2} |
… … | … … |
Y_{bn} | X_{bn} |

Sample after treatment:

Dependent Variable | Independent Variable |
---|---|
Y_{a1} | X_{a1} |
Y_{a2} | X_{a2} |
… … | … … |
Y_{an} | X_{an} |
The first step is to perform a regression analysis for each of the two sample groups separately, which produces four possible scenarios.
Scenario 1:

Dependent Variable | Independent Variable | Relationship |
---|---|---|
Y_{b} | X_{b} | Strong Correlation |
Y_{a} | X_{a} | Strong Correlation |

Scenario 2:

Dependent Variable | Independent Variable | Relationship |
---|---|---|
Y_{b} | X_{b} | Not Significant |
Y_{a} | X_{a} | Not Significant |

Scenario 3:

Dependent Variable | Independent Variable | Relationship |
---|---|---|
Y_{b} | X_{b} | Not Significant |
Y_{a} | X_{a} | Strong Correlation |

Scenario 4:

Dependent Variable | Independent Variable | Relationship |
---|---|---|
Y_{b} | X_{b} | Strong Correlation |
Y_{a} | X_{a} | Not Significant |
It is not hard to see that further analysis is meaningful only under scenario 1. Under scenario 2 there is no significant relationship in either sample, so there is nothing to compare. For scenarios 3 and 4, the regression analysis already gives the answer: the relationship between the dependent and independent variables was changed by the treatment, either from a strong correlation to a non-significant one or the opposite way.
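Scenario 1 can be detected programmatically by fitting a separate regression to each sample and checking slope significance. A minimal Python sketch, assuming a 0.05 significance cutoff (the function names `slope_significant` and `classify_scenario` are illustrative, not from the article):

```python
import numpy as np
from scipy import stats

def slope_significant(x, y, level=0.05):
    # simple linear regression; "significant" if the slope's p-value is small
    return stats.linregress(x, y).pvalue < level

def classify_scenario(xb, yb, xa, ya):
    before = slope_significant(xb, yb)
    after = slope_significant(xa, ya)
    if before and after:
        return 1  # both significant: continue with the dependent-sample test
    if not before and not after:
        return 2  # neither significant: nothing to compare
    if not before and after:
        return 3  # relationship appeared after treatment
    return 4      # relationship disappeared after treatment
```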
The next step is to design a statistic that represents the relationship between Y and X. Assume Y and X are linearly correlated:
\[ Y=\alpha\cdot X+\beta \]
where \(\alpha\) is the slope and \(\beta\) is the intercept. The means of Y and X are expected to satisfy the same linear relation:
\[ \bar{Y}=\alpha\cdot\bar{X}+\beta \]
Subtracting the two equations gives
\[ \begin{aligned} Y-\bar{Y} &=(\alpha\cdot X + \beta) - (\alpha\cdot\bar{X}+\beta) \\ &= \alpha(X-\bar{X})+(\beta-\beta) \\ &= \alpha(X-\bar{X}) \end{aligned} \]
So we obtain the statistic \(\alpha\):
\[ \alpha = \frac{Y-\bar{Y}}{X-\bar{X}} \]
We can use this linear transformation to obtain the statistic \(\alpha\) for each observation in both the before and the after sample.
Before | After |
---|---|
\(\alpha_{b1} = \frac{Y_{b1}-\bar{Y_{b}}}{X_{b1}-\bar{X_{b}}}\) | \(\alpha_{a1} = \frac{Y_{a1}-\bar{Y_{a}}}{X_{a1}-\bar{X_{a}}}\) |
\(\alpha_{b2} = \frac{Y_{b2}-\bar{Y_{b}}}{X_{b2}-\bar{X_{b}}}\) | \(\alpha_{a2} = \frac{Y_{a2}-\bar{Y_{a}}}{X_{a2}-\bar{X_{a}}}\) |
… … | … … |
\(\alpha_{bn} = \frac{Y_{bn}-\bar{Y_{b}}}{X_{bn}-\bar{X_{b}}}\) | \(\alpha_{an} = \frac{Y_{an}-\bar{Y_{a}}}{X_{an}-\bar{X_{a}}}\) |
Calculate the sample means, the sample standard deviation of the paired differences, and the standard error:
\[ \bar{\alpha_{b}} = \frac{\sum\limits_{i=1}^{n}\alpha_{bi}}{n},\;\;\;\;\;\; \bar{\alpha_{a}} = \frac{\sum\limits_{i=1}^{n}\alpha_{ai}}{n},\;\;\;\;\;\; S_\alpha=\sqrt{\frac{\sum\limits_{i=1}^{n}\bigl((\alpha_{bi}-\alpha_{ai})-(\bar{\alpha_{b}}-\bar{\alpha_{a}})\bigr)^{2}}{n-1}},\;\;\;\;\;\; \sigma_\alpha=\frac{S_\alpha}{\sqrt{n}} \]
Calculate the t statistic:
\[ \begin{aligned} t_\alpha=\frac{\bar{\alpha_b}-\bar{\alpha_a}}{\sigma_\alpha} \end{aligned} \]
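The two steps above can be sketched in Python (a minimal illustration; `alpha_stat`, `paired_t`, and the simulated data are assumptions, not code from the article):

```python
import numpy as np

def alpha_stat(x, y):
    # per-observation slope statistic alpha_i = (Y_i - Ybar) / (X_i - Xbar)
    return (y - y.mean()) / (x - x.mean())

def paired_t(alpha_before, alpha_after):
    # paired t statistic: mean of the differences over its standard error
    d = alpha_before - alpha_after
    s = d.std(ddof=1)            # sample standard deviation of the differences
    return d.mean() / (s / np.sqrt(len(d)))

# illustrative data: slope 2 before treatment, slope 2.5 after
rng = np.random.default_rng(1)
xb = rng.uniform(0, 10, size=50); yb = 2.0 * xb - 11 + rng.normal(size=50)
xa = rng.uniform(0, 10, size=50); ya = 2.5 * xa + 30 + rng.normal(size=50)
t_alpha = paired_t(alpha_stat(xb, yb), alpha_stat(xa, ya))
```

Note that \(\alpha_i\) blows up for observations with \(X_i\) close to \(\bar{X}\), which is a practical weakness of this statistic worth keeping in mind.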
If \(t_\alpha\) is significant, we can reject the null hypothesis and accept the alternative hypothesis that the treatment has a significant influence on the relationship between the dependent variable Y and the independent variable X. But if \(t_\alpha\) is not significant, we still need to test the intercept \(\beta\) before drawing a conclusion.
\[ \begin{aligned} &Y=\alpha\cdot X+\beta\\ &\beta = Y - \alpha\cdot X\\ \end{aligned} \]
Calculate the statistic \(\beta\) for each observation:
Before | After |
---|---|
\(\beta_{b1} = {Y_{b1}-\alpha_{b}\cdot X_{b1}}\) | \(\beta_{a1} = {Y_{a1}-\alpha_{a}\cdot X_{a1}}\) |
\(\beta_{b2} = {Y_{b2}-\alpha_{b}\cdot X_{b2}}\) | \(\beta_{a2} = {Y_{a2}-\alpha_{a}\cdot X_{a2}}\) |
… … | … … |
\(\beta_{bn} = {Y_{bn}-\alpha_{b}\cdot X_{bn}}\) | \(\beta_{an} = {Y_{an}-\alpha_{a}\cdot X_{an}}\) |
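For completeness, \(t_\beta\) is formed exactly as \(t_\alpha\) was: take the paired differences of the \(\beta\) statistics, then divide the mean difference by its standard error.

\[ \bar{\beta_{b}} = \frac{\sum\limits_{i=1}^{n}\beta_{bi}}{n},\;\;\;\;\;\; \bar{\beta_{a}} = \frac{\sum\limits_{i=1}^{n}\beta_{ai}}{n},\;\;\;\;\;\; S_\beta=\sqrt{\frac{\sum\limits_{i=1}^{n}\bigl((\beta_{bi}-\beta_{ai})-(\bar{\beta_{b}}-\bar{\beta_{a}})\bigr)^{2}}{n-1}},\;\;\;\;\;\; \sigma_\beta=\frac{S_\beta}{\sqrt{n}},\;\;\;\;\;\; t_\beta=\frac{\bar{\beta_b}-\bar{\beta_a}}{\sigma_\beta} \]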
As with the previous test for \(\alpha\): if \(t_\beta\) is significant, we can reject the null hypothesis and accept the alternative hypothesis that the treatment has a significant influence on the relationship between the dependent variable Y and the independent variable X. But if both \(t_\alpha\) and \(t_\beta\) are not significant, we have a solid reason to retain the null hypothesis.
The first step is to produce two random data sets, x and C, each of size 50.
Second, construct a function named rmplot with four input parameters: slope, intercept, residual coefficient, and an indicator for before or after treatment, e.g. rmplot(slope, intercept, res, t). The output is a scatter plot.
Third, build a function named ttestplot to perform the t-test. The output is the t-test result.
```r
# Case 1: small slope difference (2 vs 2.5)
x1 <- rmplot(2, -11, 1, 0)
x2 <- rmplot(2.5, 30, 1, 1)
ttestplot()
```

```r
# Case 2: moderate slope difference (2 vs 3.5)
x1 <- rmplot(2, -11, 1, 0)
x2 <- rmplot(3.5, 30, 1, 1)
ttestplot()
```

```r
# Case 3: large slope difference (2 vs 5)
x1 <- rmplot(2, -11, 1, 0)
x2 <- rmplot(5, 30, 1, 1)
ttestplot()
```
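The runs above call rmplot and ttestplot, which are described but not listed in the article. As a hedged re-creation of the same simulation in Python (`simulate_sample` and `alpha_t` are assumptions; the seeds and the uniform range of x are arbitrary choices):

```python
import numpy as np

def simulate_sample(slope, intercept, res, n=50, seed=0):
    # generate one sample from a known line plus Gaussian residuals
    rng = np.random.default_rng(seed)
    x = rng.uniform(0, 10, size=n)
    y = slope * x + intercept + res * rng.normal(size=n)
    return x, y

def alpha_t(xb, yb, xa, ya):
    # paired t statistic on the per-observation alpha statistic
    ab = (yb - yb.mean()) / (xb - xb.mean())
    aa = (ya - ya.mean()) / (xa - xa.mean())
    d = ab - aa
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

# mirror the three runs: before slope 2; after slope 2.5, 3.5, then 5
for after_slope in (2.5, 3.5, 5.0):
    xb, yb = simulate_sample(2, -11, 1, seed=0)
    xa, ya = simulate_sample(after_slope, 30, 1, seed=1)
    print(after_slope, alpha_t(xb, yb, xa, ya))
```

With a larger slope difference the test should reject more readily, which is presumably what the three ttestplot() runs were meant to show.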