[Multiple Regression, Least Squares: Intermediate Calculations] ① Deriving the least-squares estimator using matrix notation; ② Deriving the least-squares estimator using deviation sums of squares (variation) and cross products (covariation)

Preliminaries

Notation

Observed and fitted values

  • Observed value: \( y_i = \beta_0 + \beta_1x_{i1} + \beta_2x_{i2} + \cdots + \beta_kx_{ik} + u_i \)
  • Fitted value: \( \widehat{y_i} = \widehat{\beta_0} + \widehat{\beta_1}x_{i1} + \widehat{\beta_2}x_{i2} + \cdots + \widehat{\beta_k}x_{ik} \)
    (where \( u_i \) is the error term)

Residuals and the residual sum of squares (RSS)

  • Residual: \( \widehat{u_i} = y_i - \widehat{y_i} = y_i - (\widehat{\beta_0} + \widehat{\beta_1}x_{i1} + \widehat{\beta_2}x_{i2} + \cdots + \widehat{\beta_k}x_{ik}) \)
  • Residual sum of squares: \[ RSS = \sum_{i=1}^n \widehat{u_i}^2  = \sum_{i=1}^n (y_i -  (\widehat{\beta_0} + \widehat{\beta_1}x_{i1} + \widehat{\beta_2}x_{i2}  + \cdots + \widehat{\beta_k}x_{ik}))^2 \] (a short numerical sketch follows this list)
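
As a quick numerical illustration (a minimal sketch; the data and the trial coefficient values below are made up for illustration and are not part of the derivation), the residuals and RSS can be computed directly from these definitions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: n = 50 observations, k = 2 regressors plus an intercept column.
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # rows (1, x_i1, x_i2)
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)           # y_i = β0 + β1 x_i1 + β2 x_i2 + u_i

beta_hat = np.array([0.9, 2.1, -0.4])   # an arbitrary trial value of (β̂0, β̂1, β̂2)
y_hat = X @ beta_hat                    # fitted values ŷ_i
u_hat = y - y_hat                       # residuals û_i = y_i - ŷ_i
rss = np.sum(u_hat**2)                  # RSS = Σ û_i²
print(rss)
```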

① Deriving the least-squares estimator using matrix notation

Target of the derivation: \( \widehat{\beta} = (X^TX)^{-1}X^Ty \), where the design matrix \( X \), the coefficient vector \( \widehat{\beta} \), and the response vector \( y \) are defined as follows.

\( X = \begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{i1} & x_{i2} & \cdots & x_{ik} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk} \\
\end{pmatrix} \)

\( \widehat{\beta} = \begin{pmatrix}
\widehat{\beta_0} \\
\widehat{\beta_1} \\
\vdots \\
\widehat{\beta_k} \\
\end{pmatrix} \)

\( y = \begin{pmatrix}
y_1 \\
y_2 \\
\vdots \\
y_n \\
\end{pmatrix} \)

Writing the residual vector as \( \widehat{u} = y - X\widehat{\beta} \), the residual sum of squares is
\begin{align}
RSS &= \widehat{u}^T\widehat{u} \\
&= (y - X \widehat{\beta})^T(y -  X \widehat{\beta}) \\
&= (y^T - \widehat{\beta}^T X^T)(y - X\widehat{\beta}) \\
&= y^Ty - y^TX\widehat{\beta} - \widehat{\beta}^TX^Ty + \widehat{\beta}^TX^TX\widehat{\beta}
\end{align}

Since \( y^TX\widehat{\beta} \) is a scalar, it is unchanged by transposition: \( y^TX\widehat{\beta} = \widehat{\beta}^TX^Ty \).

Therefore, \[ RSS = y^Ty - 2\widehat{\beta}^TX^Ty + \widehat{\beta}^TX^TX\widehat{\beta} \]
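
The differentiation in the next step uses the standard vector-derivative rules (stated here as an aid; for a constant vector \( a \) and a symmetric matrix \( A \)):
\[ \frac{\partial}{\partial \widehat{\beta}}\left( \widehat{\beta}^Ta \right) = a, \qquad \frac{\partial}{\partial \widehat{\beta}}\left( \widehat{\beta}^TA\widehat{\beta} \right) = 2A\widehat{\beta}, \]
applied with \( a = X^Ty \) and \( A = X^TX \), which is symmetric.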

Differentiating with respect to \( \widehat{\beta} \) and setting the derivative to zero, \[ \frac{\partial{RSS}}{\partial{\widehat{\beta}}} = 0, \] we obtain
\begin{align}
-2X^Ty + 2X^TX\widehat{\beta} = 0
\end{align}

Therefore \( X^TX\widehat{\beta} = X^Ty \), and, provided \( X^TX \) is invertible,
\[ \widehat{\beta} = (X^TX)^{-1}X^Ty \]
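
The closed-form result can be verified numerically. The following is a minimal sketch (the data, sample size, and coefficient values are made up for illustration) that compares \( (X^TX)^{-1}X^Ty \) with the solution returned by numpy's least-squares routine:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: n = 100 observations, an intercept column plus k = 3 regressors.
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5, 0.3]) + rng.normal(scale=0.2, size=n)

# β̂ = (XᵀX)⁻¹Xᵀy, computed via np.linalg.solve to avoid forming the inverse explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Reference solution from numpy's built-in least-squares solver.
beta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(beta_hat, beta_ref))  # expected: True
```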

② Deriving the least-squares estimator using deviation sums of squares (variation) and cross products (covariation)

Target of the derivation: \( \widehat{\beta_0}, \ \widehat{\beta_1}, \ \widehat{\beta_2} \) in the case of two explanatory variables:
\( \widehat{\beta_0} = \overline{y} - \widehat{\beta_1}\overline{X_1} - \widehat{\beta_2}\overline{X_2} \)
\( \widehat{\beta_1} = \frac{S_{22}S_{1y} - S_{12}S_{2y}}{S_{11}S_{22} - S_{12}^2} \)
\( \widehat{\beta_2} = \frac{S_{11}S_{2y} - S_{12}S_{1y}}{S_{11}S_{22} - S_{12}^2} \)

With two explanatory variables, the observed value, fitted value, and RSS are as follows.

  • Observed value: \( y_i = \beta_0 + \beta_1x_{i1} + \beta_2x_{i2} + u_i \)
  • Fitted value: \( \widehat{y_i} = \widehat{\beta_0} + \widehat{\beta_1}x_{i1} + \widehat{\beta_2}x_{i2} \)
    (where \( u_i \) is the error term)
  • RSS: \[ RSS = \sum_{i=1}^n \widehat{u_i}^2  = \sum_{i=1}^n (y_i -  (\widehat{\beta_0} + \widehat{\beta_1}x_{i1} + \widehat{\beta_2}x_{i2}))^2 \]

We now derive the least-squares estimators \( \widehat{\beta_0}, \ \widehat{\beta_1}, \ \widehat{\beta_2} \).

From \( \frac{\partial{RSS}}{\partial{\widehat{\beta_0}}} = 0 \),
\begin{align}
-2\sum_{i=1}^n(y_i - (\widehat{\beta_0} + \widehat{\beta_1}x_{i1} + \widehat{\beta_2}x_{i2})) &= 0 \\
\sum_{i=1}^n(y_i - \widehat{\beta_0} - \widehat{\beta_1}x_{i1} - \widehat{\beta_2}x_{i2}) &= 0 \tag{1}
\end{align}

From \( \frac{\partial{RSS}}{\partial{\widehat{\beta_1}}} = 0 \),
\begin{align}
-2\sum_{i=1}^nx_{i1}(y_i - (\widehat{\beta_0} + \widehat{\beta_1}x_{i1} + \widehat{\beta_2}x_{i2})) &= 0 \\
\sum_{i=1}^nx_{i1}(y_i - \widehat{\beta_0} - \widehat{\beta_1}x_{i1} - \widehat{\beta_2}x_{i2}) &= 0 \tag{2}
\end{align}

From \( \frac{\partial{RSS}}{\partial{\widehat{\beta_2}}} = 0 \),
\begin{align}
-2\sum_{i=1}^nx_{i2}(y_i - (\widehat{\beta_0} + \widehat{\beta_1}x_{i1} + \widehat{\beta_2}x_{i2})) &= 0 \\
\sum_{i=1}^nx_{i2}(y_i - \widehat{\beta_0} - \widehat{\beta_1}x_{i1} - \widehat{\beta_2}x_{i2}) &= 0 \tag{3}
\end{align}

From (1),
\begin{align}
\sum_{i=1}^ny_i - \sum_{i=1}^n\widehat{\beta_0} - \sum_{i=1}^n\widehat{\beta_1}x_{i1} - \sum_{i=1}^n\widehat{\beta_2}x_{i2} &= 0\\
n\widehat{\beta_0} + \widehat{\beta_1}\sum_{i=1}^nx_{i1} + \widehat{\beta_2}\sum_{i=1}^nx_{i2} &= \sum_{i=1}^n y_i \tag{4}
\end{align}

From (2),
\begin{align}
\sum_{i=1}^nx_{i1}y_i - \sum_{i=1}^n\widehat{\beta_0}x_{i1} - \sum_{i=1}^n\widehat{\beta_1}(x_{i1})^2 - \sum_{i=1}^n\widehat{\beta_2}x_{i1}x_{i2} &= 0\\
\widehat{\beta_0}\sum_{i=1}^nx_{i1} + \widehat{\beta_1}\sum_{i=1}^n(x_{i1})^2 + \widehat{\beta_2}\sum_{i=1}^nx_{i1}x_{i2} &= \sum_{i=1}^n x_{i1}y_i \tag{5}
\end{align}

From (3),
\begin{align}
\sum_{i=1}^nx_{i2}y_i - \sum_{i=1}^n\widehat{\beta_0}x_{i2} - \sum_{i=1}^n\widehat{\beta_1}x_{i1}x_{i2} - \sum_{i=1}^n\widehat{\beta_2}(x_{i2})^2 &= 0\\
\widehat{\beta_0}\sum_{i=1}^nx_{i2} + \widehat{\beta_1}\sum_{i=1}^nx_{i1}x_{i2} + \widehat{\beta_2}\sum_{i=1}^n(x_{i2})^2 &= \sum_{i=1}^n x_{i2}y_i \tag{6}
\end{align}
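
Equations (1)-(3), and hence (4)-(6), say that at the least-squares solution the residuals sum to zero and are orthogonal to each regressor. A minimal numerical check (made-up two-regressor data; the coefficients are computed with the matrix formula from section ①):

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up two-regressor data set.
n = 80
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=n)

# Least-squares coefficients (β̂0, β̂1, β̂2) via the matrix formula.
X = np.column_stack([np.ones(n), x1, x2])
b0, b1, b2 = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals û_i = y_i - (β̂0 + β̂1 x_i1 + β̂2 x_i2).
u_hat = y - (b0 + b1 * x1 + b2 * x2)

# First-order conditions (1), (2), (3): each sum should be numerically zero.
print(np.isclose(np.sum(u_hat), 0.0))         # Σ û_i      ≈ 0
print(np.isclose(np.sum(x1 * u_hat), 0.0))    # Σ x_i1 û_i ≈ 0
print(np.isclose(np.sum(x2 * u_hat), 0.0))    # Σ x_i2 û_i ≈ 0
```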

Here, since \( \overline{X_1} = \frac{1}{n}\sum_{i=1}^nx_{i1} \), we have \( n\overline{X_1} = \sum_{i=1}^nx_{i1} \), and similarly \( n\overline{X_2} = \sum_{i=1}^nx_{i2} \) and \( n\overline{y} = \sum_{i=1}^ny_i \).

From (4),
\begin{align}
n\widehat{\beta_0} + n\widehat{\beta_1}\overline{X_1} + n\widehat{\beta_2}\overline{X_2} &= n\overline{y} \\
\end{align}

Therefore,
\begin{align}
\widehat{\beta_0} &= \overline{y} - \widehat{\beta_1}\overline{X_1} - \widehat{\beta_2}\overline{X_2} \tag{7}
\end{align}

Substituting (7) into (5) and rearranging,
\begin{align}
(\overline{y} - \widehat{\beta_1}\overline{X_{1}} - \widehat{\beta_2}\overline{X_{2}})\sum_{i=1}^nx_{i1} + \widehat{\beta_1}\sum_{i=1}^n(x_{i1})^2 + \widehat{\beta_2}\sum_{i=1}^nx_{i1}x_{i2} &= \sum_{i=1}^n x_{i1}y_i \\
\widehat{\beta_1}(\sum_{i=1}^n(x_{i1})^2 - \overline{X_1}\sum_{i=1}^nx_{i1}) + \widehat{\beta_2}(\sum_{i=1}^nx_{i1}x_{i2} - \overline{X_2}\sum_{i=1}^nx_{i1}) + n\overline{X_1}\overline{y} &= \sum_{i=1}^nx_{i1}y_i \\
\widehat{\beta_1}(\sum_{i=1}^n(x_{i1})^2 - n(\overline{X_1})^2) + \widehat{\beta_2}(\sum_{i=1}^nx_{i1}x_{i2} - n\overline{X_1}\overline{X_2}) &= \sum_{i=1}^nx_{i1}y_i - n\overline{X_1}\overline{y} \tag{8}
\end{align}

Substituting (7) into (6) and rearranging,
\begin{align}
(\overline{y} - \widehat{\beta_1}\overline{X_{1}} - \widehat{\beta_2}\overline{X_{2}})\sum_{i=1}^n x_{i2} + \widehat{\beta_1}\sum_{i=1}^n x_{i1}x_{i2} + \widehat{\beta_2}\sum_{i=1}^n (x_{i2})^2 &= \sum_{i=1}^n x_{i2}y_i \\
\widehat{\beta_1}(\sum_{i=1}^n x_{i1}x_{i2} - \overline{X_1}\sum_{i=1}^nx_{i2}) + \widehat{\beta_2}(\sum_{i=1}^n(x_{i2})^2 - \overline{X_2}\sum_{i=1}^n x_{i2}) + n\overline{X_2}\overline{y} &= \sum_{i=1}^nx_{i2}y_i \\
\widehat{\beta_1}(\sum_{i=1}^n x_{i1}x_{i2} - n\overline{X_1}\overline{X_2}) + \widehat{\beta_2}(\sum_{i=1}^n(x_{i2})^2 - n(\overline{X_2})^2) &= \sum_{i=1}^n x_{i2}y_i - n\overline{X_2}\overline{y} \tag{9}
\end{align}

Here, define the following quantities.
\begin{alignat}{3}
S_{11} &= \sum_{i=1}^n (x_{i1})^2 - n(\overline{X_1})^2 &\qquad S_{22} &= \sum_{i=1}^n (x_{i2})^2 - n(\overline{X_2})^2\\
S_{1y} &= \sum_{i=1}^n x_{i1}y_{i} - n\overline{X_1}\overline{y} & S_{2y} &= \sum_{i=1}^n x_{i2}y_{i} - n\overline{X_2}\overline{y} \\
S_{12} &= \sum_{i=1}^n x_{i1}x_{i2} - n\overline{X_1}\overline{X_2}
\end{alignat}
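
These are the deviation sums of squares (variation) and deviation sums of products (covariation) referred to in the section title, since, for example,
\begin{align}
\sum_{i=1}^n (x_{i1} - \overline{X_1})^2 &= \sum_{i=1}^n (x_{i1})^2 - 2\overline{X_1}\sum_{i=1}^n x_{i1} + n(\overline{X_1})^2 \\
&= \sum_{i=1}^n (x_{i1})^2 - n(\overline{X_1})^2 = S_{11},
\end{align}
and likewise \( S_{12} = \sum_{i=1}^n (x_{i1} - \overline{X_1})(x_{i2} - \overline{X_2}) \) and \( S_{1y} = \sum_{i=1}^n (x_{i1} - \overline{X_1})(y_i - \overline{y}) \).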

Using this notation,
from (8): \( \widehat{\beta_1}S_{11} + \widehat{\beta_2}S_{12} = S_{1y} \ \ \ \ \)(10)
from (9): \( \widehat{\beta_1}S_{12} + \widehat{\beta_2}S_{22} = S_{2y} \ \ \ \ \)(11)
We now solve the simultaneous equations (10) and (11).

From (10),
\begin{align}
\widehat{\beta_2}S_{12} &= S_{1y} - \widehat{\beta_1}S_{11} \\
\widehat{\beta_2} &= \frac{S_{1y} - \widehat{\beta_1}S_{11}}{S_{12}} \tag{12}
\end{align}

Substituting (12) into (11),
\begin{align}
\widehat{\beta_1}S_{12} + \frac{S_{22}(S_{1y} - \widehat{\beta_1}S_{11})}{S_{12}} &= S_{2y} \\
\widehat{\beta_1}S_{12}^2 + S_{22}S_{1y} - \widehat{\beta_1}S_{11}S_{22} &= S_{12}S_{2y} \\
\widehat{\beta_1}(S_{12}^2 - S_{11}S_{22}) &= S_{12}S_{2y} - S_{22}S_{1y}
\end{align}

Therefore,
\[ \widehat{\beta_1} = \frac{S_{22}S_{1y} - S_{12}S_{2y}}{S_{11}S_{22} - S_{12}^2} \]

Similarly, from (11),
\begin{align}
\widehat{\beta_1}S_{12} &= S_{2y} - \widehat{\beta_2}S_{22} \\
\widehat{\beta_1} &= \frac{S_{2y} - \widehat{\beta_2}S_{22}}{S_{12}} \tag{13}
\end{align}

Substituting (13) into (10),
\begin{align}
\frac{S_{11}(S_{2y} - \widehat{\beta_2}S_{22})}{S_{12}} + \widehat{\beta_2}S_{12} &= S_{1y} \\
S_{11}S_{2y} - \widehat{\beta_2}S_{11}S_{22} + \widehat{\beta_2}S_{12}^2 &= S_{12}S_{1y} \\
\widehat{\beta_2}(S_{12}^2 - S_{11}S_{22}) &= S_{12}S_{1y} - S_{11}S_{2y}
\end{align}

Therefore,
\[ \widehat{\beta_2} = \frac{S_{11}S_{2y} - S_{12}S_{1y}}{S_{11}S_{22} - S_{12}^2} \]
which completes the derivation of the formulas stated at the beginning of this section.
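
As a final check, the sketch below (again with made-up data) computes \( \widehat{\beta_0}, \widehat{\beta_1}, \widehat{\beta_2} \) from these S-based formulas and confirms that they agree with the matrix expression \( (X^TX)^{-1}X^Ty \) from section ①:

```python
import numpy as np

rng = np.random.default_rng(3)

# Made-up two-regressor data set.
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=n)

x1_bar, x2_bar, y_bar = x1.mean(), x2.mean(), y.mean()

# Deviation sums of squares and cross products, as defined above.
S11 = np.sum(x1**2) - n * x1_bar**2
S22 = np.sum(x2**2) - n * x2_bar**2
S12 = np.sum(x1 * x2) - n * x1_bar * x2_bar
S1y = np.sum(x1 * y) - n * x1_bar * y_bar
S2y = np.sum(x2 * y) - n * x2_bar * y_bar

# Formulas derived in section ②.
b1 = (S22 * S1y - S12 * S2y) / (S11 * S22 - S12**2)
b2 = (S11 * S2y - S12 * S1y) / (S11 * S22 - S12**2)
b0 = y_bar - b1 * x1_bar - b2 * x2_bar

# Matrix formula from section ① for comparison.
X = np.column_stack([np.ones(n), x1, x2])
beta_mat = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose([b0, b1, b2], beta_mat))  # expected: True
```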
