Key idea: Minimize \[
J(\mathbf{\boldsymbol{\alpha}})=
\frac{1}{2}\,\sum_{i=0}^{n} {\lvert {\widehat u(x_i) - u_i} \rvert}^2
= \frac{1}{2}\,\sum_{i=0}^{n} \left( \sum_{j=0}^{m} \alpha_j \varphi_j(x_i) - u_i\right)^2
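=\frac{1}{2}\,{\lVert {\mathsf A \mathbf{\boldsymbol{\alpha}}-\mathbf{\boldsymbol{b}}} \rVert}^2,
\] where, comparing the last two expressions, \(\mathsf A\) is the \((n+1) \times (m+1)\) matrix with entries \(\mathsf A_{ij} = \varphi_j(x_i)\) and \(\mathbf{\boldsymbol{b}} = (u_0, \dotsc, u_n)^T\).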
To this end, we seek \(\mathbf{\boldsymbol{\alpha}}\) such that \[
\nabla J(\mathbf{\boldsymbol{\alpha}})=\frac{1}{2}\,\nabla \Bigl( (\mathsf A \mathbf{\boldsymbol{\alpha }}- \mathbf{\boldsymbol{b}})^T(\mathsf A \mathbf{\boldsymbol{\alpha }}- \mathbf{\boldsymbol{b}}) \Bigr)=\mathbf{\boldsymbol{0}},
\] which leads to the equation \[
\nabla J(\mathbf{\boldsymbol{\alpha}})=\mathsf A^T(\mathsf A \mathbf{\boldsymbol{\alpha }}- \mathbf{\boldsymbol{b}})
=\mathbf{\boldsymbol{0}}.
\] Rearranging, we obtain the following linear system, known as the normal equations: \[
\mathsf A^T\mathsf A \mathbf{\boldsymbol{\alpha }}= \mathsf A^T\mathbf{\boldsymbol{b}}.
\]
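
As an illustration, the following is a minimal sketch of how the normal equations could be assembled and solved numerically. It assumes NumPy; the function name `fit_least_squares`, the monomial basis and the synthetic data are hypothetical choices made for this example only and do not come from the text above.

```python
import numpy as np

def fit_least_squares(x, u, basis):
    """Return (alpha, A, b), where alpha minimizes 0.5 * ||A alpha - b||^2."""
    # Column j of A holds phi_j evaluated at all nodes, so A[i, j] = phi_j(x_i).
    A = np.column_stack([phi(x) for phi in basis])
    b = np.asarray(u, dtype=float)
    # Solve the normal equations A^T A alpha = A^T b.
    alpha = np.linalg.solve(A.T @ A, A.T @ b)
    return alpha, A, b

# Hypothetical basis (assumption): monomials 1, x, x^2.
basis = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]

# Synthetic data (assumption): samples of a known quadratic on [0, 1].
x = np.linspace(0.0, 1.0, 11)
u = 1.0 + 2.0 * x - 3.0 * x**2

alpha, A, b = fit_least_squares(x, u, basis)

# The gradient A^T (A alpha - b) should vanish at the minimizer, up to rounding.
gradient = A.T @ (A @ alpha - b)
print("alpha =", alpha)
print("gradient norm =", np.linalg.norm(gradient))
```

Since the data here lie exactly in the span of the basis functions, the computed coefficients should reproduce those of the quadratic up to rounding errors. Note that the condition number of \(\mathsf A^T \mathsf A\) is the square of that of \(\mathsf A\), so for ill-conditioned problems a QR-based solver such as `np.linalg.lstsq` is usually a more robust choice than forming the normal equations explicitly.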