By linear models, we mean that the hypothesis function \(h_{\bf w}({\bf x})\) is a linear function of the parameters \({\bf w}\).
Predictions are a linear combination of feature values:
\[h_{\bf w}({\mathbf{x}}) = \sum_{k=0}^{p} w_k \phi_k({\mathbf{x}}) = {\boldsymbol{\phi}}({\mathbf{x}})^{\mathsf{T}}{\mathbf{w}}\] where the \(\phi_k\) are called basis functions (or features). As usual, we let \(\phi_0({\mathbf{x}})=1\) for all \({\mathbf{x}}\), so that \(w_0\) acts as a bias term.
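As a minimal sketch of the formula above, the prediction can be computed by evaluating each basis function at the input and taking the weighted sum. The function name `predict`, the particular basis (constant, identity, sine), and the weight values are illustrative assumptions, not part of the notes.

```python
import math

def predict(w, basis, x):
    """Return h_w(x) = sum_k w_k * phi_k(x) for a single input x."""
    return sum(w_k * phi_k(x) for w_k, phi_k in zip(w, basis))

# Hypothetical basis for a scalar input; phi_0 is the constant bias feature.
basis = [lambda x: 1.0, lambda x: x, math.sin]
# Hypothetical weights, chosen only for illustration.
w = [0.5, 2.0, -1.0]

h_at_zero = predict(w, basis, 0.0)  # 0.5*1 + 2.0*0 - 1.0*sin(0)
```

Note that `math.sin` makes the prediction nonlinear in `x`, yet doubling every entry of `w` still doubles the output: the model stays linear in the parameters.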
To recover degree-\(d\) polynomial regression in one variable, set \[\phi_0(x) = 1,\quad \phi_1(x) = x,\quad \phi_2(x) = x^2,\quad \ldots,\quad \phi_d(x) = x^d.\]
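The polynomial basis can be sketched in a few lines. The helper names `poly_features` and `poly_predict` and the example weights are hypothetical, chosen only to illustrate the construction.

```python
def poly_features(x, d):
    """Return [phi_0(x), ..., phi_d(x)] = [1, x, x**2, ..., x**d]."""
    return [x ** k for k in range(d + 1)]

def poly_predict(w, x):
    """Evaluate h_w(x) = sum_k w_k * x**k, with degree d = len(w) - 1."""
    phi = poly_features(x, len(w) - 1)
    return sum(w_k * phi_k for w_k, phi_k in zip(w, phi))

# Hypothetical weights representing the polynomial 1 - 2x + 3x^2.
w = [1.0, -2.0, 3.0]
y = poly_predict(w, 2.0)  # 1 - 4 + 12
```

Here the degree is implied by the length of `w`, which keeps the feature vector and the weight vector the same size, \(d+1\).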
The basis functions are fixed for a given analysis; only the weights \({\bf w}\) are learned from data.