Commit

Merge pull request #88 from catcooc/master
Revise 4.4.1, 4.4.3, 4.4.4, 4.5.3, 4.5.4
KMnO4-zx authored Mar 6, 2024
2 parents 0349dcb + be24928 commit 994b468
Showing 1 changed file with 82 additions and 5 deletions.
87 changes: 82 additions & 5 deletions notebooks/ch04/ch04.ipynb
@@ -22394,14 +22394,35 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"  $$L(w)=\\frac{1}{2} \\sum_{i=1}^N\\left(\\sum_{j=0}^M w_j x_i^j-y_i\\right)^2$$ "
"  For a polynomial regression problem, the loss function can be written in the following form:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\\text{Set } \\frac{\\partial L(w)}{\\partial w_k}=0\\Rightarrow\\frac{1}{2} \\sum_{i=1}^N 2\\left(\\sum_{j=0}^M w_j x_i^j-y_i\\right) \\times x_i^k=0 \\Rightarrow \\sum_{i=1}^N \\sum_{j=0}^M w_j x_i^{j+k}=\\sum_{i=1}^N x_i^k y_i(k=0,1,2, \\cdots, M)\\Rightarrow X W=Y$$"
"  $$L(w)=\\frac{1}{2} \\sum_{i=1}^N\\left(\\sum_{j=0}^M w_j x_{i}^j-y_i\\right)^2$$ "
]
},
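{
"cell_type": "markdown",
"metadata": {},
"source": [
"  Here $w_j$ is the coefficient of the $j$-th order term of the polynomial, $x_i$ is the value of the $i$-th sample, and $y_i$ is the true value of the $i$-th sample."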
]
},
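{
"cell_type": "markdown",
"metadata": {},
"source": [
"  To minimize the loss function, we find its minimum with respect to each polynomial coefficient by setting the corresponding partial derivative to zero:"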
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\\text{Set } \\frac{\\partial L(w)}{\\partial w_k}=0\\Rightarrow\\frac{1}{2} \\sum_{i=1}^N 2\\left(\\sum_{j=0}^M w_j x_{i}^j-y_i\\right) \\times x_{i}^k=0 \\Rightarrow \\sum_{i=1}^N \\sum_{j=0}^M w_j x_i^{j+k}=\\sum_{i=1}^N x_i^k y_i(k=0,1,2, \\cdots, M)\\Rightarrow X W=Y$$"
]
},
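Reviewer note: the derivation above reduces to a linear system in the coefficients, which can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `fit_polynomial` and the sample data are illustrative and not part of the notebook.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Solve the normal equations X W = Y for the coefficients w_0..w_M."""
    # powers[i, j] = x_i ** j for j = 0..M
    powers = np.vander(x, degree + 1, increasing=True)
    # X[k, j] = sum_i x_i^(j+k) and Y[k] = sum_i x_i^k * y_i
    X = powers.T @ powers
    Y = powers.T @ y
    return np.linalg.solve(X, Y)

# Illustrative noiseless data generated from 1 + 2x - 3x^2.
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x - 3.0 * x**2
w = fit_polynomial(x, y, degree=2)
print(w)
```

On noiseless data like this, the recovered coefficients should match the generating polynomial up to floating-point error. For high degrees the matrix becomes ill-conditioned, so a least-squares routine such as `np.linalg.lstsq` may be a safer choice in practice.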
{
@@ -22429,6 +22450,13 @@
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"  The polynomial coefficients can therefore be obtained as follows:"
]
},
{
"cell_type": "markdown",
"metadata": {},
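@@ -26190,7 +26218,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"  to avoid large gradient or loss values. Besides this approach, we can also subtract from each feature the mean of that feature and divide by its standard deviation."
"  large gradient or loss values will occur, making the computation unstable. Besides this approach, we can also standardize the features by subtracting each feature's mean and dividing by its standard deviation."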
]
},
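Reviewer note: the standardization described above can be sketched as follows. This is a minimal illustration assuming NumPy; the helper name `standardize`, the small `eps` guard against zero variance, and the sample matrix are assumptions, not code from the notebook.

```python
import numpy as np

def standardize(features, eps=1e-12):
    """Subtract each column's mean and divide by its standard deviation."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    # eps avoids division by zero for constant columns (an added safeguard).
    return (features - mean) / (std + eps)

# Two features on very different scales, made up for illustration.
features = np.array([[1.0, 1000.0],
                     [2.0, 2000.0],
                     [3.0, 3000.0]])
scaled = standardize(features)
print(scaled.mean(axis=0), scaled.std(axis=0))
```

After this transformation each column has approximately zero mean and unit standard deviation, so no single feature dominates the gradient because of its scale.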
{
Expand All @@ -26213,7 +26241,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"  在训练数据足够多的时候泛化误差接近经验误差,实验对于确定可解,且数据没有噪声的问题泛化误差可以为零。"
"  理论上,只有模型可以完全捕捉数据所有模式才可能为了0。然而,在真实世界的情况下,数据通常是不完美的,可能包含噪声或不可预测的变化,因此即使使用了最好的模型和算法,也很难实现泛化误差为零。此外,在有限的数据量和复杂的问题领域中,即使是最佳模型也可能无法完全捕捉数据的所有模式,因此泛化误差通常不会为零,但我们的目标是尽可能地接近零。"
]
},
{
@@ -27219,13 +27247,55 @@
"**Answer:**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"  Under $L_1$ regularization, the loss function can be written as follows:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"  $$L(\\mathbf{w}, b)+\\lambda\\|\\mathbf{w}\\|_1$$\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"  The first term is the loss function without regularization:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"  $$L(\\mathbf{w}, b)=\\frac{1}{n}\\sum_{i=1}^n \\frac{1}{2}\\left(\\mathbf{w}^\\top \\mathbf{x}^{(i)} + b - y^{(i)}\\right)^2.$$\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"  The second term is a penalty that limits the size of $\\|\\mathbf{w}\\|$:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"  $$\\lambda\\|\\mathbf{w}\\|_1=\\lambda \\sum_{i=1}^n |w_i|$$\n"
]
},
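{
"cell_type": "markdown",
"metadata": {},
"source": [
"  To find the $w_i$ that minimize the loss function, we compute the gradient with respect to each $w_i$ and then update the parameters by gradient descent."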
]
},
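Reviewer note: the notebook's own update rule is not visible in this hunk, so the following is only a sketch of one common way to carry out the step described above. It assumes NumPy and uses `np.sign(w)` as a subgradient of the L1 penalty. The function names, the synthetic data, and the values of `lam` and `lr` are all illustrative.

```python
import numpy as np

def objective(w, b, X, y, lam):
    """Mean of the halved squared errors plus the penalty lam * ||w||_1."""
    residual = X @ w + b - y
    return 0.5 * np.mean(residual ** 2) + lam * np.sum(np.abs(w))

def l1_step(w, b, X, y, lam, lr):
    """One subgradient descent step on the L1-regularized objective."""
    residual = X @ w + b - y
    # Gradient of the data term plus a subgradient of lam * ||w||_1.
    grad_w = X.T @ residual / X.shape[0] + lam * np.sign(w)
    grad_b = residual.mean()
    return w - lr * grad_w, b - lr * grad_b

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, 0.0, -1.0]) + 0.5

w, b = np.zeros(3), 0.0
initial_loss = objective(w, b, X, y, lam=0.1)
for _ in range(200):
    w, b = l1_step(w, b, X, y, lam=0.1, lr=0.1)
final_loss = objective(w, b, X, y, lam=0.1)
print(initial_loss, final_loss)
```

Because the absolute value is not differentiable at zero, plain subgradient steps tend to hover around zero rather than land on it exactly. Proximal methods (soft thresholding) are a common alternative when exact sparsity matters.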
{
"cell_type": "markdown",
"metadata": {},
@@ -27256,6 +27326,13 @@
"  $$\\|\\mathbf{w}\\|_2= \\left[\\mathbf{w}^\\top \\mathbf{w}\\right]^{1 / 2}$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"  This can be interpreted as the Euclidean distance of an $n$-dimensional vector from the origin."
]
},
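Reviewer note: a quick numerical check of the L2 norm formula above, assuming NumPy. The vector is an arbitrary example.

```python
import numpy as np

w = np.array([3.0, 4.0])

# L2 norm written out as in the formula: the square root of w^T w.
l2_manual = np.sqrt(w @ w)
# The same quantity from NumPy's built-in norm (the default is the 2-norm).
l2_builtin = np.linalg.norm(w)

print(l2_manual, l2_builtin)  # 5.0 5.0
```

Both values equal the length of the vector (3, 4), which is its Euclidean distance from the origin.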
{
"cell_type": "markdown",
"metadata": {},
@@ -60311,7 +60388,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
"version": "3.9.16"
},
"toc": {
"base_numbering": 1,