《Pytorch深度学习实践》 ch2 - Gradient Descent Algorithm
------ Bilibili course by 刘二大人
1. Gradient Descent
- Local optima, global optimum, non-convex functions: gradient descent only guarantees convergence to a local optimum, so on a non-convex cost surface it may not find the global one;
- Gradient descent update rule:
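Written out explicitly (this matches the cost and gradient computed in the code below; α is the learning rate and N the number of training samples):

$$\mathrm{cost}(w) = \frac{1}{N}\sum_{n=1}^{N}\left(x_n \cdot w - y_n\right)^2$$

$$\frac{\partial\,\mathrm{cost}}{\partial w} = \frac{1}{N}\sum_{n=1}^{N} 2\,x_n\left(x_n \cdot w - y_n\right)$$

$$w \leftarrow w - \alpha\,\frac{\partial\,\mathrm{cost}}{\partial w}$$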
2. Implementation
import matplotlib.pyplot as plt

# Dataset
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

# Weight (initial guess)
w = 1.0

# Model: a simple linear model y_hat = x * w
def forward(x):
    return x * w

# Cost function: mean squared error over the whole dataset
def cost(xs, ys):
    cost = 0
    for x, y in zip(xs, ys):
        y_pred = forward(x)
        cost += (y_pred - y) ** 2
    return cost / len(xs)

# Gradient of the cost with respect to w
def gradient(xs, ys):
    grad = 0
    for x, y in zip(xs, ys):
        y_pred = forward(x)
        grad += 2 * x * (y_pred - y)
    return grad / len(xs)

# epoch on the x-axis, cost on the y-axis
epoch_list = []
cost_list = []

print('Predict (before training)', 4, forward(4))
for epoch in range(100):
    cost_val = cost(x_data, y_data)
    grad_val = gradient(x_data, y_data)
    w -= 0.01 * grad_val  # 0.01 is the learning rate
    epoch_list.append(epoch)
    cost_list.append(cost_val)
    print('Epoch:', epoch, 'w=', w, 'cost=', cost_val)
print('Predict (after training)', 4, forward(4))

# Plot cost against epoch
plt.plot(epoch_list, cost_list)
plt.xlabel('epoch')
plt.ylabel('cost')
plt.grid()
plt.show()
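With this toy dataset (y = 2x exactly) and a learning rate of 0.01, w should move steadily toward 2.0 over the 100 epochs, and the prediction for x = 4 should end up close to 8.0.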
- The per-epoch iteration output is as follows:
- The resulting cost curve is as follows:
3. Discussion
- How to make the curve smoother: "weighting", i.e. smoothing the plotted values with a weighted moving average (a minimal sketch follows this list);
- If the curve never converges: the most common cause is a learning rate α that is too large;
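A minimal sketch of the "weighting" idea from the first bullet: smooth the recorded cost values with an exponentially weighted moving average before plotting. The helper name smooth and the factor beta = 0.9 are illustrative choices, not from the lecture.

# Exponentially weighted moving average for smoothing a training curve.
# beta is an assumed smoothing factor; the lecture does not fix a value.
def smooth(values, beta=0.9):
    smoothed = []
    avg = values[0]  # initialize with the first observation
    for v in values:
        avg = beta * avg + (1 - beta) * v
        smoothed.append(avg)
    return smoothed

# Usage: plot the smoothed curve instead of the raw one
# plt.plot(epoch_list, smooth(cost_list))

For the second bullet: with this dataset, the full-batch update above multiplies (w - 2) by (1 - 28α/3) every epoch, so any learning rate α above 3/14 ≈ 0.21 makes the cost grow instead of converging.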
4. Stochastic Gradient Descent
- Compared with GD, SGD (stochastic gradient descent) updates w with a single randomly chosen sample each time; the advantage is that the noise may carry it past local optima, and the resulting curve is also smoother;
- The code changes as follows:
import matplotlib.pyplot as plt

# Dataset
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

# Weight (initial guess)
w = 1.0

# Model: a simple linear model y_hat = x * w
def forward(x):
    return x * w

# Loss for a single sample
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) * (y_pred - y)

# Gradient of the single-sample loss with respect to w
def gradient(x, y):
    y_pred = forward(x)
    return 2 * x * (y_pred - y)

# epoch on the x-axis, loss on the y-axis
epoch_list = []
loss_list = []

print('Predict (before training)', 4, forward(4))
for epoch in range(100):
    for x, y in zip(x_data, y_data):
        # Update w once per sample instead of once per epoch
        loss_val = loss(x, y)
        grad_val = gradient(x, y)
        w -= 0.01 * grad_val  # 0.01 is the learning rate
    # Record the loss of the last sample seen in this epoch
    epoch_list.append(epoch)
    loss_list.append(loss_val)
    print('Epoch:', epoch, 'w=', w, 'loss=', loss_val)
print('Predict (after training)', 4, forward(4))

# Plot loss against epoch
plt.plot(epoch_list, loss_list)
plt.xlabel('epoch')
plt.ylabel('loss')
plt.grid()
plt.show()
- The per-epoch iteration output is as follows:
- The resulting loss curve is as follows:
5. Comparison
- Although stochastic gradient descent often performs better, updating one sample at a time has high time complexity, so a compromise is used in practice: the mini-batch (Batch);
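A minimal sketch of that compromise, mini-batch gradient descent, on the same toy data; the batch size of 2, the use of random.shuffle, and the 100-epoch loop are illustrative choices, not taken from the lecture.

import random

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
w = 1.0

def forward(x):
    return x * w

for epoch in range(100):
    # Shuffle the sample order each epoch (illustrative choice)
    samples = list(zip(x_data, y_data))
    random.shuffle(samples)
    # Update once per mini-batch of 2 samples instead of per sample or per full set
    for i in range(0, len(samples), 2):
        batch = samples[i:i + 2]
        grad = sum(2 * x * (forward(x) - y) for x, y in batch) / len(batch)
        w -= 0.01 * grad  # 0.01 is the learning rate, as in the code above

print('w after mini-batch training:', w)

Each update averages the gradient over a small batch, so it keeps some of SGD's noise (useful for stepping past poor local optima) while the per-batch computation can be vectorized.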