Pytorch深度学习(小土堆)
-------------------------在此感谢土堆老师QAQ------------------------------
1. 环境配置
1.1 Python学习两大法宝
- dir():打开,看见里面包含哪些工具;
- help():说明书,说明怎样使用这些工具;
1.2 Pycham与Jupyter
2. 加载数据
2.1 input and label
2.2 Dataset and Dataloader
- 利用 Dataset 处理数据集并实例化:
- 在 init 方法里面输入根目录和标签目录,创建完整路径并得到文件名列表;在 getitem 方法里面输入索引 idx,创建该索引图片完整路径,打开图片并和标签一起返回;在 len 方法里面返回数据集长度,即图片个数;
from torch.utils.data import Dataset
from PIL import Image # Python Image Libarary 图像处理库
import osclass MyData(Dataset):def __init__(self, root_dir, label_dir):self.root_dir = root_dirself.label_dir = label_dirself.path = os.path.join(root_dir, label_dir) # 创建标签目录的完整路径self.img_path = os.listdir(self.path) # 获取标签目录下的所有文件名def __getitem__(self,idx):img_name = self.img_path[idx] # 获取当前索引对应的图片文件名img_item_path = os.path.join(self.root_dir, self.label_dir, img_name) # 构建完整的图片路径img = Image.open(img_item_path)label = self.label_dirreturn img, label # 返回图片和标签def __len__(self):return len(self.img_path) # 返回数据集长度root_dir = "hymenoptera_data/train"
ants_label_dir = "ants"
ants_dataset = MyData(root_dir, ants_label_dir) # 实例化bees_label_dir = "bees"
bees_dataset = MyData(root_dir, bees_label_dir)train_dataset = ants_dataset + bees_dataset # 合成一个数据集
- 运行的一些结果:
3. TensorBoard
3.1 add_scalar
- 这段代码使用了
torch.utils.tensorboard
中的SummaryWriter
类,它是 PyTorch 提供的用于将训练过程中的信息(如损失、精度等)记录到 TensorBoard 的工具。 - TensorBoard 是一个可视化工具,可以查看和分析 PyTorch 或 TensorFlow 中的训练过程。
- 创建了一个
SummaryWriter
对象writer
,并指定日志存储的目录为logs
。TensorBoard 会读取这个目录中的日志文件,并通过网页进行展示。每次运行这段代码时,logs
目录中的数据会被更新。
from torch.utils.tensorboard import SummaryWriterwriter = SummaryWriter("logs")# writer.add_image()
# y = 2*x
for i in range(100):writer.add_scalar("y=2x", 2*i, i)writer.close()
- tensorboard --logdir=logs:启动 TensorBoard 来查看结果,访问
http://localhost:6006
,可以看到名为y=2x
的曲线图;(需要先运行代码) - 还可以通过 tensorboard --logdir=logs --port=6008 来切换别的端口进行访问;
- 在 Tensorboard 中显示如下:
3.2 add_image
- 输入变量:
1.tag
:图像的标签;- 2. img_tensor 需要为 torch.Tensor(利用Transform)或者 numpy.ndarray(利用numy);
- 3. global_step:训练的步数,可以用来记录训练过程中图像变化的时间点;
4.dataformats
:表示输入图像数据的格式,默认值是'CHW'
。常见的格式包括:'CHW'
:通道数 (C),高度 (H),宽度 (W)'HWC'
:高度 (H),宽度 (W),通道数 (C)
- 注意:从 PIL 导入的图片是 HWC 型(观察 shape 可知),但由于默认值为 CHW,所以需要在变量内指明;
- 运行代码:
from torch.utils.tensorboard import SummaryWriter
import numpy as np
from PIL import Imagewriter = SummaryWriter("logs")
img_path = "hymenoptera_data/train/ants/0013035.jpg"
img_PIL = Image.open(img_path)
img_array = np.array(img_PIL)writer.add_image("test", img_array, 1, dataformats="HWC")
# y = 2*x
for i in range(100):writer.add_scalar("y=2x", 2*i, i)writer.close()
- 在 TensorBoard 中观察:
- 如果此时更换图片,且 tag 不变,global_step 改为 2,则可以滑动图片上方时间轴进行 step 的切换;
4. Transforms
4.1 魔法方法
class Person():def __call__(self, name): # 魔法方法print('__call__' + ' hello ' + name)def hello(self,name):print('hello ' + name)person = Person()
person('zhangsan')
person.hello('lisi')
- 结果如下:可以看到得到的结果是一样的,通过魔法方法(前后各加两条下划线),可以直接在实例化中输入,而不用 '. + 方法' 调用;
4.2 Tensor
- 输入一个 PIL(PIL读入)/ numpy.ndarray(cv2读入) 类型的图片,返回 tensor 类型;
from torchvision import transforms
from PIL import Image
import cv2
from torch.utils.tensorboard import SummaryWriterimg_path = "hymenoptera_data/train/ants/0013035.jpg"
img_PIL = Image.open(img_path)
img_cv = cv2.imread(img_path)tensor_trans = transforms.ToTensor() # 实例化
img_tensor = tensor_trans(img_PIL)print(img_tensor)writer = SummaryWriter("logs")
writer.add_image("img_Tensor", img_tensor)writer.close()
- 运行结果:
4.3 Normalize
- 像素值计算:
使用 ToTensor 和 Normalize(输入需要为 Tensor):
from PIL import Image
from torchvision import transforms
from torch.utils.tensorboard import SummaryWriterwriter = SummaryWriter("logs")
img = Image.open("images/saber.png")
print(img)# ToTensor
trans_tensor = transforms.ToTensor()
img_tensor = trans_tensor(img)
writer.add_image("ToTensor", img_tensor)# Normalize
print(img_tensor[0][0][0])
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
print(img_norm[0][0][0])
writer.add_image("Normalize", img_norm)writer.close()
- 计算结果:
- tensorboard 显示结果:
4.4 Resize
- 输入的图片可以是 PIL,也可以是 Tensor 类型:
- 如果 Resize 里面只有一个数字,则较短边变化为该大小,图片等比例变换;
# Resize
print(img.size)
trans_resize = transforms.Resize((512, 512))
img_resize = trans_resize(img_tensor)
writer.add_image("Resize", img_resize)
print(img_resize)
4.5 Compose
- Compose 方法用来组合不同的 Transforms 工具,更加一体化;
- 这里先给 Resize 输入 PIL 类型的图片,再进行 Tensor 类型转换:
# Compose
trans_resize_2 = transforms.Resize(512)
trans_compose = transforms.Compose([trans_resize_2, trans_tensor])
img_resize_2 = trans_compose(img)
writer.add_image("Resize", img_resize_2, 1)
4.6 RandomCrop
- 随机裁剪方法:输入一个数字则输出的是正方形,序列则为(H,W);
# RandomCrop
trans_crop = transforms.RandomCrop(512)
trans_compose_2 = transforms.Compose([trans_crop, trans_tensor])
for i in range(10):img_crop = trans_compose_2(img)writer.add_image("RandomCrop", img_crop, i)
- 总结:
5. torchvision
5.1 CIFAR10 数据集
- 数据集如下:
- 调用方法:
test_set[0]
返回的是一个包含 一张图片 和它对应的 标签 的元组(image, label)
import torchvisiontrain_set = torchvision.datasets.CIFAR10("./dataset", train=True, download=True)
test_set = torchvision.datasets.CIFAR10("./dataset", train=False, download=True)print(test_set[0])
print(train_set.classes)img, target = train_set[0] # (图片, 标签)
print(img)
print(target)print(test_set.classes[target])
img.show()
- 运行结果:
5.2 利用 transforms
- 加入 transform 变量,将图片转化为 tensor 类型:
import torchvision
from torchvision.transforms import transforms
from torch.utils.tensorboard import SummaryWriterdataset_transform = transforms.Compose([transforms.ToTensor()
])train_set = torchvision.datasets.CIFAR10("./dataset", train=True, transform=dataset_transform, download=True)
test_set = torchvision.datasets.CIFAR10("./dataset", train=False, transform=dataset_transform, download=True)# print(test_set[0])
# print(train_set.classes)
#
# img, target = train_set[0]
# print(img)
# print(target)
#
# print(test_set.classes[target])
# img.show()writer = SummaryWriter("logs")
for i in range(10):img, target = test_set[i]writer.add_image("test_set", img, i)writer.close()
6. DataLoader
6.1 batch
- batch = 4:将 img,target 分别按 4 个进行打包分组;
- 运行结果如下:4 代表 4 张图片,tensor 是四张图片的标签集合;
6.2 DataLoader 使用方法
batch_size:每次加载多少个样本(默认为
1
);shuffle:设置为
True
可在每个 epoch 重新排列数据(默认为False
);num_workers:用于数据加载的子进程数量。
0
表示数据将在主进程中加载;drop_last:如果 Dataset 大小不能被 batch_size 整除,则设置为
True
以丢弃最后一个不完整的批次;注意此代码中的 add_images 而不是 add_image,因为 imgs 包含多张图片;
import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter# 准备的测试数据集
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor())test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=True, num_workers=0, drop_last=False)# 测试数据集中第一张图片及target
img, target = test_data[0]
print(img.shape)
print(target)writer = SummaryWriter("dataloader")for epoch in range(2):step = 0for data in test_loader:imgs, targets = data# print(imgs.shape)# print(targets)writer.add_images("test_data", imgs, step)step += 1writer.close()
7. 神经网络
7.1 nn.Module
- 使用模板:
- 在 Python 3 中,建议使用
super().__init__();
import torch
from torch import nnclass Model(nn.Module):def __init__(self):super().__init__()def forward(self, input):output = input + 1return outputmodel = Model()
x = torch.tensor(1.0)
output = model(x)
print(output)
7.2 卷积(conv)计算
- 图像二维卷积:
- 计算方式:
- 卷积实现:
import torch
import torch.nn.functional as Finput = torch.tensor([[1, 2, 0, 3, 1],[0, 1, 2, 3, 1],[1, 2, 1, 0, 0],[5, 2, 3, 1, 1],[2, 1, 0, 1, 1]])kernel = torch.tensor([[1, 2, 1],[0, 1, 0],[2, 1, 0]])# 将 (H, W) -> (batch_size, C, H, W)
input = torch.reshape(input, (1, 1, 5, 5))
kernel = torch.reshape(kernel, (1, 1, 3, 3))print(input.shape)
print(kernel.shape)output = F.conv2d(input, kernel, stride=1)
print(output)# 卷积核的步长,即移动的步数
output2 = F.conv2d(input, kernel, stride=2)
print(output2)# 在图像四周添加 padding 层像素
output3 = F.conv2d(input, kernel, stride=1, padding=1)
print(output3)
- 结果如下:
7.3 卷积层
- Conv2d:
- In_channel and Out_channel:利用多个卷积核就可以得到多层数的输出;
- 卷积操作:
import torch.nn
import torchvision
from torch.utils.data import DataLoader
from torch import nn
from torch.utils.tensorboard import SummaryWriterdataset = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)dataloader = DataLoader(dataset=dataset, batch_size=64)class Model(nn.Module):def __init__(self):super(Model, self).__init__()self.conv1 = torch.nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)def forward(self, x):x = self.conv1(x)return xmodel = Model()
writer = SummaryWriter("logs")step = 0
for data in dataloader:imgs, targets = dataoutput = model(imgs)print(imgs.shape) # torch.Size([64, 3, 32, 32])print(output.shape) # torch.Size([64, 6, 30, 30]),通道为 6 无法显示output = torch.reshape(output, (-1, 3, 30, 30)) # -1 表示自动确认数值writer.add_images("input", imgs, step)writer.add_images("output", output, step)step += 1writer.close()
- 可以看到卷积起到了“提取特征”的作用:
7.4 最大池化层
- MaxPool2d:
- 池化计算过程:
- Ceil_model = True:池化核内只要有数字,就会取一个最大值出来,反之必须全都有值时才会取最大值;
import torch
from torch import nninput = torch.tensor([[1, 2, 0, 3, 1],[0, 1, 2, 3, 1],[1, 2, 1, 0, 0],[5, 2, 3, 1, 1],[2, 1, 0, 1, 1]])input = torch.reshape(input, (-1, 1, 5, 5))class Model(nn.Module):def __init__(self):super().__init__()self.maxpool1 = nn.MaxPool2d(kernel_size=3, ceil_mode=True)def forward(self, input):output = self.maxpool1(input)return outputmodel = Model()
output = model(input)
print(output)
- 池化结果:
import torch
import torchvision.datasets
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriterdataset = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)dataloader = DataLoader(dataset=dataset, batch_size=64)class Model(nn.Module):def __init__(self):super().__init__()self.maxpool1 = nn.MaxPool2d(kernel_size=3, ceil_mode=True)def forward(self, input):output = self.maxpool1(input)return outputmodel = Model()
writer = SummaryWriter("logs")step = 0
for data in dataloader:imgs, targets = dataoutput = model(imgs) # 池化不改变 channelwriter.add_images("input", imgs, step)writer.add_images("output", output, step)step += 1writer.close()
- 可以看到池化起到了“压缩特征”的作用:
7.5 非线性激活
- ReLU:
import torch
from torch import nninput = torch.tensor([[1, -0.5],[-1, 3]])class Model(nn.Module):def __init__(self):super(Model, self).__init__()self.relu1 = torch.nn.ReLU()def forward(self, x):x = self.relu1(x)return xmodel = Model()
output = model(input)
print(output)
- 计算结果:
- Sigmoid:
import torch
import torchvision.datasets
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriterdataset = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)dataloader = DataLoader(dataset, batch_size=64)class Model(nn.Module):def __init__(self):super(Model, self).__init__()self.sigmoid1 = torch.nn.Sigmoid()def forward(self, input):output = self.sigmoid1(input)return outputmodel = Model()
writer = SummaryWriter("logs")step = 0
for data in dataloader:imgs, targets = dataoutput = model(imgs)writer.add_images("input", imgs, step)writer.add_images("output", output, step)step += 1writer.close()
- 激活结果如下:
7.6 线性层
- Linear:
import torch
import torchvision
from torch import nn
from torch.utils.data import DataLoaderdataset = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)dataloader = DataLoader(dataset, batch_size=64)class Model(nn.Module):def __init__(self):super(Model, self).__init__()self.linear1 = torch.nn.Linear(196608, 10) # (1, 196608) -> (1, 10)def forward(self, input):output = self.linear1(input)return outputmodel = Model()for data in dataloader:imgs, targets = dataprint(imgs.shape)output = torch.flatten(imgs) # 将输入向量展开成一行print(output.shape)output = model(output)print(output.shape)
- 输出结果如下:
7.7 搭建网络和 Sequential
- CIFAR10 模型:
- 不用 Sequential 下:
import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linearclass Model(nn.Module):def __init__(self):super().__init__()self.conv1 = Conv2d(3, 32, 5, padding=2)self.maxpool1 = MaxPool2d(2)self.conv2 = Conv2d(32, 32, 5, padding=2)self.maxpool2 = MaxPool2d(2)self.conv3 = Conv2d(32, 64, 5, padding=2)self.maxpool3 = MaxPool2d(2)self.flatten = Flatten()self.linear1 = Linear(1024, 64)self.linear2 = Linear(64, 10)def forward(self, x):x = self.conv1(x)x = self.maxpool1(x)x = self.conv2(x)x = self.maxpool2(x)x = self.conv3(x)x = self.maxpool3(x)x = self.flatten(x)x = self.linear1(x)x = self.linear2(x)return xmodel = Model()
print(model)# 检验
input = torch.ones((64, 3, 32, 32))
output = model(input)
print(output.shape)
- 使用 Sequential:
import torch
from tensorflow.python.keras import Sequential
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linearclass Model(nn.Module):def __init__(self):super().__init__()self.model = Sequential(Conv2d(3, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 64, 5, padding=2),MaxPool2d(2),Flatten(),Linear(1024, 64),Linear(64, 10))def forward(self, x):x = self.model(x)return xmodel = Model()
print(model)# 检验
input = torch.ones((64, 3, 32, 32))
output = model(input)
print(output.shape)# 流程图
writer = SummaryWriter("logs")
writer.add_graph(model, input)
writer.close()
- 流程图如下:
7.8 损失函数与反向传播
- 常用的损失函数:MSELoss(平均平方误差,用于回归问题),CrossEntropyLoss(交叉熵损失,用于分类问题);
import torch
import torchvision.datasets
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoaderdataset = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)dataloader = DataLoader(dataset, batch_size=64)class Model(nn.Module):def __init__(self):super().__init__()self.model = Sequential(Conv2d(3, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 64, 5, padding=2),MaxPool2d(2),Flatten(),Linear(1024, 64),Linear(64, 10))def forward(self, x):x = self.model(x)return xmodel = Model()criterion = nn.CrossEntropyLoss() # 计算交叉熵损失
for data in dataloader:imgs, targets = dataoutput = model(imgs)loss = criterion(imgs, output)loss.backward()
7.9 优化器
- Optimizer:这里采用 gpu 跑模型;
import torch
import torchvision.datasets
from torch import nn, optim
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoaderdataset = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)dataloader = DataLoader(dataset, batch_size=32)device = torch.device("cuda")class Model(nn.Module):def __init__(self):super().__init__()self.model = Sequential(Conv2d(3, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 64, 5, padding=2),MaxPool2d(2),Flatten(),Linear(1024, 64),Linear(64, 10))def forward(self, x):x = self.model(x)return xmodel = Model().to(device)criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)for epoch in range(20):running_loss = 0.0for data in dataloader:# 前馈imgs, targets = dataimgs, targets = imgs.to(device), targets.to(device)output = model(imgs)loss = criterion(output, targets)# 梯度清零 + 反馈optimizer.zero_grad()loss.backward()# 更新optimizer.step()running_loss += lossprint(running_loss)
- 损失变化:
8. 网络模型
8.1 vgg16 模型
- 权重表示是否经过预训练:
- 这个模型预训练的数据集为 ImageNet,但是太大了,于是选择直接下载模型;
import torchvision.datasets
from torchvision.models import VGG16_Weights# train_data = torchvision.datasets.ImageNet("./dataset", split="train",
# transform=torchvision.transforms.ToTensor())vgg16_true = torchvision.models.vgg16(weights = VGG16_Weights.DEFAULT) # 预训练好了的模型
vgg16_false = torchvision.models.vgg16(weights = None)print(vgg16_true)
- 可以看到这个模型是一个分类器,最后一层特征数为 1000:
8.2 修改模型
- 修改模型最后一层的输出特征数为 10:
import torchvision.datasets
from torch import nn
from torchvision.models import VGG16_Weights# train_data = torchvision.datasets.ImageNet("./dataset", split="train",
# transform=torchvision.transforms.ToTensor())vgg16_true = torchvision.models.vgg16(weights = VGG16_Weights.DEFAULT) # 预训练好了的模型
vgg16_false = torchvision.models.vgg16(weights = None)# 在模型外层添加
vgg16_true.add_module("add_linear", nn.Linear(1000, 10))
print(vgg16_true)# 在模型 classifier 中添加
vgg16_true.classifier.add_module("add_linear", nn.Linear(1000, 10))
print(vgg16_true)# 在模型 classifier 中修改
vgg16_false.classifier[6] = nn.Linear(4096, 10)
print(vgg16_false)
- 结果如下:
8.3 保存模型
- 两种方式:
import torch
import torchvision.modelsvgg16 = torchvision.models.vgg16()# 保存方式 1, 模型结构+模型结构
torch.save(vgg16, "vgg16_method1_pth")# 保存方式 2, 模型参数(官方推荐)
torch.save(vgg16.state_dict(), "vgg16_method2_pth")
8.4 加载模型
- 两种方式(对应保存模型):
import torch
import torchvision# 方式 1 -> 加载模型
model = torch.load("vgg16_method1_pth", weights_only=False)
print(model)# 方式 2 -> 加载模型
vgg16 = torchvision.models.vgg16()
vgg16.load_state_dict(torch.load("vgg16_method2_pth"))
print(vgg16)# 注意: 不能实例化后保存模型
9. 模型训练
9.1 CIFAR10 模型训练与测试
- 模型模块:
import torch
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Flatten, Linear# 搭建神经网络
class Model(nn.Module):def __init__(self):super().__init__()self.model = Sequential(Conv2d(3, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 64, 5, padding=2),MaxPool2d(2),Flatten(),Linear(1024, 64),Linear(64, 10))def forward(self, x):x = self.model(x)return xif __name__ == '__main__':model = Model()input = torch.ones((64, 3, 32, 32))output = model(input)print(output.shape)
- 训练模块:
import torch
import torchvision.datasets
from torch import optim
from torch.nn import CrossEntropyLoss
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriterfrom CIFAR10_model import *# 准备数据集
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(),download=True)
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)# 数据集大小
train_data_size = len(train_data)
test_data_size = len(test_data)
print(f"训练数据集的长度为{train_data_size}")
print(f"测试数据集的长度为{test_data_size}")# 利用 DataLoader 加载数据集
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)# 创建网络模型
model = Model()# 损失函数
criterion = CrossEntropyLoss()# 优化器
learning_rate = 1e-2 # 0.01
optimizer = optim.SGD(model.parameters(), lr=learning_rate)# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 记录测试的次数
total_test_step = 0
# 训练轮数
epoch = 10# 添加 tensorboard
writer = SummaryWriter("logs")for i in range(epoch):print(f"第 {i+1} 轮训练开始")# 训练步骤开始model.train()for data in train_dataloader:imgs, targets = dataoutput = model(imgs)loss = criterion(output, targets)# 优化器优化模型optimizer.zero_grad()loss.backward()optimizer.step()total_train_step += 1if total_train_step % 100 == 0:print(f"训练次数: {total_train_step}, Loss: {loss.item()}")writer.add_scalar("train_loss", loss.item(), global_step=total_train_step)# 测试步骤开始model.eval()total_test_loss = 0with torch.no_grad():for data in test_dataloader:imgs, targets = dataoutput = model(imgs)loss = criterion(output, targets)total_test_loss += lossprint(f"整体测试集上的Loss: {total_test_loss}")writer.add_scalar("test_loss", total_test_loss, global_step=total_test_step)total_test_step += 1# 保存每一轮训练的模型torch.save(model, f"model_{i}.pth")print("模型已保存")writer.close()
- 训练结果:
- train_loss 与 test_loss:
9.2 分类问题正确率
- 原先的输出是每个类别组成的概率集合,通过 argmax(1) 得到每一行的最大值对应类别,再与目标对比计算相同类别的总个数;
import torchoutputs = torch.tensor([[0.1, 0.2],[0.3, 0.4]])print(outputs.argmax(1))
preds = outputs.argmax(1)
targets = torch.tensor([0, 1])
print((preds == targets).sum())
- 结果如下:
- 代码修改如下:
import torch
import torchvision.datasets
from torch import optim
from torch.nn import CrossEntropyLoss
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriterfrom CIFAR10_model import *# 准备数据集
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(),download=True)
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)# 数据集大小
train_data_size = len(train_data)
test_data_size = len(test_data)
print(f"训练数据集的长度为{train_data_size}")
print(f"测试数据集的长度为{test_data_size}")# 利用 DataLoader 加载数据集
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)# 创建网络模型
model = Model()# 损失函数
criterion = CrossEntropyLoss()# 优化器
learning_rate = 1e-2 # 0.01
optimizer = optim.SGD(model.parameters(), lr=learning_rate)# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 记录测试的次数
total_test_step = 0
# 训练轮数
epoch = 10# 添加 tensorboard
writer = SummaryWriter("logs")for i in range(epoch):print(f"第 {i+1} 轮训练开始")# 训练步骤开始model.train()for data in train_dataloader:imgs, targets = dataoutput = model(imgs)loss = criterion(output, targets)# 优化器优化模型optimizer.zero_grad()loss.backward()optimizer.step()total_train_step += 1if total_train_step % 100 == 0:print(f"训练次数: {total_train_step}, Loss: {loss.item()}")writer.add_scalar("train_loss", loss.item(), global_step=total_train_step)# 测试步骤开始model.eval() total_test_loss = 0total_accuracy = 0with torch.no_grad():for data in test_dataloader:imgs, targets = dataoutput = model(imgs)loss = criterion(output, targets)total_test_loss += loss# 计算准确率total_accuracy += (output.argmax(1) == targets).sum()print(f"整体测试集上的Loss: {total_test_loss}")print(f"整体测试集上的准确率: {total_accuracy / test_data_size}")writer.add_scalar("test_loss", total_test_loss, global_step=total_test_step)writer.add_scalar("test_accuracy", total_accuracy / test_data_size, global_step=total_test_step)total_test_step += 1# 保存每一轮训练的模型torch.save(model, f"model_{i}.pth")print("模型已保存")writer.close()
- 准确率结果如下:
9.3 利用GPU训练
- 方法一:
import torch
import torchvision.datasets
from torch import optim, nn
from torch.nn import CrossEntropyLoss, Sequential, Conv2d, MaxPool2d, Flatten, Linear
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import time# 准备数据集
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(),download=True)
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)# 数据集大小
train_data_size = len(train_data)
test_data_size = len(test_data)
print(f"训练数据集的长度为{train_data_size}")
print(f"测试数据集的长度为{test_data_size}")# 利用 DataLoader 加载数据集
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)# 创建网络模型
class Model(nn.Module):def __init__(self):super().__init__()self.model = Sequential(Conv2d(3, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 64, 5, padding=2),MaxPool2d(2),Flatten(),Linear(1024, 64),Linear(64, 10))def forward(self, x):x = self.model(x)return xmodel = Model()
if torch.cuda.is_available():model = model.cuda()# 损失函数
criterion = CrossEntropyLoss()
if torch.cuda.is_available():criterion = criterion.cuda()# 优化器
learning_rate = 1e-2 # 0.01
optimizer = optim.SGD(model.parameters(), lr=learning_rate)# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 记录测试的次数
total_test_step = 0
# 训练轮数
epoch = 10# 添加 tensorboard
writer = SummaryWriter("logs")start_time = time.time()
for i in range(epoch):print(f"第 {i+1} 轮训练开始")# 训练步骤开始model.train()for data in train_dataloader:imgs, targets = dataif torch.cuda.is_available():imgs = imgs.cuda()targets = targets.cuda()output = model(imgs)loss = criterion(output, targets)# 优化器优化模型optimizer.zero_grad()loss.backward()optimizer.step()total_train_step += 1if total_train_step % 100 == 0:end_time = time.time()print(end_time - start_time) # 计算一轮训练时间print(f"训练次数: {total_train_step}, Loss: {loss.item()}")writer.add_scalar("train_loss", loss.item(), global_step=total_train_step)# 测试步骤开始model.eval()total_test_loss = 0total_accuracy = 0with torch.no_grad():for data in test_dataloader:imgs, targets = dataif torch.cuda.is_available():imgs = imgs.cuda()targets = targets.cuda()output = model(imgs)loss = criterion(output, targets)total_test_loss += loss# 计算准确率total_accuracy += (output.argmax(1) == targets).sum()print(f"整体测试集上的Loss: {total_test_loss}")print(f"整体测试集上的准确率: {total_accuracy / test_data_size}")writer.add_scalar("test_loss", total_test_loss, global_step=total_test_step)writer.add_scalar("test_accuracy", total_accuracy / test_data_size, global_step=total_test_step)total_test_step += 1# 保存每一轮训练的模型torch.save(model, f"model_{i}.pth")print("模型已保存")writer.close()
- 方法二:
import torch
import torchvision.datasets
from torch import optim, nn
from torch.nn import CrossEntropyLoss, Sequential, Conv2d, MaxPool2d, Flatten, Linear
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import time# 定义训练的设备
device = torch.device("cuda")# 准备数据集
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(),download=True)
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(),download=True)# 数据集大小
train_data_size = len(train_data)
test_data_size = len(test_data)
print(f"训练数据集的长度为{train_data_size}")
print(f"测试数据集的长度为{test_data_size}")# 利用 DataLoader 加载数据集
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)# 创建网络模型
class Model(nn.Module):def __init__(self):super().__init__()self.model = Sequential(Conv2d(3, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 64, 5, padding=2),MaxPool2d(2),Flatten(),Linear(1024, 64),Linear(64, 10))def forward(self, x):x = self.model(x)return xmodel = Model()
model = model.to(device)# 损失函数
criterion = CrossEntropyLoss()
criterion = criterion.to(device)# 优化器
learning_rate = 1e-2 # 0.01
optimizer = optim.SGD(model.parameters(), lr=learning_rate)# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 记录测试的次数
total_test_step = 0
# 训练轮数
epoch = 10# 添加 tensorboard
writer = SummaryWriter("logs")start_time = time.time()
for i in range(epoch):print(f"第 {i+1} 轮训练开始")# 训练步骤开始model.train()for data in train_dataloader:imgs, targets = dataimgs = imgs.to(device)targets = targets.to(device)output = model(imgs)loss = criterion(output, targets)# 优化器优化模型optimizer.zero_grad()loss.backward()optimizer.step()total_train_step += 1if total_train_step % 100 == 0:end_time = time.time()print(end_time - start_time) # 计算一轮训练时间print(f"训练次数: {total_train_step}, Loss: {loss.item()}")writer.add_scalar("train_loss", loss.item(), global_step=total_train_step)# 测试步骤开始model.eval()total_test_loss = 0total_accuracy = 0with torch.no_grad():for data in test_dataloader:imgs, targets = dataimgs = imgs.to(device)targets = targets.to(device)output = model(imgs)loss = criterion(output, targets)total_test_loss += loss# 计算准确率total_accuracy += (output.argmax(1) == targets).sum()print(f"整体测试集上的Loss: {total_test_loss}")print(f"整体测试集上的准确率: {total_accuracy / test_data_size}")writer.add_scalar("test_loss", total_test_loss, global_step=total_test_step)writer.add_scalar("test_accuracy", total_accuracy / test_data_size, global_step=total_test_step)total_test_step += 1# 保存每一轮训练的模型torch.save(model, f"model_{i}.pth")print("模型已保存")writer.close()
9.4 模型验证
- 完整的模型验证(测试, demo):利用已经训练好的模型,然后给它提供输入;
- 利用训练好的 CIFAR10 数据集对应模型,选取一张狗狗图片进行测试:
import torch
import torchvision.transforms
from PIL import Image
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Flatten, Linear# 导入需要进行分类的图片,转换为 'RGB' 形式
image_path = "./images/dog.jpg"
image = Image.open(image_path)
image = image.convert("RGB")
print(image)# CIFAR10 要求输入图片大小为 (32, 32)
transform = torchvision.transforms.Compose([torchvision.transforms.Resize((32, 32)),torchvision.transforms.ToTensor()])image = transform(image)
print(image.shape) # (3, 32, 32)# 加载网络模型
class Model(nn.Module):def __init__(self):super().__init__()self.model = Sequential(Conv2d(3, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 32, 5, padding=2),MaxPool2d(2),Conv2d(32, 64, 5, padding=2),MaxPool2d(2),Flatten(),Linear(1024, 64),Linear(64, 10))def forward(self, x):x = self.model(x)return x# 第二个参数说明由gpu训练的模型要在cpu加载,第三个参数说明保存的网络不仅只有权重还有结构
model = torch.load("model_29_gpu.pth", map_location=torch.device('cpu'), weights_only=False)# 数据集要求输入数据包含 batch_size
image = torch.reshape(image, (1, 3, 32, 32))
model.eval()
with torch.no_grad():output = model(image)# 输出概率最高的类别
print(output)
print(output.argmax(1))