
P20: Inception v3 Algorithm Practice and Analysis

  • 🍨 This post is a learning-record blog for the 365天深度学习训练营 (365-day deep learning training camp)
  • 🍖 Original author: K同学啊
    Topic: using InceptionV3 to implement weather recognition

I. Model Architecture

Inception v3 is the third-generation Inception model, proposed by the Google team in 2015. It is one of the classic convolutional neural network (CNN) architectures and is used mainly for image classification. Building on Inception v1 (i.e., GoogLeNet), it introduces a number of optimizations that significantly improve both accuracy and computational efficiency. The core features of Inception v3 and its main differences from v1 are summarized below.


Core improvements in Inception v3

  1. Convolution factorization

    • Symmetric factorization: large kernels (e.g., 5×5) are decomposed into stacks of smaller kernels (e.g., two 3×3), reducing the parameter count while preserving the receptive field. Replacing one 5×5 convolution with two 3×3 convolutions cuts parameters by about 28% (25 weights vs. 18 per input-output channel pair); see the sketch after this list.
    • Asymmetric factorization: a 3×3 convolution is further decomposed into a 1×3 convolution followed by a 3×1 convolution, cutting compute by about 33% (9 weights vs. 6). This design not only saves parameters but also captures richer spatial features.
  2. Efficient grid size reduction

    • A stride-2 convolution and a stride-2 pooling operation process the feature map in parallel, and their outputs are concatenated, which avoids the information loss of naive downsampling. For example, a 3×3 convolution (stride 2) runs alongside a 3×3 max pooling (stride 2), expanding the channel count while shrinking the spatial size; the ReductionA module below implements exactly this pattern.
  3. Regularization and training strategy

    • Label smoothing: soften the hard targets (e.g., roughly 0.9 for the true class, with the remaining 0.1 spread across the other classes), reducing overfitting and improving generalization.
    • Batch normalization: applied after convolutions throughout the network, accelerating convergence and stabilizing training.
    • Optimizer: RMSProp replaces plain SGD, adapting the learning rate automatically and improving training efficiency.
  4. Modular design

    • Inception v3 uses three families of Inception modules, operating on the 35×35, 17×17, and 8×8 grids, to handle feature maps at different scales. Within a module, branches can themselves branch (e.g., combined 1×3 and 3×1 convolutions), producing a richer nonlinear representation.
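To make the factorization arithmetic concrete, here is a minimal sketch (with a hypothetical channel count C, not taken from the original post) that compares a 5×5 convolution against its two-3×3 factorization, and a 3×3 convolution against its 1×3 + 3×1 factorization:

import torch.nn as nn

def n_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

C = 64  # hypothetical channel count, kept equal on both sides for a fair comparison
conv5x5   = nn.Conv2d(C, C, kernel_size=5, padding=2, bias=False)
two_3x3   = nn.Sequential(nn.Conv2d(C, C, 3, padding=1, bias=False),
                          nn.Conv2d(C, C, 3, padding=1, bias=False))
conv3x3   = nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False)
asym_pair = nn.Sequential(nn.Conv2d(C, C, (1, 3), padding=(0, 1), bias=False),
                          nn.Conv2d(C, C, (3, 1), padding=(1, 0), bias=False))

print(n_params(conv5x5), n_params(two_3x3))    # 25*C^2 vs 18*C^2 -> ~28% fewer
print(n_params(conv3x3), n_params(asym_pair))  # 9*C^2  vs 6*C^2  -> ~33% fewer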

Main differences between Inception v1 and Inception v3

| Feature | Inception v1 | Inception v3 |
| --- | --- | --- |
| Kernel factorization | None (standard 1×1 / 3×3 / 5×5 kernels) | Symmetric and asymmetric factorization (e.g., 3×3 → 1×3 + 3×1) |
| Network depth | 22 layers (9 Inception modules) | 46 layers (11 Inception modules) |
| Regularization | No batch normalization; auxiliary classifiers mitigate vanishing gradients | Batch normalization + label smoothing + auxiliary classifier as a regularizer |
| Computational efficiency | 5×5 convolutions are costly | Factorized kernels cut compute, with roughly 30% fewer parameters |
| ImageNet top-5 error | 6.67% | 3.5% |
| Module design | Basic Inception module (1×1, 3×3, 5×5 convolutions) | More complex modules (asymmetric convolutions, multi-branch combinations) |

Significance of the key improvements

  1. Better accuracy: convolution factorization and module optimizations let v3 cut the classification error roughly in half relative to v1 (top-5 6.67% → 3.5%) while keeping compute low.
  2. Greater flexibility: asymmetric convolutions and the modular design let the model capture multi-scale features more flexibly and adapt to different input sizes.
  3. More stable training: batch normalization and label smoothing mitigate overfitting and speed up convergence (a label-smoothing sketch follows this list).
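Recent PyTorch releases (1.10+) expose label smoothing directly on the loss, so a minimal sketch needs no custom code. Note the training code later in this post uses plain CrossEntropyLoss; this block is illustrative only:

import torch
import torch.nn as nn

# label_smoothing=0.1 blends the one-hot target with a uniform distribution:
# roughly 0.9 stays on the true class and the remaining 0.1 is spread over all classes
loss_fn = nn.CrossEntropyLoss(label_smoothing=0.1)

logits  = torch.randn(8, 4)            # a batch of 8 samples, 4 weather classes
targets = torch.randint(0, 4, (8,))
print(loss_fn(logits, targets))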

Program structure diagram

[Figure: program structure diagram]

II. Preparation

1. Import libraries

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import transforms, datasets
import torchsummary as summary   # prints the layer-by-layer model summary below
import os, PIL, random, pathlib
from PIL import Image
import matplotlib.pyplot as plt

# hide warnings
import warnings
warnings.filterwarnings("ignore")

# run on GPU when available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

2. Load the data

data_dir = './data/4-data/'
data_dir = pathlib.Path(data_dir)

data_paths = list(data_dir.glob('*'))
classeNames = [path.name for path in data_paths]  # class names = sub-folder names (works on any OS)
# print(classeNames)

total_datadir = './data/4-data/'

train_transforms = transforms.Compose([
    transforms.Resize([299, 299]),  # Inception v3 expects 299x299 inputs (224x224 would break the final 8x8 pooling)
    transforms.ToTensor(),          # convert the PIL Image / numpy.ndarray to a tensor scaled to [0, 1]
    transforms.Normalize(           # standardize the channels so the model converges more easily
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])  # standard ImageNet channel statistics
])

total_data = datasets.ImageFolder(total_datadir, transform=train_transforms)

3. Split the dataset

train_size = int(0.8 * len(total_data))
test_size  = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])

batch_size = 32

train_dl = torch.utils.data.DataLoader(train_dataset,
                                       batch_size=batch_size,
                                       shuffle=True,
                                       num_workers=1)
test_dl = torch.utils.data.DataLoader(test_dataset,
                                      batch_size=batch_size,
                                      shuffle=True,
                                      num_workers=1)
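A quick sanity check (optional, not in the original post) confirms that the loaders yield batches of the expected shape:

for X, y in train_dl:
    print("Shape of X [N, C, H, W]:", X.shape)  # expected: torch.Size([32, 3, 299, 299])
    print("Shape of y:", y.shape, y.dtype)
    break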

III. Model Design

1. Building the network

class BasicConv2d(nn.Module):
    """Conv2d + BatchNorm + ReLU, the basic building block used by every branch."""
    def __init__(self, in_channels, out_channels, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
        self.bn = nn.BatchNorm2d(out_channels, eps=0.001)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        return F.relu(x, inplace=True)


class InceptionA(nn.Module):
    """35x35 module: 1x1, 5x5, double-3x3 and pooling branches."""
    def __init__(self, in_channels, pool_features):
        super(InceptionA, self).__init__()
        self.branch1x1 = BasicConv2d(in_channels, 64, kernel_size=1)

        self.branch5x5_1 = BasicConv2d(in_channels, 48, kernel_size=1)
        self.branch5x5_2 = BasicConv2d(48, 64, kernel_size=5, padding=2)

        self.branch3x3dbl_1 = BasicConv2d(in_channels, 64, kernel_size=1)
        self.branch3x3dbl_2 = BasicConv2d(64, 96, kernel_size=3, padding=1)
        self.branch3x3dbl_3 = BasicConv2d(96, 96, kernel_size=3, padding=1)

        self.branch_pool = BasicConv2d(in_channels, pool_features, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch5x5, branch3x3dbl, branch_pool]
        return torch.cat(outputs, 1)


class InceptionB(nn.Module):
    """17x17 module: 7x7 convolutions factorized into 1x7 and 7x1 pairs."""
    def __init__(self, in_channels, channels_7x7):
        super(InceptionB, self).__init__()
        self.branch1x1 = BasicConv2d(in_channels, 192, kernel_size=1)

        c7 = channels_7x7
        self.branch7x7_1 = BasicConv2d(in_channels, c7, kernel_size=1)
        self.branch7x7_2 = BasicConv2d(c7, c7, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7_3 = BasicConv2d(c7, 192, kernel_size=(7, 1), padding=(3, 0))

        self.branch7x7dbl_1 = BasicConv2d(in_channels, c7, kernel_size=1)
        self.branch7x7dbl_2 = BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7dbl_3 = BasicConv2d(c7, c7, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7dbl_4 = BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7dbl_5 = BasicConv2d(c7, 192, kernel_size=(1, 7), padding=(0, 3))

        self.branch_pool = BasicConv2d(in_channels, 192, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch7x7 = self.branch7x7_1(x)
        branch7x7 = self.branch7x7_2(branch7x7)
        branch7x7 = self.branch7x7_3(branch7x7)

        branch7x7dbl = self.branch7x7dbl_1(x)
        branch7x7dbl = self.branch7x7dbl_2(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_3(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_4(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_5(branch7x7dbl)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch7x7, branch7x7dbl, branch_pool]
        return torch.cat(outputs, 1)


class InceptionC(nn.Module):
    """8x8 module: expanded filter bank with parallel 1x3 / 3x1 sub-branches."""
    def __init__(self, in_channels):
        super(InceptionC, self).__init__()
        self.branch1x1 = BasicConv2d(in_channels, 320, kernel_size=1)

        self.branch3x3_1 = BasicConv2d(in_channels, 384, kernel_size=1)
        self.branch3x3_2a = BasicConv2d(384, 384, kernel_size=(1, 3), padding=(0, 1))
        self.branch3x3_2b = BasicConv2d(384, 384, kernel_size=(3, 1), padding=(1, 0))

        self.branch3x3dbl_1 = BasicConv2d(in_channels, 448, kernel_size=1)
        self.branch3x3dbl_2 = BasicConv2d(448, 384, kernel_size=3, padding=1)
        self.branch3x3dbl_3a = BasicConv2d(384, 384, kernel_size=(1, 3), padding=(0, 1))
        self.branch3x3dbl_3b = BasicConv2d(384, 384, kernel_size=(3, 1), padding=(1, 0))

        self.branch_pool = BasicConv2d(in_channels, 192, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch3x3 = self.branch3x3_1(x)
        branch3x3 = [
            self.branch3x3_2a(branch3x3),
            self.branch3x3_2b(branch3x3),
        ]
        branch3x3 = torch.cat(branch3x3, 1)

        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = [
            self.branch3x3dbl_3a(branch3x3dbl),
            self.branch3x3dbl_3b(branch3x3dbl),
        ]
        branch3x3dbl = torch.cat(branch3x3dbl, 1)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch3x3, branch3x3dbl, branch_pool]
        return torch.cat(outputs, 1)


class ReductionA(nn.Module):
    """Grid size reduction 35x35 -> 17x17: stride-2 conv and pooling in parallel."""
    def __init__(self, in_channels):
        super(ReductionA, self).__init__()
        self.branch3x3 = BasicConv2d(in_channels, 384, kernel_size=3, stride=2)

        self.branch3x3dbl_1 = BasicConv2d(in_channels, 64, kernel_size=1)
        self.branch3x3dbl_2 = BasicConv2d(64, 96, kernel_size=3, padding=1)
        self.branch3x3dbl_3 = BasicConv2d(96, 96, kernel_size=3, stride=2)

    def forward(self, x):
        branch3x3 = self.branch3x3(x)

        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

        branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)

        outputs = [branch3x3, branch3x3dbl, branch_pool]
        return torch.cat(outputs, 1)


class ReductionB(nn.Module):
    """Grid size reduction 17x17 -> 8x8."""
    def __init__(self, in_channels):
        super(ReductionB, self).__init__()
        self.branch3x3_1 = BasicConv2d(in_channels, 192, kernel_size=1)
        self.branch3x3_2 = BasicConv2d(192, 320, kernel_size=3, stride=2)

        self.branch7x7x3_1 = BasicConv2d(in_channels, 192, kernel_size=1)
        self.branch7x7x3_2 = BasicConv2d(192, 192, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7x3_3 = BasicConv2d(192, 192, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7x3_4 = BasicConv2d(192, 192, kernel_size=3, stride=2)

    def forward(self, x):
        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)

        branch7x7x3 = self.branch7x7x3_1(x)
        branch7x7x3 = self.branch7x7x3_2(branch7x7x3)
        branch7x7x3 = self.branch7x7x3_3(branch7x7x3)
        branch7x7x3 = self.branch7x7x3_4(branch7x7x3)

        branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)

        outputs = [branch3x3, branch7x7x3, branch_pool]
        return torch.cat(outputs, 1)


class InceptionAux(nn.Module):
    """Auxiliary classifier on the 17x17 feature map (used when aux_logits=True)."""
    def __init__(self, in_channels, num_classes):
        super(InceptionAux, self).__init__()
        self.conv0 = BasicConv2d(in_channels, 128, kernel_size=1)
        self.conv1 = BasicConv2d(128, 768, kernel_size=5)
        self.conv1.stddev = 0.01
        self.fc = nn.Linear(768, num_classes)
        self.fc.stddev = 0.001

    def forward(self, x):
        # 17 x 17 x 768
        x = F.avg_pool2d(x, kernel_size=5, stride=3)
        # 5 x 5 x 768
        x = self.conv0(x)
        # 5 x 5 x 128
        x = self.conv1(x)
        # 1 x 1 x 768
        x = x.view(x.size(0), -1)
        # 768
        x = self.fc(x)
        # num_classes
        return x


class InceptionV3(nn.Module):
    def __init__(self, num_classes=1000, aux_logits=False, transform_input=False):
        super(InceptionV3, self).__init__()
        self.aux_logits = aux_logits
        self.transform_input = transform_input

        # stem
        self.Conv2d_1a_3x3 = BasicConv2d(3, 32, kernel_size=3, stride=2)
        self.Conv2d_2a_3x3 = BasicConv2d(32, 32, kernel_size=3)
        self.Conv2d_2b_3x3 = BasicConv2d(32, 64, kernel_size=3, padding=1)
        self.Conv2d_3b_1x1 = BasicConv2d(64, 80, kernel_size=1)
        self.Conv2d_4a_3x3 = BasicConv2d(80, 192, kernel_size=3)

        # 35 x 35 modules
        self.Mixed_5b = InceptionA(192, pool_features=32)
        self.Mixed_5c = InceptionA(256, pool_features=64)
        self.Mixed_5d = InceptionA(288, pool_features=64)

        # 17 x 17 modules
        self.Mixed_6a = ReductionA(288)
        self.Mixed_6b = InceptionB(768, channels_7x7=128)
        self.Mixed_6c = InceptionB(768, channels_7x7=160)
        self.Mixed_6d = InceptionB(768, channels_7x7=160)
        self.Mixed_6e = InceptionB(768, channels_7x7=192)
        if aux_logits:
            self.AuxLogits = InceptionAux(768, num_classes)

        # 8 x 8 modules
        self.Mixed_7a = ReductionB(768)
        self.Mixed_7b = InceptionC(1280)
        self.Mixed_7c = InceptionC(2048)

        self.fc = nn.Linear(2048, num_classes)

    def forward(self, x):
        if self.transform_input:
            # re-scale inputs from ImageNet statistics to the [-1, 1] range
            x = x.clone()
            x[:, 0] = x[:, 0] * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
            x[:, 1] = x[:, 1] * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
            x[:, 2] = x[:, 2] * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        # 299 x 299 x 3
        x = self.Conv2d_1a_3x3(x)
        # 149 x 149 x 32
        x = self.Conv2d_2a_3x3(x)
        # 147 x 147 x 32
        x = self.Conv2d_2b_3x3(x)
        # 147 x 147 x 64
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        # 73 x 73 x 64
        x = self.Conv2d_3b_1x1(x)
        # 73 x 73 x 80
        x = self.Conv2d_4a_3x3(x)
        # 71 x 71 x 192
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        # 35 x 35 x 192
        x = self.Mixed_5b(x)
        # 35 x 35 x 256
        x = self.Mixed_5c(x)
        # 35 x 35 x 288
        x = self.Mixed_5d(x)
        # 35 x 35 x 288
        x = self.Mixed_6a(x)
        # 17 x 17 x 768
        x = self.Mixed_6b(x)
        # 17 x 17 x 768
        x = self.Mixed_6c(x)
        # 17 x 17 x 768
        x = self.Mixed_6d(x)
        # 17 x 17 x 768
        x = self.Mixed_6e(x)
        # 17 x 17 x 768
        if self.training and self.aux_logits:
            aux = self.AuxLogits(x)
        x = self.Mixed_7a(x)
        # 8 x 8 x 1280
        x = self.Mixed_7b(x)
        # 8 x 8 x 2048
        x = self.Mixed_7c(x)
        # 8 x 8 x 2048
        x = F.avg_pool2d(x, kernel_size=8)
        # 1 x 1 x 2048
        x = F.dropout(x, training=self.training)
        x = x.view(x.size(0), -1)
        # 2048
        x = self.fc(x)
        # num_classes
        if self.training and self.aux_logits:
            return x, aux
        return x

# model = InceptionV3().to(device)
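A quick smoke test (not in the original post) pushes a dummy batch through the assembled network; with aux_logits left False and the module in eval mode, the output is a plain logits tensor:

net = InceptionV3(num_classes=4).to(device)         # 4 is a placeholder class count
net.eval()
with torch.no_grad():
    dummy = torch.randn(1, 3, 299, 299).to(device)  # Inception v3 expects 299x299 inputs
    print(net(dummy).shape)                         # expected: torch.Size([1, 4])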

2. Loss function and hyperparameters

# Change 3: instantiate the InceptionV3 model
model = InceptionV3(
    num_classes=len(classeNames),  # match the number of classes in the dataset
    aux_logits=False,              # disable the auxiliary classifier
    transform_input=False          # disable the built-in input re-scaling
).to(device)
print(summary.summary(model, (3, 299, 299)))

loss_fn = nn.CrossEntropyLoss()
learn_rate = 1e-4
# Change 4: Adam is recommended as the optimizer here
opt = torch.optim.Adam(model.parameters(), lr=learn_rate, weight_decay=1e-4)
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 32, 149, 149]             864
       BatchNorm2d-2         [-1, 32, 149, 149]              64
       BasicConv2d-3         [-1, 32, 149, 149]               0
            Conv2d-4         [-1, 32, 147, 147]           9,216
       BatchNorm2d-5         [-1, 32, 147, 147]              64
       BasicConv2d-6         [-1, 32, 147, 147]               0
            Conv2d-7         [-1, 64, 147, 147]          18,432
       BatchNorm2d-8         [-1, 64, 147, 147]             128
       BasicConv2d-9         [-1, 64, 147, 147]               0
               ...                        ...                ...
     BatchNorm2d-288            [-1, 384, 8, 8]             768
     BasicConv2d-289            [-1, 384, 8, 8]               0
          Conv2d-290            [-1, 192, 8, 8]         393,216
     BatchNorm2d-291            [-1, 192, 8, 8]             384
     BasicConv2d-292            [-1, 192, 8, 8]               0
      InceptionC-293           [-1, 2048, 8, 8]               0
          Linear-294                    [-1, 4]           8,196
================================================================
Total params: 21,793,764
Trainable params: 21,793,764
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 1.02
Forward/backward pass size (MB): 224.12
Params size (MB): 83.14
Estimated Total Size (MB): 308.28
----------------------------------------------------------------

3. Training function

def train(dataloader, model, loss_fn, optimizer):
    model.train()
    total_loss, correct = 0, 0
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)

        pred = model(X)              # forward pass
        loss = loss_fn(pred, y)      # compute the loss

        optimizer.zero_grad()        # clear accumulated gradients
        loss.backward()              # backpropagation
        optimizer.step()             # update the weights

        total_loss += loss.item()
        correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    return correct / len(dataloader.dataset), total_loss / len(dataloader)

4. Test function

def test(dataloader, model, loss_fn):
    model.eval()
    total_loss, correct = 0, 0
    with torch.no_grad():            # no gradients needed for evaluation
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            total_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    return correct / len(dataloader.dataset), total_loss / len(dataloader)

5. Preprocessing for local images

def preprocess_image(image_path):
    transform = transforms.Compose([
        transforms.Resize([299, 299]),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])
    image = Image.open(image_path).convert('RGB')  # make sure the image is in RGB format
    return transform(image).unsqueeze(0).to(device)

6. Main

if __name__ == '__main__':
    epochs = 10
    train_acc, train_loss = [], []
    test_acc, test_loss = [], []

    for epoch in range(epochs):
        epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, opt)
        epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

        # record the metrics
        train_acc.append(epoch_train_acc)
        train_loss.append(epoch_train_loss)
        test_acc.append(epoch_test_acc)
        test_loss.append(epoch_test_loss)

        # print progress
        print(f"Epoch {epoch + 1}/{epochs}")
        print(f"Train Acc: {epoch_train_acc * 100:.2f}% | Test Acc: {epoch_test_acc * 100:.2f}%")
        print(f"Train Loss: {epoch_train_loss:.4f} | Test Loss: {epoch_test_loss:.4f}\n")

    # save the model weights
    torch.save(model.state_dict(), "inceptionv3_weather.pth")

    # sample prediction on a local image
    model.eval()
    test_image = preprocess_image('./data/img.jpg')
    with torch.no_grad():
        output = model(test_image)
    prediction = classeNames[output.argmax().item()]
    print(f"\nFinal Prediction: {prediction}")

Results

Epoch 1/10
Train Acc: 75.33% | Test Acc: 24.89%
Train Loss: 0.6807 | Test Loss: 3.5399
Epoch 2/10
Train Acc: 89.00% | Test Acc: 93.33%
Train Loss: 0.3573 | Test Loss: 0.1631
Epoch 3/10
Train Acc: 90.33% | Test Acc: 94.67%
Train Loss: 0.3525 | Test Loss: 0.1844
Epoch 4/10
Train Acc: 89.78% | Test Acc: 93.78%
Train Loss: 0.3061 | Test Loss: 0.2160
Epoch 5/10
Train Acc: 91.44% | Test Acc: 95.56%
Train Loss: 0.2578 | Test Loss: 0.1473
Epoch 6/10
Train Acc: 93.67% | Test Acc: 90.22%
Train Loss: 0.2180 | Test Loss: 0.3774
Epoch 7/10
Train Acc: 92.78% | Test Acc: 96.44%
Train Loss: 0.2151 | Test Loss: 0.0971
Epoch 8/10
Train Acc: 94.89% | Test Acc: 94.67%
Train Loss: 0.1631 | Test Loss: 0.1472
Epoch 9/10
Train Acc: 92.56% | Test Acc: 94.22%
Train Loss: 0.2332 | Test Loss: 0.1486
Epoch 10/10
Train Acc: 93.89% | Test Acc: 95.11%
Train Loss: 0.2795 | Test Loss: 0.1317
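The accuracy and loss curves can be plotted from the recorded lists with a short matplotlib block (a sketch; the original plotting code was not included in the post):

plt.figure(figsize=(12, 4))

plt.subplot(1, 2, 1)
plt.plot(train_acc, label='Train Acc')
plt.plot(test_acc, label='Test Acc')
plt.xlabel('Epoch'); plt.ylabel('Accuracy'); plt.legend(); plt.title('Accuracy')

plt.subplot(1, 2, 2)
plt.plot(train_loss, label='Train Loss')
plt.plot(test_loss, label='Test Loss')
plt.xlabel('Epoch'); plt.ylabel('Loss'); plt.legend(); plt.title('Loss')

plt.show()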

