
"Python Planet Diary" Day 62: Comprehensive Image Project (Cat vs. Dog Classification)

web 2025/7/2 13:41:25

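Quote of the day: "The road ahead is long and far; I shall search high and low." (Qu Yuan, "Li Sao")
Author: Code_流苏 (CSDN), a coder who loves classical poetry and programming 😊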

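Contents

I. Project Overview
1. Project Background and Significance

II. Requirements Analysis
1. Functional Requirements
2. Performance Requirements
3. Technical Roadmap

III. Data Preparation and Augmentation
1. Choosing the Dataset
2. Data Exploration and Analysis
3. Data Preprocessing
4. Data Augmentation

IV. Model Selection and Training
1. A Custom CNN Model
2. Using Transfer Learning
3. Model Training

V. Visualizing and Evaluating Results
1. Visualizing the Training Process
2. Model Evaluation
3. Visualizing Predictions

VI. Model Deployment Suggestions
1. Exporting the Model
2. Deployment Options
3. Deployment Example: a Flask Web App
4. Other Deployment Considerations

VII. Summary and Extensions
1. Project Summary
2. Directions for Extension
3. Recommended Learning Resources

VIII. References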

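👋 About the column: Introduction to the "Python Planet Diary" column (continuously updated)
✅ Previous post: "Python Planet Diary" Day 61: Deep Reinforcement Learning Basics (DQN and PPO)

Hello everyone, and welcome to Day 62 on Python Planet! 🪐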

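I. Project Overview

Having covered convolutional neural networks, transfer learning, and the fundamentals of deep learning in earlier posts, today we put them together in a practical project: building a cat vs. dog image classifier. The project consolidates what we have learned and walks through a complete deep learning workflow, from data preparation to model deployment.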


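1. Project Background and Significance

Image classification is a basic and important task in computer vision. Cat vs. dog classification is a classic introductory deep learning example: it is simple to set up, and it shows directly what a convolutional neural network can do. After completing this project, you will be able to:

Apply data preprocessing and augmentation techniques
Choose and apply a suitable deep learning model
Evaluate model performance and visualize the results
Deploy a trained model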

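II. Requirements Analysis

Our goal is to build a deep learning system that accurately distinguishes pictures of cats from pictures of dogs. The specific requirements are as follows:

1. Functional Requirements

Input: pictures of cats or dogs taken from various angles and under various lighting conditions
Processing: a deep learning model analyzes the image features
Output: a prediction of whether the picture shows a cat or a dog, together with a confidence score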
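
The output requirement can be made concrete with a small helper. This is a minimal sketch, assuming the model emits a single sigmoid probability where values near 0 mean cat and values near 1 mean dog (the convention used later in this post). The function name `to_label` and the 0.5 threshold are illustrative choices, not part of any library.

```python
def to_label(prob_dog, threshold=0.5):
    """Map a sigmoid output to (label, confidence).

    prob_dog is assumed to be the model's estimate of P(dog).
    """
    if not 0.0 <= prob_dog <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if prob_dog < threshold:
        # Confidence for the cat class is the complement of P(dog)
        return "cat", 1.0 - prob_dog
    return "dog", prob_dog


label, confidence = to_label(0.125)
print(label, confidence)  # cat 0.875
```

Reporting the complement for the cat class keeps the confidence at 0.5 or above for whichever label is shown.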

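2. Performance Requirements

High classification accuracy on images the model has not seen during training
Fast inference, suitable for real-time use
A moderate model size that is easy to deploy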
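
Whether inference is "fast enough" is easier to judge with a measurement. The sketch below times repeated calls to a prediction function and reports the median and worst latency. `measure_latency` and `dummy_predict` are hypothetical names, and the dummy function only stands in for a real `model.predict` call.

```python
import statistics
import time


def measure_latency(predict_fn, sample, warmup=3, runs=20):
    """Return the median and worst wall-clock latency of predict_fn(sample) in ms."""
    for _ in range(warmup):  # warm-up calls are not timed
        predict_fn(sample)
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict_fn(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {"median_ms": statistics.median(timings_ms), "max_ms": max(timings_ms)}


# Stand-in for model.predict; replace with the real call once a model is trained.
def dummy_predict(x):
    return sum(x) / len(x)


stats = measure_latency(dummy_predict, [0.1] * 1000)
print(stats)
```

The median is less sensitive to one-off stalls than the mean, which is why it is reported alongside the maximum.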

3. Technical Roadmap

[Figure: technical roadmap diagram]

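III. Data Preparation and Augmentation

1. Choosing the Dataset

In this project we use the classic cat and dog dataset, Kaggle Dogs vs. Cats. It contains 25,000 images labeled as cat or dog, with varying resolution and quality, which makes it well suited to training our model.

# Download the dataset (requires a configured Kaggle API token)
!kaggle competitions download -c dogs-vs-cats
!unzip dogs-vs-cats.zip -d data/
# The competition archive contains a nested train.zip; extract it as well
!unzip -q data/train.zip -d data/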

2. Data Exploration and Analysis

Before processing the data, let's get a basic picture of the dataset:

import os
import random

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

# Set the data paths
data_dir = 'data/train'
cats_dir = os.path.join(data_dir, 'cats')
dogs_dir = os.path.join(data_dir, 'dogs')

# Create one folder per class
os.makedirs(cats_dir, exist_ok=True)
os.makedirs(dogs_dir, exist_ok=True)

# Move the raw images into the matching class folder
for img in os.listdir(data_dir):
    src = os.path.join(data_dir, img)
    if not os.path.isfile(src):
        continue  # skip the cats/ and dogs/ folders themselves
    if img.startswith('cat'):
        os.rename(src, os.path.join(cats_dir, img))
    elif img.startswith('dog'):
        os.rename(src, os.path.join(dogs_dir, img))

# Count the images
num_cats = len(os.listdir(cats_dir))
num_dogs = len(os.listdir(dogs_dir))
print(f"Number of cat images: {num_cats}")
print(f"Number of dog images: {num_dogs}")

# Show a few sample images
plt.figure(figsize=(12, 6))
for i in range(3):
    # Pick a random cat image
    cat_img = random.choice(os.listdir(cats_dir))
    cat_path = os.path.join(cats_dir, cat_img)
    plt.subplot(2, 3, i+1)
    plt.imshow(np.array(Image.open(cat_path)))
    plt.title('Cat')
    plt.axis('off')

    # Pick a random dog image
    dog_img = random.choice(os.listdir(dogs_dir))
    dog_path = os.path.join(dogs_dir, dog_img)
    plt.subplot(2, 3, i+4)
    plt.imshow(np.array(Image.open(dog_path)))
    plt.title('Dog')
    plt.axis('off')

plt.tight_layout()
plt.show()
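
The class of each raw image is encoded in its file name, so the class balance can also be checked before any files are moved. This is a small sketch, assuming names follow a `<label>.<index>.jpg` pattern. The sample names are made up, and `count_by_prefix` is a hypothetical helper.

```python
from collections import Counter


def count_by_prefix(filenames):
    """Count files per class, taking the class from the text before the first dot."""
    return Counter(name.split(".", 1)[0] for name in filenames)


# Hypothetical file names following the assumed "<label>.<index>.jpg" pattern
sample_names = ["cat.0.jpg", "cat.1.jpg", "dog.0.jpg", "dog.1.jpg", "dog.2.jpg"]
counts = count_by_prefix(sample_names)
print(counts["cat"], counts["dog"])  # 2 3
```

On the real data, `os.listdir` on the image folder would supply the list of names.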

3. Data Preprocessing

Image preprocessing is a key step in getting good model performance. We start by splitting the images into training, validation, and test sets:

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
import shutil

# Create the training, validation, and test directories
train_dir = 'data/train_split'
valid_dir = 'data/valid_split'
test_dir = 'data/test_split'

for dir_path in [train_dir, valid_dir, test_dir]:
    for category in ['cats', 'dogs']:
        os.makedirs(os.path.join(dir_path, category), exist_ok=True)

# Split the dataset
cat_files = os.listdir(cats_dir)
dog_files = os.listdir(dogs_dir)

# train : validation : test = 7 : 2 : 1
# (hold out 30% first, then send one third of that hold-out to the test set)
cat_train, cat_temp = train_test_split(cat_files, test_size=0.3, random_state=42)
cat_valid, cat_test = train_test_split(cat_temp, test_size=1/3, random_state=42)

dog_train, dog_temp = train_test_split(dog_files, test_size=0.3, random_state=42)
dog_valid, dog_test = train_test_split(dog_temp, test_size=1/3, random_state=42)


# Copy the files into the matching directory
def copy_files(file_list, src_dir, dst_dir):
    for file in file_list:
        shutil.copy(os.path.join(src_dir, file), os.path.join(dst_dir, file))


copy_files(cat_train, cats_dir, os.path.join(train_dir, 'cats'))
copy_files(cat_valid, cats_dir, os.path.join(valid_dir, 'cats'))
copy_files(cat_test, cats_dir, os.path.join(test_dir, 'cats'))

copy_files(dog_train, dogs_dir, os.path.join(train_dir, 'dogs'))
copy_files(dog_valid, dogs_dir, os.path.join(valid_dir, 'dogs'))
copy_files(dog_test, dogs_dir, os.path.join(test_dir, 'dogs'))

print(f"Training set: cats {len(cat_train)}, dogs {len(dog_train)}")
print(f"Validation set: cats {len(cat_valid)}, dogs {len(dog_valid)}")
print(f"Test set: cats {len(cat_test)}, dogs {len(dog_test)}")
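
The two-stage split gives 7:2:1 because one third of the 30% hold-out is 10% of the data. The arithmetic can be checked with a short sketch. It assumes 12,500 images per class (the 25,000 images divided evenly), and `split_sizes` is a hypothetical helper. Rounding for counts that do not divide evenly may differ from what `train_test_split` does.

```python
from fractions import Fraction


def split_sizes(n_images, holdout=Fraction(3, 10), test_share=Fraction(1, 3)):
    """Sizes of train/valid/test after a two-stage split.

    Stage 1 holds out `holdout` of the data; stage 2 sends `test_share`
    of that hold-out to the test set and the rest to validation.
    """
    n_holdout = int(n_images * holdout)
    n_test = int(n_holdout * test_share)
    n_valid = n_holdout - n_test
    n_train = n_images - n_holdout
    return n_train, n_valid, n_test


# Assumes 12,500 images per class (25,000 images split evenly between cats and dogs).
train_n, valid_n, test_n = split_sizes(12_500)
print(train_n, valid_n, test_n)  # 8750 2500 1250
print((2 * train_n) // 32)       # 546 full batches per epoch at batch size 32
```

Exact fractions avoid floating-point surprises in the check itself.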

4. Data Augmentation

Data augmentation is an important technique for coping with limited data and improving generalization. Applying random transformations to the training images yields more varied training samples:

# Image preprocessing parameters
img_width, img_height = 224, 224  # resize every image to the same size
batch_size = 32

# Augmentation for the training data
train_datagen = ImageDataGenerator(
    rescale=1./255,             # scale pixel values to [0, 1]
    rotation_range=40,          # random rotation range in degrees
    width_shift_range=0.2,      # horizontal shift range
    height_shift_range=0.2,     # vertical shift range
    shear_range=0.2,            # shear intensity
    zoom_range=0.2,             # zoom range
    horizontal_flip=True,       # random horizontal flip
    fill_mode='nearest'         # how to fill newly exposed pixels
)

# Validation and test data only need rescaling
valid_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

# Create the data generators
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary'
)

valid_generator = valid_datagen.flow_from_directory(
    valid_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary'
)

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary',
    shuffle=False  # keep file order so predictions line up with test_generator.classes
)


# Show the effect of augmentation
def show_augmented_images():
    # Fetch one batch of training data
    x_batch, y_batch = next(train_generator)

    # Display a few augmented images
    plt.figure(figsize=(12, 8))
    for i in range(8):
        plt.subplot(2, 4, i+1)
        plt.imshow(x_batch[i])
        plt.title('Cat' if y_batch[i] < 0.5 else 'Dog')
        plt.axis('off')
    plt.tight_layout()
    plt.suptitle('Augmented training samples', y=0.98)
    plt.show()


show_augmented_images()

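The benefits of data augmentation include:

The model sees a different random variant of each image in every epoch, which effectively enlarges the training set and reduces the risk of overfitting
The model adapts better to images taken in varied conditions
The model becomes more robust to rotation, shifts, shear, and zoom; robustness to lighting changes would additionally need an option such as brightness_range, which the generator above does not set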
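
What a few of these transformations do is easiest to see on a tiny image. The sketch below works on plain nested lists rather than real image arrays, so it only illustrates the ideas behind `rescale`, `horizontal_flip`, and `fill_mode='nearest'`. It is not how Keras implements them, and the pixel values are made up.

```python
def rescale(image, max_value=255.0):
    """Scale pixel values to [0, 1], the same effect as rescale=1./255."""
    return [[pixel / max_value for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]


def shift_right(image, pixels):
    """Shift columns to the right, filling the gap by repeating the edge pixel
    (the idea behind fill_mode='nearest')."""
    return [[row[0]] * pixels + row[:len(row) - pixels] for row in image]


# A made-up 2x3 grayscale image with pixel values in 0..255
tiny = [[0, 51, 255],
        [102, 204, 153]]

print(horizontal_flip(tiny))  # [[255, 51, 0], [153, 204, 102]]
print(shift_right(tiny, 1))   # [[0, 0, 51], [102, 102, 204]]
print(rescale(tiny)[0][2])    # 1.0
```

Each function returns a new image and leaves the input unchanged, which mirrors how augmentation produces variants without altering the files on disk.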

IV. Model Selection and Training

In this project we look at two approaches: a custom CNN model and transfer learning.

1. A Custom CNN Model

First, let's build a simple but effective convolutional neural network:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
import time


# Build the model
def build_custom_cnn():
    model = Sequential([
        # First convolution block
        Conv2D(32, (3, 3), activation='relu', padding='same',
               input_shape=(img_width, img_height, 3)),
        MaxPooling2D(2, 2),

        # Second convolution block
        Conv2D(64, (3, 3), activation='relu', padding='same'),
        MaxPooling2D(2, 2),

        # Third convolution block
        Conv2D(128, (3, 3), activation='relu', padding='same'),
        MaxPooling2D(2, 2),

        # Fourth convolution block
        Conv2D(256, (3, 3), activation='relu', padding='same'),
        MaxPooling2D(2, 2),

        # Flatten the feature maps
        Flatten(),

        # Fully connected layers
        Dense(512, activation='relu'),
        Dropout(0.5),  # reduces overfitting
        Dense(1, activation='sigmoid')  # sigmoid output for binary classification
    ])

    # Compile the model
    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model


# Create the custom CNN model
custom_model = build_custom_cnn()
custom_model.summary()
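
The parameter counts that `summary()` reports can be reproduced by hand, which helps show where the model's size comes from. This is a worked sketch for the architecture above, assuming 224x224 RGB inputs. The helper names are illustrative.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a Dense layer."""
    return (in_units + 1) * out_units


size, channels, total = 224, 3, 0
for filters in (32, 64, 128, 256):
    total += conv2d_params(3, channels, filters)
    channels = filters
    size //= 2  # 'same' padding keeps the size; each 2x2 max-pool halves it

flat_units = size * size * channels          # 14 * 14 * 256 = 50176
total += dense_params(flat_units, 512)       # first fully connected layer
total += dense_params(512, 1)                # sigmoid output unit

print(size, flat_units, total)  # 14 50176 26079553
```

Almost all of the roughly 26 million parameters sit in the first Dense layer after Flatten, which is worth keeping in mind given the requirement for a moderate model size.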


2. Using Transfer Learning

Transfer learning reuses the knowledge in a pretrained model to solve a new problem. It is particularly effective for image classification:

from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D


# Use VGG16 as the base model
def build_transfer_model():
    # Load VGG16 pretrained on ImageNet, without its top classifier
    base_model = VGG16(weights='imagenet', include_top=False,
                       input_shape=(img_width, img_height, 3))

    # Freeze the pretrained weights
    for layer in base_model.layers:
        layer.trainable = False

    # Stack a new classification head on top
    model = Sequential([
        base_model,
        GlobalAveragePooling2D(),
        Dense(256, activation='relu'),
        Dropout(0.5),
        Dense(1, activation='sigmoid')
    ])

    # Compile the model
    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model


# Create the transfer learning model
transfer_model = build_transfer_model()
transfer_model.summary()
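
Freezing the base model means only the small new head is trained. The sketch below works out both counts by hand for the VGG16 convolutional base (13 convolution layers with 3x3 kernels) and the head defined above. It is plain arithmetic rather than a call into Keras, and the helper name is illustrative.

```python
def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter for a square-kernel convolution."""
    return (kernel * kernel * in_channels + 1) * filters


# Filters of the 13 convolution layers in VGG16, in order
VGG16_FILTERS = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]

frozen, channels = 0, 3
for filters in VGG16_FILTERS:
    frozen += conv_params(channels, filters)
    channels = filters

# Head: GlobalAveragePooling2D (no weights) -> Dense(256) -> Dropout -> Dense(1)
trainable = (channels + 1) * 256 + (256 + 1) * 1

print(frozen)     # 14714688
print(trainable)  # 131585
```

Fewer than 1% of the weights are updated during training, which is one reason transfer learning usually converges in fewer epochs.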


3. Model Training

Next, we train both models on the prepared dataset and compare their performance:

# Build a fresh set of callbacks per model so the checkpoints don't overwrite each other
def make_callbacks(checkpoint_path):
    return [
        EarlyStopping(patience=5, restore_best_weights=True),
        ModelCheckpoint(checkpoint_path, save_best_only=True),
        ReduceLROnPlateau(factor=0.1, patience=3, min_lr=1e-6)
    ]


# Train the custom CNN model
print("Training the custom CNN model...")
start_time = time.time()
custom_history = custom_model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    epochs=30,
    validation_data=valid_generator,
    validation_steps=valid_generator.samples // batch_size,
    callbacks=make_callbacks('best_custom_model.h5')
)
custom_train_time = time.time() - start_time
print(f"Training time: {custom_train_time:.2f} s")

# Train the transfer learning model
print("\nTraining the transfer learning model...")
start_time = time.time()
transfer_history = transfer_model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    epochs=15,  # transfer learning usually needs fewer epochs
    validation_data=valid_generator,
    validation_steps=valid_generator.samples // batch_size,
    callbacks=make_callbacks('best_transfer_model.h5')
)
transfer_train_time = time.time() - start_time
print(f"Training time: {transfer_train_time:.2f} s")
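
The early-stopping callback with `patience=5` ends training once the validation loss has gone five epochs without improving. The sketch below is a simplified model of that rule, not the Keras implementation. It ignores options such as `min_delta`, the loss values are made up, and `early_stopping_epoch` is a hypothetical helper.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once `patience` consecutive epochs fail to improve on the
    best validation loss seen so far; if that never happens, stop_epoch is
    the last epoch.
    """
    best_loss, best_epoch, wait = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Made-up validation losses: improvement stalls after epoch 3
losses = [0.70, 0.55, 0.48, 0.45, 0.46, 0.47, 0.46, 0.49, 0.50, 0.52]
print(early_stopping_epoch(losses, patience=5))  # (8, 3)
```

With `restore_best_weights=True`, the weights kept at the end would be those from the best epoch (epoch 3 here), not the epoch at which training stopped.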

V. Visualizing and Evaluating Results

1. Visualizing the Training Process

Once training is finished, we plot the training curves to analyze how each model learned and how well it performs:

def plot_training_history(history, title):
    plt.figure(figsize=(12, 5))

    # Training and validation accuracy
    plt.subplot(1, 2, 1)
    plt.plot(history.history['accuracy'], label='Training accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation accuracy')
    plt.title(f'{title} - Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()

    # Training and validation loss
    plt.subplot(1, 2, 2)
    plt.plot(history.history['loss'], label='Training loss')
    plt.plot(history.history['val_loss'], label='Validation loss')
    plt.title(f'{title} - Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()

    plt.tight_layout()
    plt.show()


# Visualize the training process
plot_training_history(custom_history, 'Custom CNN model')
plot_training_history(transfer_history, 'Transfer learning model')

2. Model Evaluation

Next, we evaluate both models thoroughly on the test set:

from sklearn.metrics import classification_report, confusion_matrix, roc_curve, auc
import seaborn as sns


def evaluate_model(model, generator, model_name):
    # Get predictions. The generator must have been created with shuffle=False,
    # otherwise the predictions will not line up with generator.classes.
    generator.reset()
    y_pred_prob = model.predict(generator).ravel()
    y_pred = (y_pred_prob > 0.5).astype(int)
    y_true = generator.classes

    # Compute the confusion matrix
    cm = confusion_matrix(y_true, y_pred)

    # Print the classification report
    print(f"\n{model_name} test set results:")
    print(classification_report(y_true, y_pred, target_names=['Cat', 'Dog']))

    # Plot the confusion matrix
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                xticklabels=['Cat', 'Dog'],
                yticklabels=['Cat', 'Dog'])
    plt.title(f'{model_name} - Confusion matrix')
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.show()

    # Plot the ROC curve
    fpr, tpr, _ = roc_curve(y_true, y_pred_prob)
    roc_auc = auc(fpr, tpr)

    plt.figure(figsize=(8, 6))
    plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (AUC = {roc_auc:.3f})')
    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False positive rate')
    plt.ylabel('True positive rate')
    plt.title(f'{model_name} - ROC curve')
    plt.legend(loc="lower right")
    plt.show()

    return roc_auc


# Evaluate both models
custom_auc = evaluate_model(custom_model, test_generator, 'Custom CNN model')
transfer_auc = evaluate_model(transfer_model, test_generator, 'Transfer learning model')

# Compare the two models
print("\nModel comparison:")
print(f"{'Model':<26} {'Test AUC':<12} {'Training time (s)'}")
print(f"{'-'*58}")
print(f"{'Custom CNN model':<26} {custom_auc:<12.4f} {custom_train_time:.2f}")
print(f"{'Transfer learning model':<26} {transfer_auc:<12.4f} {transfer_train_time:.2f}")
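
The numbers in the classification report all derive from the four cells of the confusion matrix. The sketch below computes them by hand on a handful of made-up labels and probabilities, treating dog (label 1) as the positive class. `confusion_counts` is a hypothetical helper and not a substitute for the scikit-learn functions used above.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tn, fp, fn, tp) with dog (label 1) as the positive class."""
    tn = fp = fn = tp = 0
    for truth, prob in zip(y_true, y_prob):
        pred = 1 if prob > threshold else 0
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Made-up labels (0 = cat, 1 = dog) and predicted dog probabilities
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.4, 0.6, 0.2, 0.9, 0.8, 0.3, 0.7]

tn, fp, fn, tp = confusion_counts(y_true, y_prob)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / len(y_true)
print(tn, fp, fn, tp)                 # 3 1 1 3
print(precision, recall, accuracy)    # 0.75 0.75 0.75
```

Moving the threshold trades false positives against false negatives, and that trade-off is what the ROC curve traces out.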


3. Visualizing Predictions

Let's run predictions on a few test images and visualize the results:

from tensorflow.keras.preprocessing.image import load_img, img_to_array


def visualize_predictions(model, test_dir, model_name, num_samples=8):
    # Prepare the data
    test_imgs = []
    test_labels = []

    # Randomly pick cat and dog images
    cat_dir = os.path.join(test_dir, 'cats')
    dog_dir = os.path.join(test_dir, 'dogs')
    cat_files = random.sample(os.listdir(cat_dir), num_samples//2)
    dog_files = random.sample(os.listdir(dog_dir), num_samples//2)

    # Load and preprocess the images
    for img_file in cat_files:
        img = load_img(os.path.join(cat_dir, img_file), target_size=(img_width, img_height))
        img_array = img_to_array(img) / 255.0
        test_imgs.append(img_array)
        test_labels.append(0)  # 0 means cat

    for img_file in dog_files:
        img = load_img(os.path.join(dog_dir, img_file), target_size=(img_width, img_height))
        img_array = img_to_array(img) / 255.0
        test_imgs.append(img_array)
        test_labels.append(1)  # 1 means dog

    # Convert to a numpy array
    test_imgs = np.array(test_imgs)

    # Predict
    predictions = model.predict(test_imgs)

    # Visualize the results
    plt.figure(figsize=(16, 8))
    for i in range(num_samples):
        plt.subplot(2, num_samples//2, i+1)
        plt.imshow(test_imgs[i])

        true_label = "Cat" if test_labels[i] == 0 else "Dog"
        pred_prob = predictions[i][0]
        pred_label = "Cat" if pred_prob < 0.5 else "Dog"

        title = f"True: {true_label}\nPredicted: {pred_label} ({pred_prob:.2f})"
        color = "green" if true_label == pred_label else "red"
        plt.title(title, color=color)
        plt.axis('off')

    plt.suptitle(f"{model_name} - Sample predictions", fontsize=16)
    plt.tight_layout()
    plt.subplots_adjust(top=0.9)
    plt.show()


# Visualize the predictions
visualize_predictions(custom_model, test_dir, 'Custom CNN model')
visualize_predictions(transfer_model, test_dir, 'Transfer learning model')

VI. Model Deployment Suggestions

Once a well-performing model has been trained, it needs to be deployed in a real application. Here are some common approaches:

1. Exporting the Model

The export format depends on the framework:

# Save a TensorFlow/Keras model in .h5 format
def save_model(model, model_name):
    model_path = f"{model_name}.h5"
    model.save(model_path)
    print(f"Model saved to: {model_path}")


# Save both models
save_model(custom_model, "custom_cat_dog_classifier")
save_model(transfer_model, "transfer_cat_dog_classifier")

# With PyTorch, a model could be saved like this instead:
# import torch
# # assuming model is a PyTorch model
# torch.save(model.state_dict(), 'cat_dog_classifier.pt')

2. Deployment Options

3. Deployment Example: a Flask Web App

Below is an example of a simple web application built with Flask:

import base64
import io

import numpy as np
from flask import Flask, render_template, request
from PIL import Image
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import img_to_array

app = Flask(__name__)

# Load the model once at startup
model = load_model('transfer_cat_dog_classifier.h5')


# Image preprocessing: same size and scaling as during training
def preprocess_image(image, target_size):
    if image.mode != "RGB":
        image = image.convert("RGB")
    image = image.resize(target_size)
    image = img_to_array(image)
    image = np.expand_dims(image, axis=0)
    return image / 255.0


@app.route('/', methods=['GET', 'POST'])
def predict():
    result = None
    image_data = None

    if request.method == 'POST':
        # Check that a file was uploaded
        if 'file' not in request.files:
            return render_template('index.html', error='No file selected')
        file = request.files['file']
        if file.filename == '':
            return render_template('index.html', error='No file selected')

        # Read the image; convert to RGB so that images with an alpha
        # channel or a palette can still be saved as JPEG below
        image = Image.open(io.BytesIO(file.read())).convert("RGB")

        # Encode the image as a base64 string so the page can display it
        buffered = io.BytesIO()
        image.save(buffered, format="JPEG")
        image_data = base64.b64encode(buffered.getvalue()).decode()

        # Preprocess the image
        processed_image = preprocess_image(image, target_size=(224, 224))

        # Predict
        prediction = model.predict(processed_image)[0][0]

        # Interpret the result
        if prediction < 0.5:
            label = "Cat"
            confidence = 1 - prediction
        else:
            label = "Dog"
            confidence = prediction

        result = {
            'prediction': label,
            'confidence': float(confidence * 100)
        }

    return render_template('index.html', result=result, image=image_data)


if __name__ == '__main__':
    app.run(debug=True)  # debug mode is for local development only

The HTML template (templates/index.html) could look like this:

<!DOCTYPE html>
<html>
<head>
    <title>Cat and Dog Classifier</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            max-width: 800px;
            margin: 0 auto;
            padding: 20px;
        }
        .container {
            display: flex;
            flex-direction: column;
            align-items: center;
        }
        .prediction-container {
            margin-top: 20px;
            text-align: center;
        }
        img {
            max-width: 300px;
            margin-top: 20px;
        }
        .confidence-bar {
            height: 20px;
            background-color: #f0f0f0;
            border-radius: 10px;
            margin-top: 10px;
            overflow: hidden;
        }
        .confidence-level {
            height: 100%;
            background-color: #4CAF50;
            text-align: center;
            color: white;
            line-height: 20px;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Cat and Dog Image Classifier</h1>
        <p>Upload a picture of a cat or a dog, and the AI will tell you which one it is.</p>
        <form method="post" enctype="multipart/form-data">
            <input type="file" name="file" accept="image/*">
            <input type="submit" value="Predict">
        </form>
        {% if error %}
        <p style="color: red;">{{ error }}</p>
        {% endif %}
        {% if result %}
        <div class="prediction-container">
            <h2>Prediction: {{ result.prediction }}</h2>
            <div class="confidence-bar">
                <div class="confidence-level" style="width: {{ result.confidence }}%;">{{ result.confidence|round(2) }}%</div>
            </div>
            {% if image %}
            <img src="data:image/jpeg;base64,{{ image }}" alt="Uploaded image">
            {% endif %}
        </div>
        {% endif %}
    </div>
</body>
</html>

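4. Other Deployment Considerations

Model optimization: consider techniques such as quantization and pruning to shrink the model and speed up inference
Batch processing: when many images need to be processed, predicting in batches improves throughput
API design: design suitable API interfaces for the different target platforms
Security: guard against malicious input and potential attacks
Monitoring and updates: monitor model performance regularly and collect new data for retraining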
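
For the security point, one basic safeguard is to reject uploads that are obviously not acceptable before they reach the model. The sketch below checks the file extension and size. The whitelist, the 5 MiB limit, and the name `validate_upload` are all assumptions chosen for illustration.

```python
import os

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}   # assumed whitelist
MAX_UPLOAD_BYTES = 5 * 1024 * 1024               # assumed 5 MiB limit


def validate_upload(filename, size_bytes):
    """Return an error message for a rejected upload, or None if it passes."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return f"unsupported file type: {extension or 'none'}"
    if size_bytes <= 0:
        return "empty file"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "file too large"
    return None


print(validate_upload("kitten.JPG", 120_000))    # None
print(validate_upload("script.exe", 120_000))    # unsupported file type: .exe
print(validate_upload("huge.png", 10**8))        # file too large
```

An extension check says nothing about the actual file contents, so the image should still be opened inside error handling, and a check like this is only a first filter.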

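VII. Summary and Extensions

1. Project Summary

In this project we completed an end-to-end computer vision task: a cat vs. dog image classification system. We went through:

Data preparation and augmentation: organizing, splitting, and augmenting the image data
Model building and training: designing a CNN architecture and applying transfer learning
Evaluation and visualization: assessing model performance with a classification report, confusion matrix, and ROC curve
Deployment suggestions: exploring ways to put the model into a real application

Through this project we practiced the basic skills of deep learning and saw the complete workflow of a real project.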

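2. Directions for Extension

There are many ways to improve and extend this project:

Multi-class classification: add more animal species
Object detection: locate the cats and dogs in an image in addition to classifying it
Real-time detection: run cat and dog detection on a video stream
Model compression: study how to reduce model size while preserving accuracy
Interpretability: use techniques such as Grad-CAM to visualize the regions the model focuses on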

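3. Recommended Learning Resources

If you are interested in deep learning and computer vision, these resources are good next steps:

Books: "Deep Learning" (Goodfellow et al.), "Dive into Deep Learning" (Mu Li et al.)
Courses: Stanford CS231n, the Fast.ai deep learning course
Competition platforms: Kaggle, Tianchi, and others, which host a range of computer vision challenges
Practice projects: try other kinds of image tasks, such as segmentation, detection, and generation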

VIII. References

Kaggle Dogs vs. Cats competition: https://www.kaggle.com/c/dogs-vs-cats
TensorFlow official tutorials: https://www.tensorflow.org/tutorials
PyTorch official documentation: https://pytorch.org/docs/stable/index.html
"Deep Learning with Python" (François Chollet)
"Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" (Aurélien Géron)

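In this hands-on cat vs. dog classification project, we applied the deep learning knowledge from earlier posts to a real problem. I hope it helps you understand the deep learning workflow better and lays the groundwork for exploring more complex computer vision tasks!

Happy learning, explorers of Python Planet! 👨‍🚀🌠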