
A Complete VOC-Format Data Augmentation Script

ds 2025/8/18 19:31:20

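A complete data augmentation script for VOC-format datasets. It includes multiple augmentation methods and updates the XML labels in sync with each transformed image.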

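Features

Data augmentation methods:

Horizontal/vertical flip - updates bounding-box coordinates in sync
Rotation - supports arbitrary angles and automatically resizes the canvas to fit the rotated image
Translation - randomly shifts the image while keeping bounding boxes valid
Contrast/brightness adjustment - varies exposure and contrast to simulate different lighting conditions
Gaussian noise - improves model robustness
Random cropping - produces training samples of different sizes
Mosaic stitching - combines 4 images into one to enrich the training data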
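
The coordinate updates behind the flip and rotation items above can be sketched in plain Python. This is a minimal illustration, not the script's actual code: the helper names `hflip_box`, `vflip_box`, and `rotate_box` are hypothetical, boxes are assumed to be `(xmin, ymin, xmax, ymax)` tuples in pixel coordinates, and the rotation assumes the convention of OpenCV's `cv2.getRotationMatrix2D` (a positive angle rotates counter-clockwise) with the canvas enlarged to hold the whole rotated image.

```python
import math


def hflip_box(box, width):
    """Mirror a box left-to-right inside an image of the given width."""
    xmin, ymin, xmax, ymax = box
    return (width - xmax, ymin, width - xmin, ymax)


def vflip_box(box, height):
    """Mirror a box top-to-bottom inside an image of the given height."""
    xmin, ymin, xmax, ymax = box
    return (xmin, height - ymax, xmax, height - ymin)


def rotate_box(box, width, height, angle_deg):
    """Rotate a box about the image centre; return (new_box, new_w, new_h).

    The canvas grows so the whole rotated image fits, and the result is the
    axis-aligned box that encloses the four rotated corners.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Size of the enlarged canvas that contains the rotated image.
    new_w = width * abs(cos_t) + height * abs(sin_t)
    new_h = width * abs(sin_t) + height * abs(cos_t)
    cx, cy = width / 2.0, height / 2.0
    xmin, ymin, xmax, ymax = box
    xs, ys = [], []
    for x, y in ((xmin, ymin), (xmax, ymin), (xmin, ymax), (xmax, ymax)):
        # Rotate around the old centre, then shift to the new canvas centre.
        xs.append(cos_t * (x - cx) + sin_t * (y - cy) + new_w / 2.0)
        ys.append(-sin_t * (x - cx) + cos_t * (y - cy) + new_h / 2.0)
    return (min(xs), min(ys), max(xs), max(ys)), new_w, new_h
```

If the image itself is transformed with OpenCV, `cv2.flip(image, 1)` and `cv2.flip(image, 0)` perform the horizontal and vertical flips. Note that the enclosing box after rotation is generally looser than the original box, which is a known trade-off of axis-aligned annotations.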

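XML annotation synchronization:

Automatically parses VOC-format XML files
Computes the transformed bounding-box coordinates precisely
Generates formatted (indented) XML annotation files
Validates the bounding boxes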
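
The parse, update, and validate steps listed above could look roughly like the following sketch, which uses only the standard library's `xml.etree.ElementTree`. The function names `read_boxes`, `clip_box`, and `write_boxes`, as well as the `min_size` threshold, are assumptions made for illustration and are not taken from the script itself.

```python
import xml.etree.ElementTree as ET

BOX_TAGS = ("xmin", "ymin", "xmax", "ymax")


def read_boxes(root):
    """Return [(class_name, (xmin, ymin, xmax, ymax)), ...] from a VOC tree."""
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(int(float(bndbox.findtext(tag))) for tag in BOX_TAGS)
        boxes.append((obj.findtext("name"), coords))
    return boxes


def clip_box(box, width, height, min_size=1):
    """Clip a box to the image; return None if what remains is too small."""
    xmin, ymin, xmax, ymax = box
    xmin, xmax = max(0, xmin), min(width, xmax)
    ymin, ymax = max(0, ymin), min(height, ymax)
    if xmax - xmin < min_size or ymax - ymin < min_size:
        return None
    return (xmin, ymin, xmax, ymax)


def write_boxes(root, new_boxes, width, height):
    """Update <size> and each <bndbox> in place, dropping invalid objects."""
    size = root.find("size")
    size.find("width").text = str(width)
    size.find("height").text = str(height)
    for obj, box in zip(root.findall("object"), new_boxes):
        clipped = clip_box(box, width, height)
        if clipped is None:
            root.remove(obj)  # the object left the image entirely
            continue
        bndbox = obj.find("bndbox")
        for tag, value in zip(BOX_TAGS, clipped):
            bndbox.find(tag).text = str(int(round(value)))
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return root
```

In practice the tree would come from `ET.parse(xml_path)` and be saved with `tree.write(out_path, encoding="utf-8")`. The `<filename>` element would also need to be updated to the new image name, which is omitted here for brevity.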

Installing dependencies

pip install opencv-python numpy

Usage

Basic usage:

python voc_augmentation.py --images_dir images --annotations_dir Annotations --output_dir augmented_dataset

Custom parameters:

python voc_augmentation.py \
    --images_dir images \
    --annotations_dir Annotations \
    --output_dir augmented_dataset \
    --num_aug 8 \
    --enable_mosaic

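Parameter descriptions:

--images_dir: path to the input image folder
--annotations_dir: path to the input XML annotation folder
--output_dir: path to the output folder
--num_aug: number of augmented copies to generate per image (default: 5)
--enable_mosaic: enables mosaic stitching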
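
For readers who want to reproduce this command-line interface, the flags above map naturally onto `argparse`. The sketch below is an assumption about how such a parser might be declared, not the script's own code: the function name `build_parser` is hypothetical, and marking the three directory flags as required is a guess, since the source only documents the default for `--num_aug`.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented command-line flags."""
    parser = argparse.ArgumentParser(
        description="Augment a VOC-format dataset (illustrative sketch)."
    )
    # Assumption: the three directory flags are mandatory.
    parser.add_argument("--images_dir", required=True,
                        help="path to the input image folder")
    parser.add_argument("--annotations_dir", required=True,
                        help="path to the input XML annotation folder")
    parser.add_argument("--output_dir", required=True,
                        help="path to the output folder")
    parser.add_argument("--num_aug", type=int, default=5,
                        help="augmented copies to generate per image")
    parser.add_argument("--enable_mosaic", action="store_true",
                        help="enable mosaic stitching")
    return parser


# Parse an explicit argument list here instead of reading sys.argv.
args = build_parser().parse_args([
    "--images_dir", "images",
    "--annotations_dir", "Annotations",
    "--output_dir", "augmented_dataset",
    "--num_aug", "8",
    "--enable_mosaic",
])
print(args.num_aug, args.enable_mosaic)  # prints: 8 True
```

Using `action="store_true"` means mosaic stitching stays off unless the flag is passed, which matches the way `--enable_mosaic` appears without a value in the custom-parameters command.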

File structure

Input directory structure:

your_dataset/
├── images/
│   ├── image1.jpg
│   ├── image2.jpg
│   └── ...
└── Annotations/
    ├── image1.xml
    ├── image2.xml
    └── ...
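
The layout above implies that each image is matched with an XML file of the same base name. Before running any augmentation it can help to check that pairing. The following is a minimal sketch under that assumption: the function name `pair_images_with_annotations` and the accepted image extensions are hypothetical choices, not something the script is known to do.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def pair_images_with_annotations(images_dir, annotations_dir):
    """Return ([(image_path, xml_path), ...], [images lacking an XML file])."""
    annotations_dir = Path(annotations_dir)
    pairs, missing = [], []
    for image_path in sorted(Path(images_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip files that are not images
        xml_path = annotations_dir / (image_path.stem + ".xml")
        if xml_path.is_file():
            pairs.append((image_path, xml_path))
        else:
            missing.append(image_path)
    return pairs, missing
```

Calling `pair_images_with_annotations("images", "Annotations")` returns the matched pairs along with any images that have no annotation, so unlabeled files can be reported or skipped rather than silently augmented.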

Output directory structure:

augmented_dataset/
├── images/
│   ├── image1.jpg (original image)
│   ├── image1_aug_1_hflip_bright120.jpg
│   ├── image1_aug_2_rot15_crop80.jpg
│   ├── mosaic_001.jpg
│   └── ...
└── Annotations/
    ├── image1.xml (original annotation)
    ├── image1_aug_1_hflip_bright120.xml
    ├── image1_aug_2_rot15_crop80.xml
    ├── mosaic_001.xml
    └── ...

Code integration example

# Use the augmenter directly from Python code
from voc_augmentation import VOCDataAugmentation

# Create the augmenter
augmenter = VOCDataAugmentation(
    images_dir="images",
    annotations_dir="Annotations",
    output_dir="augmented_dataset",
)

# Process the dataset
augmenter.process_dataset(num_augmentations_per_image=8, enable_mosaic=True)