The Canadian Institute for Advanced Research CIFAR-100 Computer Vision Dataset (2009), Python Version, with a ResNet101 Recognition Example

Abstract:

Collection: AI Cases - CV - Computer Services Industry
Dataset: the 100-class computer vision dataset released by CIFAR (Canadian Institute for Advanced Research)
Dataset value: image classification and machine learning tasks.
Solution: the PyTorch framework and the ResNet101 neural network model

I. Problem Description

CIFAR is short for the Canadian Institute For Advanced Research. The CIFAR-100 dataset extends CIFAR-10 with more classes and a harder classification task, and is used for image classification and machine learning. It was developed to push image classification forward, particularly in deep learning and computer vision. The dataset was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton, with the aim of providing labeled images for training deep learning models to recognize the objects they contain.

II. Dataset Contents

The CIFAR-100 dataset contains 60,000 32×32-pixel color images in 100 classes, with 600 images per class. The 100 classes are grouped into 20 superclasses. Each image carries a "fine" label (the class it belongs to) and a "coarse" label (the superclass it belongs to). There are 50,000 training images and 10,000 test images.

Basic information

CIFAR-100 dataset basic information:
Number of images: 60,000
Image size: 32×32 pixels
Color channels: 3 (RGB)
Number of classes: 100
Images per class: 600
Training set size: 50,000
Test set size: 10,000

Data structure

The metadata file contains the label names for each class and superclass.

Classes:
1-5) beaver, dolphin, otter, seal, whale
6-10) aquarium fish, flatfish, ray, shark, trout
11-15) orchids, poppies, roses, sunflowers, tulips
16-20) bottles, bowls, cans, cups, plates
21-25) apples, mushrooms, oranges, pears, sweet peppers
26-30) clock, computer keyboard, lamp, telephone, television
31-35) bed, chair, couch, table, wardrobe
36-40) bee, beetle, butterfly, caterpillar, cockroach
41-45) bear, leopard, lion, tiger, wolf
46-50) bridge, castle, house, road, skyscraper
51-55) cloud, forest, mountain, plain, sea
56-60) camel, cattle, chimpanzee, elephant, kangaroo
61-65) fox, porcupine, possum, raccoon, skunk
66-70) crab, lobster, snail, spider, worm
71-75) baby, boy, girl, man, woman
76-80) crocodile, dinosaur, lizard, snake, turtle
81-85) hamster, mouse, rabbit, shrew, squirrel
86-90) maple, oak, palm, pine, willow
91-95) bicycle, bus, motorcycle, pickup truck, train
96-100) lawn-mower, rocket, streetcar, tank, tractor

Superclasses:
1) aquatic mammals (classes 1-5)
2) fish (classes 6-10)
3) flowers (classes 11-15)
4) food containers (classes 16-20)
5) fruit and vegetables (classes 21-25)
6) household electrical devices (classes 26-30)
7) household furniture (classes 31-35)
8) insects (classes 36-40)
9) large carnivores (classes 41-45)
10) large man-made outdoor things (classes 46-50)
11) large natural outdoor scenes (classes 51-55)
12) large omnivores and herbivores (classes 56-60)
13) medium-sized mammals (classes 61-65)
14) non-insect invertebrates (classes 66-70)
15) people (classes 71-75)
16) reptiles (classes 76-80)
17) small mammals (classes 81-85)
18) trees (classes 86-90)
19) vehicles 1 (classes 91-95)
20) vehicles 2 (classes 96-100)
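In the raw python-version batch files, each image is stored as one flat row of 3072 bytes: the first 1024 are the red channel, the next 1024 the green, and the last 1024 the blue, each in row-major 32×32 order. A minimal decoding sketch, with a synthetic row standing in for a real record:

```python
import numpy as np

# Synthetic 3072-byte row standing in for one CIFAR-100 record:
# 1024 red values, then 1024 green, then 1024 blue (row-major 32x32).
row = np.arange(3072, dtype=np.uint8)

# Reshape to (channels, height, width), then transpose to (H, W, C)
# for display with matplotlib or PIL.
img_chw = row.reshape(3, 32, 32)
img_hwc = img_chw.transpose(1, 2, 0)

print(img_chw.shape)  # (3, 32, 32)
print(img_hwc.shape)  # (32, 32, 3)
```

The pixel at (y, x) has its blue value at byte offset `2 * 1024 + y * 32 + x` in the row.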

Acknowledgements

Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009.

Dataset License

Creative Commons Attribution 4.0 License

III. Recognition Example

Source code: CIFAR-100-ResNet101.ipynb

Install PyTorch 2.4.1

Pick a CUDA build that matches your system, e.g. pytorch==2.4.1:

conda create -n pytorch241-gpu python=3.10
conda activate pytorch241-gpu
conda install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia

Import libraries

import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# Environment setup
import numpy as np # NumPy array library
import matplotlib.pyplot as plt # plotting library
import time as time

import torch # core torch library
import torch.nn as nn # torch neural network library
import torch.nn.functional as F
import torchvision.datasets as dataset # download and manage public datasets
import torchvision.transforms as transforms # preprocessing and format conversion for public datasets
import torchvision.utils as utils
import torch.utils.data as data_utils # utilities for loading datasets in batches
from PIL import Image # image display
from collections import OrderedDict
import torchvision.models as models

print(torch.__version__)
print(torch.cuda.is_available())
print(torch.version.cuda)
print(torch.backends.cudnn.version())

Output:

2.4.1
True
12.1
90100

The workflow is as follows:

1. Load the dataset

# Dataset format conversion
transform_train = transforms.Compose([transforms.Resize(256), # formerly transforms.Scale(256)
                                    transforms.CenterCrop(224),
                                    transforms.ToTensor(),
                                    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

transform_test = transforms.Compose([transforms.Resize(256), # formerly transforms.Scale(256)
                                    transforms.CenterCrop(224),
                                    transforms.ToTensor(),
                                    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

# Training dataset
train_data = dataset.CIFAR100(root = "./", train = True, transform = transform_train, download = True)

# Test dataset
test_data = dataset.CIFAR100(root = "./", train = False, transform = transform_test, download = True)

print(train_data)
print("train_data size = ", len(train_data))
print("")
print(test_data)
print("test_data size = ", len(test_data))

Output:

Files already downloaded and verified
Files already downloaded and verified
Dataset CIFAR100
  Number of datapoints: 50000
  Root location: ./
  Split: Train
  StandardTransform
Transform: Compose(
              Resize(size=256, interpolation=bilinear, max_size=None, antialias=True)
              CenterCrop(size=(224, 224))
              ToTensor()
              Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
          )
train_data size = 50000

Dataset CIFAR100
  Number of datapoints: 10000
  Root location: ./
  Split: Test
  StandardTransform
Transform: Compose(
              Resize(size=256, interpolation=bilinear, max_size=None, antialias=True)
              CenterCrop(size=(224, 224))
              ToTensor()
              Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
          )
test_data size = 10000
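Resize(256) scales the shorter image side to 256 (for square 32×32 CIFAR images, both sides become 256), and CenterCrop(224) then cuts out the central 224×224 region, the input size the ResNet family expects. The geometry can be sketched without torchvision (the helper names here are illustrative, not library APIs):

```python
def resize_shorter_side(w, h, target):
    """Mimic torchvision's Resize(int): scale so the shorter side equals target,
    preserving aspect ratio."""
    if w <= h:
        return target, round(h * target / w)
    return round(w * target / h), target

def center_crop_box(w, h, size):
    """Bounding box (left, top, right, bottom) of a centered size x size crop."""
    left = (w - size) // 2
    top = (h - size) // 2
    return left, top, left + size, top + size

# A 32x32 CIFAR image: Resize(256) -> 256x256, then CenterCrop(224).
w, h = resize_shorter_side(32, 32, 256)
print(w, h)                        # 256 256
print(center_crop_box(w, h, 224))  # (16, 16, 240, 240)
```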

Read the data in batches:

# Batched data loading
batch_size = 32

# batch_size is the number of images read per batch; shuffle controls whether the data order is randomized
train_loader = data_utils.DataLoader(dataset = train_data, batch_size = batch_size, shuffle = True)
test_loader = data_utils.DataLoader(dataset = test_data, batch_size = batch_size, shuffle = True)

print(len(train_data), len(train_data) / batch_size)
print(len(test_data), len(test_data) / batch_size)

Output:

50000 1562.5
10000 312.5
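Note that len(train_data)/batch_size is 1562.5, but with drop_last=False (the DataLoader default) the last partial batch is kept, so one epoch actually yields the rounded-up number of batches. A quick check, assuming the 50,000/10,000 split sizes:

```python
import math

batch_size = 32
# DataLoader with drop_last=False keeps the final partial batch,
# so the batch count is the ceiling of size / batch_size.
print(math.ceil(50000 / batch_size))  # 1563 training batches per epoch
print(math.ceil(10000 / batch_size))  # 313 test batches per epoch
```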

2. Define the model

# Use the predefined model from torchvision directly, setting the number of output classes to 100 (the default is 1000)
net = models.resnet101(num_classes = 100)
#print(net)

# Load pretrained parameters
# Download the pretrained weights from the official site in advance
# model = models.resnet101(pretrained=True)
# model

net_params_path = "./models/resnet101-63fe2227.pth"

# Load the pretrained parameters
# Note: this only loads and prints the parameter dict; it does not copy the weights into net
net_params = torch.load(net_params_path)
print(net_params)
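The checkpoint above is an ImageNet model whose final fc layer has 1000 outputs, so its two fc tensors must be dropped before the remaining weights could be applied to the 100-class network via net.load_state_dict(filtered, strict=False). The key-filtering step can be sketched with plain dicts (the names and shapes below are illustrative stand-ins for the real state_dict):

```python
# Illustrative stand-in for a loaded state_dict: parameter name -> tensor shape.
pretrained = {
    "conv1.weight": (64, 3, 7, 7),
    "layer4.2.conv3.weight": (2048, 512, 1, 1),
    "fc.weight": (1000, 2048),  # ImageNet head: 1000 classes
    "fc.bias": (1000,),
}

# Drop the classification head so the 100-class fc layer keeps its fresh init;
# the result would be passed to net.load_state_dict(filtered, strict=False).
filtered = {k: v for k, v in pretrained.items() if not k.startswith("fc.")}

print(sorted(filtered))
```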

Define the network's prediction output:

# Check that the network works
print("Define test data")
input = torch.randn(1, 3, 224, 224)
print(input.shape)

print("\nSet the network to training mode (BatchNorm then normalizes with batch statistics instead of running averages)")
net.train()
print("Net output, method 1:")
out = net(input)
print(out.shape)
print(out)

print("\nNet output, method 2:")
out = net.forward(input)
print(out)

Define the loss and optimizer

loss_fn = nn.CrossEntropyLoss()
print(loss_fn)
Learning_rate = 0.01 # learning rate

# optimizer = SGD: basic gradient descent
# parameters: the list of parameters to optimize
# lr: the learning rate
# optimizer = torch.optim.Adam(net.parameters(), lr = Learning_rate)
optimizer = torch.optim.SGD(net.parameters(), lr = Learning_rate, momentum = 0.9)
print(optimizer)
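nn.CrossEntropyLoss combines log-softmax with negative log-likelihood: it takes raw logits plus integer class indices, which is why the network needs no softmax layer at its output. Its per-batch computation, sketched in NumPy:

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean negative log-softmax probability of the target class,
    matching the semantics of nn.CrossEntropyLoss on raw logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

logits = np.array([[2.0, 0.5, 0.1], [0.2, 3.0, 0.3]])  # 2 samples x 3 classes
targets = np.array([0, 1])                             # true class indices
print(round(cross_entropy(logits, targets), 4))        # ~0.2186
```

A confident, correct prediction drives the loss toward zero; uniform logits over 100 classes would give about ln(100) ≈ 4.6, close to the initial losses seen in the training log below.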

3. Training and results

Train the predefined ResNet101 model:

# Assume that we are on a CUDA machine, then this should print a CUDA device:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

# Move the network to the GPU
# net.to(device) # adaptive device selection
net.cuda()  # explicit GPU placement

# Move the loss computation to the GPU
# loss_fn = loss_fn.to(device) # adaptive device selection
loss_fn.cuda()  # explicit GPU placement

# Number of epochs
epochs = 1  # 30

loss_history = []  # loss values recorded during training
accuracy_history = []  # intermediate prediction accuracy
accuracy_batch = 0.0

# Put the network into training mode
model = net.train()

train_start = time.time()
print('Train start at - {}'.format(time.strftime("%X", time.localtime())))

for i in range(0, epochs):
    epoch_start = time.time()
    for j, (x_train, y_train) in enumerate(train_loader):
        # Set the model to train mode
        # net.train()
        # Move the data to the target device
        x_train = x_train.to(device)
        # x_train = x_train.cuda()
        y_train = y_train.to(device)
        # y_train = y_train.cuda()

        # (0) Reset the optimizer's gradients
        optimizer.zero_grad()

        # (1) Forward pass
        y_pred = net(x_train)

        # (2) Compute the loss
        loss = loss_fn(y_pred, y_train)

        # (3) Backward pass
        loss.backward()

        # (4) Optimizer step
        optimizer.step()

        # Record the loss during training
        loss_history.append(loss.item())  # loss for a batch

        # Record the accuracy during training
        number_batch = y_train.size()[0] # number of images in the batch
        _, predicted = torch.max(y_pred.data, dim = 1)
        correct_batch = (predicted == y_train).sum().item() # number of correct predictions
        accuracy_batch = 100 * correct_batch/number_batch
        accuracy_history.append(accuracy_batch)

        if (j % 10 == 0):
            print('Epoch {} batch {} in {}, loss = {:.4f} accuracy = {:.4f}%, {}'.format(i, j , len(train_data)/batch_size, loss.item(), accuracy_batch, time.strftime("%X", time.localtime())))

    epoch_end = time.time()
    epoch_cost = epoch_end - epoch_start
    print('Epoch {} cost {}s '.format(i, epoch_cost))

train_end = time.time()
train_cost = train_end - train_start
print('\nTrain finished at - {}'.format(time.strftime("%X", time.localtime())))
print('Train cost {}s '.format(train_cost))
print("Final loss = ", loss.item())
print("Final accuracy = ", accuracy_batch)

Output:

Train start at - 10:09:13
Epoch 0 batch 0 in 1562.5, loss = 5.0165 accuracy = 0.0000%, 10:09:58
Epoch 0 batch 10 in 1562.5, loss = 5.7989 accuracy = 0.0000%, 10:17:07
...
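Each iteration above follows the same five steps: zero the gradients, forward pass, compute the loss, backpropagate, and take an optimizer step. The SGD-with-momentum update used here (lr=0.01, momentum=0.9, i.e. buf = momentum*buf + grad; w -= lr*buf) can be sketched on a toy 1-D problem:

```python
# Minimal SGD-with-momentum sketch on the toy loss f(w) = (w - 3)^2,
# mirroring torch.optim.SGD(lr=0.01, momentum=0.9) with zero dampening.
w, buf = 0.0, 0.0
lr, momentum = 0.01, 0.9

for _ in range(500):
    grad = 2 * (w - 3)            # analytic gradient of the toy loss
    buf = momentum * buf + grad   # velocity accumulates past gradients
    w -= lr * buf                 # parameter step along the velocity

print(round(w, 3))  # converges to the minimum at w = 3
```

The momentum buffer smooths the noisy per-batch gradients that shuffle=True produces, which is why momentum=0.9 is a common default for this setup.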

4. Prediction and results

index = 0
print("Fetch one batch of samples")
images, labels = next(iter(test_loader))
# Move the batch to the same device as the network (the network was moved to the GPU above)
images = images.to(device)
labels = labels.to(device)
print(images.shape)
print(labels.shape)
print(labels)

print("\nPredict every sample in the batch")
net.eval()  # switch BatchNorm to inference statistics
outputs = net(images)
print(outputs.data.shape)

print("\nFor each sample in the batch, pick the most likely class")
_, predicted = torch.max(outputs, 1)
print(predicted.data.shape)
print(predicted)

print("\nCompare all predictions in the batch with the labels")
bool_results = (predicted == labels)
print(bool_results.shape)
print(bool_results)

print("\nCount the correctly predicted samples and the accuracy")
corrects = bool_results.sum().item()
accuracy = corrects/(len(bool_results))
print("corrects = ", corrects)
print("accuracy = ", accuracy)

print("\nSample index = ", index)
print("Label:", labels[index].item())
print("Class scores:", outputs.data[index].cpu().numpy())
print("Most likely class:", predicted.data[index].item())
print("Correct:", bool_results.data[index].item())
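The accuracy bookkeeping above reduces to three array operations: arg-max over class scores, elementwise comparison with the labels, and a mean. A self-contained sketch with a toy batch:

```python
import numpy as np

# Toy "outputs": 4 samples x 3 classes of raw scores, plus true labels.
outputs = np.array([[0.1, 2.0, 0.3],
                    [1.5, 0.2, 0.1],
                    [0.1, 0.2, 3.0],
                    [0.9, 0.8, 0.7]])
labels = np.array([1, 0, 2, 2])

predicted = outputs.argmax(axis=1)     # most likely class per sample
correct = (predicted == labels).sum()  # number of correct predictions
accuracy = correct / len(labels)

print(predicted)          # [1 0 2 0]
print(correct, accuracy)  # 3 0.75
```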

Source Code License

MIT License

IV. Getting the Case Bundle

You must be logged in to download the file package.
