Summary:
Collection: AI Case Studies - CV - Agriculture
Dataset: PlantVillage 2018 potato leaf disease detection dataset
Dataset value: accurate disease identification; advances smart agriculture.
Solution: TensorFlow-Keras framework, CNN model
I. Problem Description
Early detection of potato leaf diseases is challenging because of variation in crop varieties, disease symptoms, and environmental factors, which makes the diseases hard to detect at an early stage. Various machine learning techniques have been developed to detect potato leaf diseases, but existing methods do not generalize across crop varieties and diseases because those models were trained and tested on leaf images from a single region. In this study, a multi-level deep learning model for potato leaf disease recognition was developed. At the first level, it uses YOLOv5 image segmentation to extract potato leaves from images of potato plants. At the second level, a novel convolutional neural network is used to detect Early Blight and Late Blight from the extracted leaf images. The proposed model was trained and tested on the Potato Leaf Disease dataset, which contains 4,072 images collected from the Central Punjab region of Pakistan, and achieved 99.75% accuracy on it. Its performance was also evaluated on the PlantVillage dataset. Compared with state-of-the-art models, the proposed technique achieved strong results in both accuracy and computational cost.
II. Dataset Contents
The Potato Disease Leaf Dataset (PLD) was created for plant disease classification research, with a focus on potato leaf diseases. It contains images of common potato leaf conditions, including healthy leaves, Early Blight, and Late Blight. The dataset was released in 2018.
The dataset directory structure is as follows: ./PLD_3_Classes_256/
- Training
  - Healthy: 816 images
  - Early_Blight: 1,303 images
  - Late_Blight: 1,132 images
- Validation
  - Healthy: 102 images
  - Early_Blight: 163 images
  - Late_Blight: 151 images
- Testing
  - Healthy: 102 images
  - Early_Blight: 162 images
  - Late_Blight: 141 images
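To verify the directory layout and per-class image counts after downloading, a quick check can be run with pathlib (a minimal sketch; the dataset root is assumed to be ./PLD_3_Classes_256, and the extension list may need extending):
import pathlib

dataset_root = pathlib.Path('./PLD_3_Classes_256')
for split_dir in sorted(p for p in dataset_root.iterdir() if p.is_dir()):
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        # Count image files in each class directory (adjust extensions if needed)
        num_images = len(list(class_dir.glob('*.jpg'))) + len(list(class_dir.glob('*.png')))
        print(f"{split_dir.name}/{class_dir.name}: {num_images} images")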
Sample images from the dataset:

Dataset license agreement
Database Contents License (DbCL) v1.0
Citation requirements
This is the first potato leaf image dataset collected in the Pakistan region. Please cite the following paper if you download and use the images in your research (https://doi.org/10.3390/electronics10172064):
Multi-Level Deep Learning Model for Potato Leaf Disease Recognition
III. Solution Example
1. Approach
Farmers suffer economic losses and crop waste every year from the various diseases that affect potato plants. Early Blight and Late Blight are the main potato leaf diseases and are estimated to cause significant losses in potato yield. The images are therefore classified into 3 categories:
- Healthy leaf
- Early Blight leaf
- Late Blight leaf
A convolutional neural network (CNN) model is built with the Keras library for image classification.
2. Install tensorflow-gpu 2.x
Installation steps for the tensorflow library are covered in the Ai-Basic chapter "Installing the TensorFlow Deep Learning Framework". Taking tensorflow_gpu-2.6.0 as an example, the compatibility table shows that this version requires Python 3.6-3.9.
conda create -n tensorflow-gpu-2-6-p3-9 python=3.9
conda activate tensorflow-gpu-2-6-p3-9
conda install tensorflow-gpu==2.6
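After installation, it can be verified that TensorFlow sees the GPU (a quick check in Python; an empty list means TensorFlow will fall back to the CPU):
import tensorflow as tf

print(tf.__version__)
# List the GPUs visible to TensorFlow; empty output means CPU-only execution
print(tf.config.list_physical_devices('GPU'))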
IV. Workflow
1. Imports and Configuration
Source: train.py and infer.py
import os
import pathlib
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import models, layers

# Global configuration variables
Image_Size = 256
Batch_Size = 32
Channels = 3
Epochs = 50
# Use the same class name order for training and inference.
# Set the correct class names manually after creating the dataset.
CLASS_NAME = ['Early_Blight', 'Healthy', 'Late_Blight']
2. Load the Dataset
Source: train.py
Read the dataset from the image directory:
Current_Dir = os.getcwd()
dataset_dir = pathlib.Path(Current_Dir + '/PLD_3_Classes_256')
dataset = tf.keras.preprocessing.image_dataset_from_directory(
    dataset_dir, batch_size = Batch_Size, image_size = (Image_Size, Image_Size), shuffle = True)
3. Split the Dataset and Preprocess
Since the dataset already ships with Training/Validation/Testing splits, these splits should be used directly instead of re-splitting in code (note that pointing image_dataset_from_directory at the dataset root, as above, treats the three split folders as the class labels):
# Load the training, validation, and test splits separately
train_data = tf.keras.preprocessing.image_dataset_from_directory(
    'PLD_3_Classes_256/Training', batch_size = Batch_Size, image_size = (Image_Size, Image_Size))
val_data = tf.keras.preprocessing.image_dataset_from_directory(
    'PLD_3_Classes_256/Validation', batch_size = Batch_Size, image_size = (Image_Size, Image_Size))
test_data = tf.keras.preprocessing.image_dataset_from_directory(
    'PLD_3_Classes_256/Testing', batch_size = Batch_Size, image_size = (Image_Size, Image_Size))
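image_dataset_from_directory infers class names from the subdirectory names in alphabetical order; a quick sanity check against the hard-coded CLASS_NAME list (a sketch, run before the map() calls below, which drop the class_names attribute):
# class_names is populated by image_dataset_from_directory in alphabetical order
print(train_data.class_names)   # expected: ['Early_Blight', 'Healthy', 'Late_Blight']
assert train_data.class_names == CLASS_NAME, "CLASS_NAME must match the dataset's class_names order"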
Data preprocessing:
# Image preprocessing function
def preprocess_image(image, label):
    image = tf.image.resize(image, [Image_Size, Image_Size])
    image = image / 255.0  # normalize to [0, 1]
    return image, label

# Apply the preprocessing
train_data = train_data.map(preprocess_image)
val_data = val_data.map(preprocess_image)
test_data = test_data.map(preprocess_image)
print(f"Number of training batches: {len(train_data)}\nNumber of validation batches: {len(val_data)}\nNumber of test batches: {len(test_data)}")
# Caching, shuffle and prefetching the data
train_ds = train_data.cache().shuffle(100).prefetch(buffer_size = tf.data.AUTOTUNE)
val_ds = val_data.cache().shuffle(100).prefetch(buffer_size = tf.data.AUTOTUNE)
test_ds = test_data.cache().shuffle(100).prefetch(buffer_size = tf.data.AUTOTUNE)
Output:
Found 3251 files belonging to 3 classes.
Found 416 files belonging to 3 classes.
Found 405 files belonging to 3 classes.
Number of training batches: 102
Number of validation batches: 13
Number of test batches: 13
4. Build the Model
The following code builds a convolutional neural network (CNN) with the Keras library for the image classification task. Its main components are:
- Input layer:
  model.build(input_shape = (Batch_Size, Image_Size, Image_Size, Channels))
  This defines the expected input shape: the first dimension is the batch dimension (Batch_Size = 32), the image height and width are 256×256, and there are 3 channels (RGB). Inputs are expected to already be normalized.
  During training and batched inference the data is already organized into batches, so no batch dimension has to be added manually. A single image does need a batch dimension added, turning (256, 256, 3) into (1, 256, 256, 3); in run_inference_on_model() this is done with single_image = np.expand_dims(image, axis=0).
- Convolution and pooling layers:
  - The model contains several convolutional layers (Conv2D) and max-pooling layers (MaxPool2D).
  - The convolutional layers use varying numbers of filters with 3×3 kernels (kernel_size) and ReLU activation.
  - The pooling layers perform max pooling over 2×2 windows, reducing the spatial size of the feature maps.
- Fully connected layers:
  - The Flatten layer flattens the 3D feature maps produced by the convolutional stack into a 1D vector so they can be fed to the fully connected layers.
  - The first fully connected layer (Dense) has 128 neurons with ReLU activation.
  - The second fully connected layer has len(CLASS_NAME) neurons with softmax activation for the multi-class classification task.
Key design decisions:
- The preprocessing layers are removed from the model; preprocessing is instead applied to the data before it enters the model, in both the training and the inference pipelines.
- Data augmentation creates more varied training samples by applying random transformations (flips, rotations, zooms, etc.) to the training images, improving the model's ability to generalize. Augmentation is used only during training; it is not included in the saved model and is not applied at inference time (a minimal definition of the augmentation pipeline is sketched below).
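The data_augmentation pipeline referenced in the training code is not shown in the source; a minimal sketch, assuming standard random flips and rotations (the exact transforms used originally are not specified), built on the layers import from step 1:
# Augmentation applied to the training data only; not part of the saved model.
# In TF 2.6 these layers live under layers.experimental.preprocessing;
# newer releases also expose them directly as tf.keras.layers.RandomFlip / RandomRotation.
data_augmentation = tf.keras.Sequential([
    layers.experimental.preprocessing.RandomFlip('horizontal_and_vertical'),
    layers.experimental.preprocessing.RandomRotation(0.2),
])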
Input_Shape = (Image_Size, Image_Size, Channels)
model = models.Sequential([
    # The first convolutional layer: 16 3x3 kernels with ReLU activation
    layers.Conv2D(filters = 16, kernel_size = (3, 3), activation = 'relu', input_shape = Input_Shape),
    # The first max pooling layer, 2x2 pooling window
    layers.MaxPool2D((2, 2)),
    # The second convolutional layer: 64 3x3 kernels with ReLU activation
    layers.Conv2D(64, (3, 3), activation = 'relu'),
    # The second max pooling layer, 2x2 pooling window
    layers.MaxPool2D((2, 2)),
    # Dropout for regularization (the rate is not given in the source; 0.25 is assumed here,
    # matching the dropout layers that appear in the model summary below)
    layers.Dropout(0.25),
    # The third convolutional layer: 128 3x3 kernels with ReLU activation
    layers.Conv2D(128, (3, 3), activation = 'relu'),
    # The third max pooling layer, 2x2 pooling window
    layers.MaxPool2D((2, 2)),
    layers.Dropout(0.25),
    # The fourth convolutional layer: 64 3x3 kernels with ReLU activation
    layers.Conv2D(64, (3, 3), activation = 'relu'),
    # The fourth max pooling layer, 2x2 pooling window
    layers.MaxPool2D((2, 2)),
    # The fifth convolutional layer: 128 3x3 kernels with ReLU activation
    layers.Conv2D(128, (3, 3), activation = 'relu'),
    # The fifth max pooling layer, 2x2 pooling window
    layers.MaxPool2D((2, 2)),
    # The sixth convolutional layer: 64 3x3 kernels with ReLU activation
    layers.Conv2D(64, (3, 3), activation = 'relu'),
    # The sixth max pooling layer, 2x2 pooling window
    layers.MaxPool2D((2, 2)),
    # Flatten the 3D feature maps into a 1D vector for the fully connected layers
    layers.Flatten(),
    # The first fully connected layer: 128 neurons with ReLU activation
    layers.Dense(128, activation = 'relu'),
    # The output layer: len(CLASS_NAME) = 3 neurons with softmax activation for multi-class classification
    layers.Dense(len(CLASS_NAME), activation = 'softmax'),
])
model.build(input_shape = (Batch_Size, Image_Size, Image_Size, Channels))
model.summary()
Output:
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 254, 254, 16) 448
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 127, 127, 16) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 125, 125, 64) 9280
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 62, 62, 64) 0
_________________________________________________________________
dropout (Dropout) (None, 62, 62, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 60, 60, 128) 73856
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 30, 30, 128) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 30, 30, 128) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 28, 28, 64) 73792
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 64) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 12, 12, 128) 73856
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 6, 6, 128) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 4, 4, 64) 73792
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 2, 2, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 256) 0
_________________________________________________________________
dense (Dense) (None, 128) 32896
_________________________________________________________________
dense_1 (Dense) (None, 3) 387
=================================================================
Total params: 338,307
Trainable params: 338,307
Non-trainable params: 0
5. Train and Save the Model
model.compile(
    optimizer = 'adam',
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits = False),
    metrics = ['accuracy'])
# Do not include data augmentation layers in the final model;
# apply augmentation to the training pipeline only
augmented_train_ds = train_ds.map(
    lambda x, y: (data_augmentation(x, training = True), y)
)
# Note: batch_size must not be passed to fit() when the input is a tf.data.Dataset
history = model.fit(augmented_train_ds, epochs = Epochs, verbose = 1, validation_data = val_ds)
# Save the entire model (architecture and weights)
model.save('potato_leaf_disease_model.h5')
Training log:
Epoch 1/50
102/102 [==============================] - 53s 420ms/step - loss: 1.2111 - accuracy: 0.4211 - val_loss: 1.0231 - val_accuracy: 0.5048
Epoch 2/50
102/102 [==============================] - 41s 389ms/step - loss: 0.9358 - accuracy: 0.5484 - val_loss: 0.9264 - val_accuracy: 0.5889
Epoch 3/50
102/102 [==============================] - 41s 402ms/step - loss: 0.8768 - accuracy: 0.6017 - val_loss: 0.8369 - val_accuracy: 0.6683
Epoch 4/50
102/102 [==============================] - 44s 432ms/step - loss: 0.8691 - accuracy: 0.6155 - val_loss: 0.8727 - val_accuracy: 0.5986
Epoch 5/50
102/102 [==============================] - 43s 420ms/step - loss: 0.8059 - accuracy: 0.6536 - val_loss: 0.7719 - val_accuracy: 0.6899
Epoch 6/50
102/102 [==============================] - 45s 442ms/step - loss: 0.6867 - accuracy: 0.7235 - val_loss: 0.6035 - val_accuracy: 0.7596
Epoch 7/50
102/102 [==============================] - 51s 500ms/step - loss: 0.5141 - accuracy: 0.8007 - val_loss: 0.5564 - val_accuracy: 0.7668
Epoch 8/50
102/102 [==============================] - 55s 540ms/step - loss: 0.4678 - accuracy: 0.8139 - val_loss: 0.5371 - val_accuracy: 0.7933
Epoch 9/50
102/102 [==============================] - 65s 635ms/step - loss: 0.4129 - accuracy: 0.8373 - val_loss: 0.4576 - val_accuracy: 0.8125
Epoch 10/50
102/102 [==============================] - 65s 635ms/step - loss: 0.3772 - accuracy: 0.8499 - val_loss: 0.4747 - val_accuracy: 0.7981
Epoch 11/50
102/102 [==============================] - 64s 631ms/step - loss: 0.3369 - accuracy: 0.8767 - val_loss: 0.3738 - val_accuracy: 0.8558
Epoch 12/50
102/102 [==============================] - 73s 712ms/step - loss: 0.2679 - accuracy: 0.9034 - val_loss: 0.3730 - val_accuracy: 0.8558
Epoch 13/50
...
Epoch 49/50
102/102 [==============================] - 91s 894ms/step - loss: 0.0477 - accuracy: 0.9855 - val_loss: 0.0674 - val_accuracy: 0.9784
Epoch 50/50
102/102 [==============================] - 90s 887ms/step - loss: 0.0581 - accuracy: 0.9803 - val_loss: 0.0376 - val_accuracy: 0.9880
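The held-out test split (test_ds, prepared in step 3) is never evaluated in the original script; a minimal sketch of measuring test accuracy after training:
# Evaluate the trained model on the untouched test split
test_loss, test_acc = model.evaluate(test_ds, verbose = 1)
print(f"Test accuracy: {test_acc:.4f}, test loss: {test_loss:.4f}")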
6. Analyze the Training Results
Plot the training and validation results:
# Extract the accuracy and loss curves from the training history
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
train_loss = history.history['loss']
val_loss = history.history['val_loss']
plt.figure(figsize = (15, 15))
plt.subplot(2, 3, 1)
plt.plot(range(Epochs), train_acc, label = 'Training Accuracy')
plt.plot(range(Epochs), val_acc, label = 'Validation Accuracy')
plt.legend(loc = 'lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(2, 3, 2)
plt.plot(range(Epochs), train_loss, label = 'Training Loss')
plt.plot(range(Epochs), val_loss, label = 'Validation Loss')
plt.legend(loc = 'upper right')
plt.title('Training and Validation Loss')
plt.show()
7. Visualize Predictions
Display a batch of potato leaf images together with their actual labels, predicted labels, and confidence scores:
plt.figure(figsize = (16, 16))
for batch_image, batch_label in train_data.take(1):
    # Run prediction once for the whole batch
    batch_prediction = model.predict(batch_image)
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        # De-normalize: convert the image from the 0-1 range back to 0-255
        image = (batch_image[i].numpy() * 255).astype('uint8')
        label = CLASS_NAME[int(batch_label[i])]
        plt.imshow(image)
        predicted_class = CLASS_NAME[np.argmax(batch_prediction[i])]
        confidence = round(np.max(batch_prediction[i]) * 100, 2)
        plt.title(f'Actual: {label},\nPrediction: {predicted_class},\nConfidence: {confidence}%')
        plt.axis('off')
plt.show()
Final result: the model reaches an accuracy above 98%.
8. Load the Model and Run Inference
Source: infer.py — loads the saved model file and runs inference.
A common pitfall is a mismatch between the model structure used for training and for inference. If the model's first layer were resize_and_rescale followed by the convolutional stack, then at inference time the input image would be preprocessed twice: once in load_and_preprocess_image and again inside the model. Scaling the image twice in this way can distort the predictions, which is why the preprocessing layers are kept out of the model and applied only in the data pipeline.
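The model loading call itself is not shown in the excerpt; a minimal sketch using the file saved by train.py:
# Load the trained model (architecture and weights) saved during training
model = tf.keras.models.load_model('potato_leaf_disease_model.h5')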
# Image preprocessing function (identical to the one used for training)
def preprocess_image(image):
    image = tf.image.resize(image, [Image_Size, Image_Size])
    image = image / 255.0  # normalize to [0, 1]
    return image
def load_and_preprocess_image(image_path, image_size=256):
    """
    Load and preprocess an image
    """
    # Read the image file
    image = tf.io.read_file(image_path)
    # Decode the image
    image = tf.image.decode_image(image, channels=3)
    # Resize the image
    image = tf.image.resize(image, [image_size, image_size])
    # Normalize; the model itself does not rescale its inputs
    image = image / 255.0
    # Add the batch dimension
    image = tf.expand_dims(image, axis=0)
    return image
def predict_single_image(model, image_path, image_size=256):
    """
    Run prediction on a single image
    """
    print(f"predict_single_image() image_path={image_path}")
    # Load and preprocess the image
    image = load_and_preprocess_image(image_path, image_size)
    # Run prediction; predict() runs in inference mode, so no augmentation is applied
    predictions = model.predict(image, verbose=0)
    predicted_class = np.argmax(predictions[0])
    confidence = np.max(predictions[0]) * 100
    # Map the index back to a class name
    predicted_label = CLASS_NAME[predicted_class]
    return predicted_label, confidence, image
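A usage example for a single image (the path below is taken from the inference run shown later; adjust it to the local directory layout):
# Example call: predict one image and print the result
label, confidence, _ = predict_single_image(model, 'Infer/Early_Blight/Early_Blight_15.jpg')
print(f"Predicted: {label} ({confidence:.2f}%)")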
Batch inference over all files in a given directory (each subdirectory name is the class label):
def predict_batch_images(model, image_dir, image_size=256):
    """
    Run batch prediction on all images in a directory
    """
    # Collect all image files; the parent directory name is the true label
    image_dir = pathlib.Path(image_dir)
    image_paths = list(image_dir.glob('*/*.jpg')) + list(image_dir.glob('*/*.jpeg')) + \
                  list(image_dir.glob('*/*.png')) + list(image_dir.glob('*/*.bmp'))
    results = []
    for image_path in image_paths:
        try:
            predicted_label, confidence, image = predict_single_image(
                model, str(image_path), image_size)
            # The true label is taken from the directory name
            true_label = image_path.parent.name
            print(f"{true_label} -> {predicted_label} : ({image_path})")
            results.append({
                'image_path': str(image_path),
                'true_label': true_label,
                'predicted_label': predicted_label,
                'confidence': confidence,
                'is_correct': true_label == predicted_label,
                'image_array': image
            })
        except Exception as e:
            print(f"Error while processing image {image_path}: {e}")
    return results
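The summary statistics printed in the run below (overall and per-class accuracy) can be derived from the returned results list; a minimal sketch of such a helper (the exact reporting code is not shown in the source):
def summarize_results(results):
    """Print overall and per-class accuracy from predict_batch_images() output."""
    total = len(results)
    correct = sum(int(r['is_correct']) for r in results)
    print(f"Total images: {total}")
    print(f"Correct predictions: {correct}")
    print(f"Accuracy: {correct / total * 100:.2f}%")
    # Per-class statistics keyed by the true label
    per_class = {}
    for r in results:
        stats = per_class.setdefault(r['true_label'], [0, 0])
        stats[0] += int(r['is_correct'])
        stats[1] += 1
    for name, (ok, n) in sorted(per_class.items()):
        print(f"{name}: {ok}/{n} ({ok / n * 100:.2f}%)")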
The images to run inference on are stored in the directory ./PLD_3_Classes_256/Infer, where each subdirectory name is the expected label:
- Healthy
- Early_Blight
- Late_Blight
The output of running python infer.py is shown below:
Verifying that training and inference preprocessing are consistent:
Training preprocessing shape: (256, 256, 3)
Inference preprocessing shape: (256, 256, 3)
Pixel value range difference: 0.0
Loading model...
Model loaded successfully!
Running predictions...
predict_single_image() image_path=Infer\Early_Blight\Early_Blight_15.jpg
Early_Blight -> Early_Blight : (Infer\Early_Blight\Early_Blight_15.jpg)
predict_single_image() image_path=Infer\Early_Blight\Early_Blight_16.jpg
Early_Blight -> Early_Blight : (Infer\Early_Blight\Early_Blight_16.jpg)
predict_single_image() image_path=Infer\Healthy\Healthy_1.jpg
Healthy -> Healthy : (Infer\Healthy\Healthy_1.jpg)
predict_single_image() image_path=Infer\Late_Blight\Late_Blight_2.jpg
Late_Blight -> Late_Blight : (Infer\Late_Blight\Late_Blight_2.jpg)
predict_single_image() image_path=Infer\Late_Blight\Late_Blight_7.jpg
Late_Blight -> Late_Blight : (Infer\Late_Blight\Late_Blight_7.jpg)
============================================================
Prediction summary:
============================================================
Total images: 5
Correct predictions: 5
Accuracy: 100.00%
============================================================
Per-class statistics:
Early_Blight: 2/2 (100.00%)
Healthy: 1/1 (100.00%)
Late_Blight: 2/2 (100.00%)
============================================================
Detailed predictions:
============================================================
1. ✓ Image: Early_Blight_15.jpg
   True label: Early_Blight
   Predicted label: Early_Blight
   Confidence: 100.00%
2. ✓ Image: Early_Blight_16.jpg
   True label: Early_Blight
   Predicted label: Early_Blight
   Confidence: 98.74%
3. ✓ Image: Healthy_1.jpg
   True label: Healthy
   Predicted label: Healthy
   Confidence: 99.18%
4. ✓ Image: Late_Blight_2.jpg
   True label: Late_Blight
   Predicted label: Late_Blight
   Confidence: 100.00%
5. ✓ Image: Late_Blight_7.jpg
   True label: Late_Blight
   Predicted label: Late_Blight
   Confidence: 100.00%
Detailed results saved to: results\detailed_results.txt