Research on Detection of Rice Pests and Diseases Based on Improved yolov5 Algorithm

Yang, Hua; Lin, Dang; Zhang, Gexiang; Zhang, Haifeng; Wang, Junxiong; Zhang, Shuxiang

doi:10.3390/app131810188

¹ School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430040, China

² School of Information Science and Technology, Chengdu University of Technology, Chengdu 610059, China

* Authors to whom correspondence should be addressed.

Appl. Sci. 2023, 13(18), 10188; https://doi.org/10.3390/app131810188

Submission received: 10 June 2023 / Revised: 17 August 2023 / Accepted: 31 August 2023 / Published: 11 September 2023

(This article belongs to the Special Issue Membrane Computing and Its Applications)

Abstract

Rice pests and diseases significantly reduce the quality and yield of rice and cause losses to the national agricultural industry and economy. Timely and accurate detection is the basic premise for formulating effective control and prevention programs. However, pests and diseases are complex and diverse, and some are highly similar to one another, which makes detection and classification extremely difficult without detection tools. Existing target detection algorithms can barely complete the task, and their detection performance is not ideal. Practical rice pest and disease detection requires an algorithm that is fast, accurate, and effective on small targets, so this paper improves the popular yolov5 algorithm to suit the task. The paper briefly introduces the current status and influence of rice pests and diseases and several target detection algorithms based on deep learning. Based on the yolov5 algorithm, the RepVGG network structure is introduced: 3*3 convolution is combined with ReLU, a multi-branch topology is used at training time, and inference time is reduced through layer merging, which improves detection speed. The SK attention mechanism is introduced to enlarge the receptive field of the convolution kernel, so that more information is obtained and accuracy improves. In addition, the original NMS is replaced by Adaptive NMS, which adopts a dynamic suppression strategy and learns density scores, greatly reducing missed and false detections of small targets. Finally, the improved model is combined with membrane computing to further improve the accuracy and speed of the algorithm. According to the experimental results, the accuracy of the improved algorithm increases by about 2.7 percentage points and the mAP by 4.3 percentage points, reaching 94.4%. The speed improves by about 2.8 percentage points, and indicators such as recall rate and AP also improve.

Keywords: pest detection; RepVGG network structure; SK attention mechanism; adaptive NMS; membrane computing

1. Brief Introduction of Rice Pests and Diseases

1.1. Damage and Current Situation of Rice Pests and Diseases

China is a major agricultural country, and rice is one of its most important crops. Rice accounts for about a quarter of the country’s food crop planting area. China has a large population, and rice is the staple food that people rely on. To secure the daily food supply, the rice planting area is still expanding [1]. However, pests and diseases are the main factors limiting rice yield. According to statistics, in recent years the range of diseases and pests has been expanding, their frequency of occurrence has gradually increased, and the damage they cause has been intensifying [2]. So far, the cumulative occurrence area of major rice diseases and pests is about 583 million mu, of which the rice planthopper, the rice leaf roll moth, and rice blast account for 161 million mu, 120 million mu, and 20.62 million mu, respectively.

1.2. Several Common Rice Pests and Diseases

(1): Rice blast

The disease can occur on different parts of the plant, and the symptoms differ by site; it directly causes leaves to wither and turn yellow, or even die.

(2): Sheath blight

When the disease occurs, it produces small, irregular dark green spots, which later expand into oval shapes, and the leaves yellow rapidly under the influence of the pathogen. In a humid environment, many white cobweb-like mycelia appear at the lesion site, and the rice later lodges or rots to death.

(3): Malignant seedling disease

When the disease occurs, the rice seedlings will appear thin and tall, the leaves will slowly turn yellow, and the rice roots will also appear stunted, which will lead to the death of the rice seedlings, and there will be light-red or white mildew on the dead seedlings.

(4): Smut

At the onset of the disease, infected grains turn yellowish green; the disease affects only the ear. Infection increases the empty chaff rate, decreases grain weight, and increases the quantity of broken rice, which seriously affects the yield and quality of rice.

(5): Borer

During the growth of rice, the pest will gnaw on the rice, which will seriously affect the yield and quality of the rice.

(6): Rice leaf twister

The pest likes to lay eggs in dense green rice fields, which is not conducive to rice production.

(7): Rice planthopper

The pest likes to live in the base of rice clusters and feed on the juice of rice plants, which makes rice leaves turn yellow and affects the growth and yield of rice.

(8): Oryzacris sinensis

This pest likes to eat rice leaves, causing nicks, and in severe cases, even the whole leaf is eaten, leaving only the vein, thus forming white ears and missing grains, which seriously affects the yield of rice [3].

(9): Rice leafhopper

This pest likes to eat rice juice, which inhibits the growth and development of rice, leading to yellow leaves and even the necrosis of whole rice plants, and may spread rice virus disease.

1.3. Control and Detection of Rice Pests and Diseases

With the rapid development of image recognition and deep learning, research on pest identification and detection has become more effective and accurate. Both approaches are now widely used in this field and achieve good detection results [4]. Applying deep learning target detection networks to rice pests and diseases can strengthen prevention and control and make detection more effective, faster, and more accurate. To detect lychee pests and diseases accurately and in real time in complex environments, Wang Wenxing et al. proposed the YOLO v4-GCF lychee pest and disease detection model based on yolov4, which increased the detection speed by 38% and the average accuracy by 4.13 percentage points [5]. He Yiting et al. proposed a coffee leaf pest and disease detection model based on yolov5 that integrates the ConvNext network and the ECA attention mechanism; its detection accuracy reached 94.13%, meeting the real-time requirement of actual detection [6]. Hu Kai et al. proposed the SSD-RES50-3C detection method for diseases and insect pests based on an improved SSD algorithm. ResNet50 was used to strengthen feature extraction, and a feature fusion module was added to improve the anti-interference ability of the algorithm. The average accuracy of the final algorithm reached 92.86%, which was 6.61% higher than that of the original algorithm [7]. Fan Xijian et al. proposed yolov5-VE, a plant disease and insect pest monitoring model improved with visual enhancement attention. The model adds the CBAM feature enhancement module and introduces the DIoU loss function. Its detection accuracy and average accuracy reached 65.87% and 73.49%, and it reduced the interference of complex field scenes. The model can also be used for large-scale plant pest detection [8]. To address the poor real-time performance, poor robustness, and high missed-detection rate of yolov3 in identifying crop leaf pests and diseases, Xu Huaijie et al. adopted Darknet 53 as the feature extraction network, added a 104*104 scale detection layer, and proposed a YOLOv3-based corn leaf disease and insect detection model, whose mAP and Recall reached 93.31% and 93.08% [9]. Bai Xuesong et al. proposed an improved crop disease and pest detection algorithm based on the Res2NeXt50 model. Res2NeXt50 was obtained by applying grouped convolution to the Res2Net50 model, and the GELU function replaced the ReLU function. During training, label smoothing and an exponential moving average were used to improve generalization. The accuracy of the improved model reached 98.79% [10]. These experimental results show that crop disease and pest detection models built on deep learning frameworks are practical and perform well in this field.

2. Object Detection Algorithm Based on Deep Learning

2.1. Faster R-CNN Algorithm

Following R-CNN and Fast R-CNN, Ross B. Girshick and colleagues proposed the Faster R-CNN algorithm in 2016. Faster R-CNN puts feature extraction, proposal extraction, and classification into one network, which greatly improves overall performance, especially detection speed [11,12]. The input image first passes through the Conv layers, which comprise 13 convolution layers, 13 activation layers, and 4 pooling layers and produce the feature map of the input image. Faster R-CNN then uses the RPN directly to generate detection frames, which yields more accurate proposals and generates them much faster, improving the overall detection speed of the algorithm. After the RPN, ROI pooling scales feature maps of different sizes to the same size, and the pooled results are used as input for regression and classification [13]. Figure 1 shows the overall flowchart of the Faster R-CNN algorithm.

2.2. yolov1 Algorithm

The yolov1 algorithm takes the whole image as input and treats target detection as a regression problem: the position and category of each predicted box are regressed directly at the output layer [14]. It does not use an RPN, so its structure is very simple, consisting of convolution, pooling, and two fully connected layers. The final output layer uses a linear activation function and predicts both the object probability and the bounding box location [15]. The algorithm divides the original image into 7 × 7 = 49 grids. Each grid predicts the coordinates of two bounding boxes, the class probability of the object it contains, and a confidence indicating whether it contains an object. The confidence formula is shown in Equation (1).

confidence = \Pr(Object) \times IOU_{pred}^{truth}

(1)

In the formula, Pr(Object) indicates whether there is a manually labeled object in the grid: 1 if there is, 0 otherwise. IOU_{pred}^{truth} is the IOU between the predicted bounding box and the real box; the larger the value, the closer the box is to the real position. Multiplying the class information predicted by each grid by the predicted confidence gives PrIOU, a score for each bounding box that combines the probability of a specific class with the degree of positional overlap. The PrIOU scores are sorted for each category, those below the threshold are eliminated, and finally the non-maximum suppression operation is carried out [16]. Figure 2 shows the structure of the yolov1 algorithm.
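As a minimal sketch of Equation (1), the confidence can be computed from two boxes as follows. The corner format (x1, y1, x2, y2) and the function names are assumptions made for illustration; they are not taken from the yolov1 implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def confidence(has_object, pred_box, truth_box):
    """Equation (1): Pr(Object) times the IOU of the predicted and real box."""
    pr_object = 1.0 if has_object else 0.0
    return pr_object * iou(pred_box, truth_box)
```

For a grid without a labeled object, the confidence is 0 regardless of the overlap, which is what lets low-scoring boxes be removed before non-maximum suppression.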

2.3. yolov5 Algorithm

The yolov5 algorithm is a new network structure built on previous research results. There are four versions, namely yolov5s, yolov5m, yolov5l, and yolov5x. The number of residual structures in the network increases from one version to the next, and so does the detection accuracy.

This paper mainly studies the yolov5s version. yolov5 is a single-stage target detection algorithm that improves on yolov4 in both speed and accuracy [17]. The yolov5 model uses Mosaic data enhancement, which splices four pictures into one using random scaling, random cropping, and random arrangement. This enriches the data set, greatly improves the training speed of the network, and reduces its memory requirements. For anchor setting, yolov5 uses adaptive anchor frame calculation, which computes the best anchor frames for each training run [18]. In yolov4, the CSP structure is used only in the Backbone network, whereas yolov5 uses two types of CSP structure: CSP1_X in the Backbone network and CSP2_X in the Neck network. For the loss calculation, yolov5 increases the number of positive samples to accelerate convergence. For each output layer, matching uses a shape rule directly: the aspect ratio between the bbox and each anchor of the current layer is calculated and compared with a set threshold. If the ratio exceeds the threshold, the bbox matches that anchor poorly and is discarded for that layer [19]. Figure 3 shows the yolov5 algorithm structure.
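The shape-rule matching described above can be sketched as follows. This is an illustration under stated assumptions rather than the yolov5 source code: the function name is hypothetical, the threshold value `anchor_t=4.0` is assumed, and the box and anchor sizes in the usage note are made-up numbers.

```python
def matches_anchor(box_wh, anchor_wh, anchor_t=4.0):
    """Return True when the box and anchor shapes agree within anchor_t.

    box_wh and anchor_wh are (width, height) pairs in the same units.
    anchor_t is an assumed threshold on the worst width or height ratio.
    """
    rw = box_wh[0] / anchor_wh[0]
    rh = box_wh[1] / anchor_wh[1]
    # Take the larger of each ratio and its inverse, so that a box that is
    # too small counts as much as one that is too large.
    worst = max(rw, 1.0 / rw, rh, 1.0 / rh)
    return worst < anchor_t
```

For example, a 30 x 60 box against a 33 x 23 anchor has a worst ratio of about 2.6 and is kept as a positive sample, while a 200 x 20 box against a 10 x 13 anchor has a worst ratio of 20 and is discarded.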

2.4. Membrane Computing

One of the most important application areas of membrane computing is solving optimization problems. Each membrane system contains a unique outermost membrane, i.e., the surface membrane (main membrane). The regions in the membrane structure are not directly connected to one another, so the computation in each region can proceed independently, which accelerates the search for the optimal solution of the algorithm model. Because the parallel computing ability of membrane computing helps the search avoid becoming trapped in local optima, membrane computing can be combined effectively with other bionic artificial intelligence algorithms to improve the accuracy and speed of the original algorithm.

3. Experiment and Analysis

3.1. Experiment

3.1.1. Processing of Data Sets

In order to make the experimental results more general and accurate, the experimental data of this paper mainly included the pictures of nine major diseases and pests in Section 1.2 for the experiment. Table 1 shows the numbers of various pests and diseases.

To make the picture data reliable and diverse, the data set was collected from the Payititi artificial intelligence data set service platform and from manual shooting by the team. About 80% of the data set came from the Payititi platform and 20% from manual shooting. The hand-shot pictures were taken in several farmlands in a village and town in Xiantao City, and the shooting and collection were completed in July 2021. To reproduce detection conditions in real scenes, the distance between the device and the leaf was 0.1–0.4 m when shooting, so as to collect pictures of different scales and light intensities. The public data sets on rice pests and diseases were downloaded from the Payititi platform and screened, and the data suitable for the experiment were integrated with the hand-shot pictures to form a new data set. The public data sets are available at the following links: https://www.payititi.com/opendatasets/show-16.html (accessed on 24 March 2023) and https://www.payititi.com/opendatasets/show-26239.html (accessed on 5 April 2023). In addition, to ensure the validity of the data set, the types of diseases and pests in the pictures screened from the public data set and in the hand-shot pictures were verified by searching for information on the Internet and consulting professional teachers at our school. Data on and pictures of the nine major rice pests and diseases listed in this paper were also searched for online to verify the labels in the data set. The acquired data set was cleaned, renamed, and de-duplicated, and the diseases and pests were annotated with the labelimg tool; targets in blurred parts of a picture were not annotated. Data enhancement methods such as flipping, the random addition of Gaussian noise, filling, and random brightness transformation were used to improve the feature extraction of rice pests and diseases. The prepared data set was divided into a training set, a verification set, and a test set at a ratio of 8:1:1, as shown in Table 2.
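An 8:1:1 division of this kind can be sketched as below. The function name, the fixed shuffle seed, and the use of integer division for the set sizes are assumptions for illustration; the paper does not state how the split was implemented.

```python
import random


def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items and split them into training, verification, and test sets."""
    items = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    # Whatever remains after the first two sets goes to the test set.
    test = items[n_train + n_val:]
    return train, val, test
```

Called on a list of image file names, the function returns three disjoint lists whose sizes follow the 8:1:1 ratio, with any remainder assigned to the test set.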

3.1.2. Evaluation Indexes of the Experiment

This paper mainly introduces RepVGG network structure and SK attention mechanism based on the yolov5 algorithm and replaces the original NMS with Adaptive NMS. The improved algorithm is called the yolov5-RSAN algorithm. The improved algorithm is compared with the original algorithm, and the experimental results show that the improved algorithm has a great improvement in performance. In the detection algorithm, the performance evaluation indexes mainly include Precision, Recall, mAP, AP, fps, etc.

Precision = \frac{TP}{TP + FP}

(2)

Recall = \frac{TP}{TP + FN}

(3)

In the formulas, TP is the number of correctly detected targets, FP is the number of wrongly detected targets, and FN is the number of undetected targets. With Recall as the horizontal coordinate and Precision as the vertical coordinate, the P–R curve is obtained, and the area enclosed by the P–R curve and the two axes is the AP value. The larger the AP value, the better the detection effect. The AP value is calculated as in Formula (4).

AP = \int_{0}^{1} p(r) \, dr

(4)

mAP is obtained by averaging the AP values of all object categories and measures the overall detection effect of the target detection model. fps measures the speed of the detection algorithm: it is the number of pictures (frames) that the network can process and detect per second. The higher the fps, the less time each frame takes and the faster the detection. It is calculated as follows:

fps = frameNum / elapsedTime

(5)
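Formulas (2) to (5) can be sketched in a few lines. The function names are illustrative, and the trapezoidal rule used for AP is an assumption: it is one simple way to approximate the integral in Formula (4) from sampled points, and detection toolkits often use an interpolated variant instead.

```python
def precision(tp, fp):
    """Formula (2): share of detections that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Formula (3): share of real targets that are detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def average_precision(recalls, precisions):
    """Approximate Formula (4) with the trapezoidal rule over P-R points."""
    points = sorted(zip(recalls, precisions))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-category AP values."""
    return sum(ap_per_class) / len(ap_per_class)


def fps(frame_num, elapsed_time):
    """Formula (5): frames processed per second of elapsed time."""
    return frame_num / elapsed_time
```

With 90 correct detections, 10 wrong detections, and 30 missed targets, precision is 0.9 and recall is 0.75; processing 100 frames in 4 seconds gives 25 fps.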

3.1.3. Introducing the RepVGG Network Structure

The RepVGG network structure is a simple and powerful convolutional neural network. Its inference-time body is VGG-like and combines only 3*3 convolution and ReLU, while its training-time model has a multi-branch topology [20]. The training-time and inference-time architectures are decoupled by structural reparameterization. This gives the network both high performance and efficient inference: the multi-branch structure improves network performance during training [21], and structural reparameterization transforms the network into a single-path structure for inference, which reduces video memory usage and improves inference speed [16]. Conv + BN pairs are used extensively in the network, and merging each pair into a single layer reduces the number of layers and improves performance. The convolutional layer formula is shown in Formula (6), and the BN layer formula is shown in Formula (7).

Conv(x) = W(x) + b

(6)

BN(x) = \gamma \cdot \frac{x - mean}{\sqrt{var}} + \beta

(7)

(7)

The conv_3*3 process of the convolutional layer is shown in Figure 4, and the RepVGG network structure is shown in Figure 5.
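Substituting Formula (6) into Formula (7) shows why a conv + BN pair can be merged: BN(Conv(x)) is again a convolution, with weights scaled by γ/√var and bias γ(b − mean)/√var + β. The sketch below checks this on a flattened patch, treating the convolution as one dot product per output channel. The function names are hypothetical, and the small constant `eps` inside the square root is an assumed numerical stabiliser that does not appear in Formula (7).

```python
import math


def conv(x, weight, bias):
    """Formula (6) on a flattened patch: one dot product per output channel."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Formula (7) per output channel; eps is an assumed stabiliser."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]


def fuse_conv_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the BN statistics into the convolution weights and bias."""
    fused_w, fused_b = [], []
    for row, b, g, bt, m, s in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(s + eps)
        fused_w.append([scale * w for w in row])
        fused_b.append(scale * (b - m) + bt)
    return fused_w, fused_b
```

Applying `conv` with the fused weights and bias gives the same output as `conv` followed by `batch_norm`, so inference needs one layer instead of two.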

3.1.4. Introduction of SK Attention Mechanism

Enlarging the receptive field of the convolution kernel yields more information, but in general the information obtained is not differentiated [22]. Introducing weights over the convolution kernels helps to distinguish it to some extent, which gives rise to a convolutional attention mechanism. The SK attention mechanism operates on the convolution kernel, so that different images obtain convolution kernels with different weights [23]. In the SK attention mechanism, three feature maps are obtained by 3*3, 5*5, and 7*7 convolution and are added to form the feature map U, which contains information from a variety of receptive fields. Averaging over the H and W dimensions yields a one-dimensional vector with one value per channel, representing the importance of the information in each channel. After several linear transformations and mappings, the softmax function is used for normalization, so that each channel has a corresponding score representing its importance [24]. Finally, information fusion is carried out to obtain the final output, which integrates information from multiple receptive fields. Figure 6 shows the principle diagram of the SK attention mechanism.
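The fusion steps above (sum, average over H and W, linear mapping, softmax, weighted fusion) can be sketched in plain Python. This is a simplified illustration, not the SK module used in the experiments: the branch feature maps are passed in already computed, and the matrices `fc` stand in for the learned linear transformations and are purely hypothetical.

```python
import math


def sk_fuse(branches, fc):
    """Fuse K branch feature maps shaped [C][H][W].

    fc holds K assumed matrices shaped [C][C], one per branch, that map the
    channel descriptor to per-channel scores.
    """
    k_num, c_num = len(branches), len(branches[0])
    h, w = len(branches[0][0]), len(branches[0][0][0])
    # Sum the branches into U, then average U over H and W for each channel.
    s = []
    for c in range(c_num):
        total = sum(branches[k][c][i][j]
                    for k in range(k_num) for i in range(h) for j in range(w))
        s.append(total / (h * w))
    # Linear mapping: one score per branch and channel.
    logits = [[sum(fc[k][c][j] * s[j] for j in range(c_num))
               for c in range(c_num)] for k in range(k_num)]
    # Softmax across branches, so the weights of each channel sum to 1.
    weights = [[0.0] * c_num for _ in range(k_num)]
    for c in range(c_num):
        peak = max(logits[k][c] for k in range(k_num))
        exps = [math.exp(logits[k][c] - peak) for k in range(k_num)]
        denom = sum(exps)
        for k in range(k_num):
            weights[k][c] = exps[k] / denom
    # Weighted sum of the branch feature maps.
    fused = [[[sum(weights[k][c] * branches[k][c][i][j] for k in range(k_num))
               for j in range(w)] for i in range(h)] for c in range(c_num)]
    return fused, weights
```

With all-zero matrices in `fc`, every branch receives the same weight and the output is the plain average of the branches; learned matrices would instead shift weight toward the receptive field that suits the input.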

3.1.5. Replacing NMS with Adaptive NMS

Adaptive NMS uses a dynamic suppression strategy: the suppression threshold rises when targets cluster and occlude each other, and decays when a target appears alone [25]. A density score is also learned for each target. With a single fixed NMS threshold, a lower threshold loses highly overlapping objects, while a higher threshold causes more false detections, so detection in dense scenes is not very accurate [26]. The density of each target is defined as in Formula (8).

d_{i} = \max_{b_{j} \in ε, \, i \neq j} iou(b_{i}, b_{j})

(8)

The update strategy is shown in Formula (9).

N_{M} = \max(N_{t}, d_{M})

(9)

Here, N_M represents the threshold of Adaptive NMS for target M, N_t is the original fixed NMS threshold, and d_M is the density of target M.
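A greedy suppression loop that follows Formulas (8) and (9) might look like the sketch below. It rests on several assumptions: boxes use the (x1, y1, x2, y2) corner format, the per-box densities are passed in as a list (in Adaptive NMS they are predicted by the network), and the default `nt=0.5` is an illustrative value rather than the one used in the experiments.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def density(i, boxes):
    """Formula (8): largest IOU between box i and any other box in the set."""
    return max((iou(boxes[i], boxes[j]) for j in range(len(boxes)) if j != i),
               default=0.0)


def adaptive_nms(boxes, scores, densities, nt=0.5):
    """Return the indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        m = order.pop(0)
        keep.append(m)
        # Formula (9): the threshold for M rises with its density.
        n_m = max(nt, densities[m])
        order = [i for i in order if iou(boxes[m], boxes[i]) <= n_m]
    return keep
```

With two boxes that overlap at an IOU of about 0.67, a fixed threshold of 0.5 (all densities 0) suppresses the lower-scoring box, while a density of 0.7 for the crowded boxes raises the threshold and keeps both.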

3.1.6. Membrane Computational Optimization

In this paper, a membrane optimization algorithm with a multi-layer nested-membrane structure is used to optimize the up- and down-sampling operations that the improved yolov5 algorithm applies in early image processing, further improving its overall performance. The membrane structure has three subsystems and one surface membrane. Each membrane contains multiple objects; the surface membrane receives and processes the best objects from each subsystem and outputs the final result. Within the membrane structure, evolution rules are applied to the objects in each membrane, and in principle the rules of different membranes execute independently and in parallel. On the experimental platform, the evolution of the membranes is implemented serially: the subsystems evolve separately, better objects are communicated within each subsystem and between the subsystems and the surface membrane, and the better objects are retained as far as possible. First, the size of the object set, the number of objects communicated by each subsystem, the variables and parameters needed for operation, the maximum number of generations, and so on are initialized. Each subsystem then evolves separately according to the evolution rules, without affecting the others. After all the subsystems have evolved, the communication rules are applied within the subsystems and between the subsystems and the surface membrane. When the algorithm reaches the maximum number of runs, it terminates. In the experiment, the communication scale was 35% of the object set size, the maximum number of runs was 50, the crossover probability was 0.45, each membrane contained 30 objects, and the rewriting probability was 0.3.
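The control flow described above can be sketched as a small loop with three subsystems and one surface membrane, using the parameter values reported for the experiment. Everything else is an assumption made for illustration: objects are real-valued vectors, the fitness function is supplied by the caller and minimized, crossover is single-point, rewriting is a small Gaussian perturbation, and the surface membrane sends its best objects back to replace each subsystem's worst ones. The paper does not specify these details.

```python
import random


def evolve(objects, fitness, rng, crossover_p=0.45, rewrite_p=0.3):
    """One evolution step inside a membrane; the better objects are retained."""
    children = []
    for obj in objects:
        child = list(obj)
        if rng.random() < crossover_p:
            mate = rng.choice(objects)
            cut = rng.randrange(1, len(child)) if len(child) > 1 else 0
            child = child[:cut] + list(mate[cut:])
        if rng.random() < rewrite_p:
            idx = rng.randrange(len(child))
            child[idx] += rng.gauss(0.0, 0.1)  # assumed rewriting rule
        children.append(child)
    merged = sorted(objects + children, key=fitness)
    return merged[:len(objects)]


def membrane_optimize(fitness, dim=2, subsystems=3, objects_per_membrane=30,
                      max_runs=50, comm_ratio=0.35, seed=0):
    """Return the best object held by the surface membrane after max_runs."""
    rng = random.Random(seed)
    membranes = [[[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                  for _ in range(objects_per_membrane)]
                 for _ in range(subsystems)]
    n_comm = max(1, int(comm_ratio * objects_per_membrane))
    surface = []
    for _ in range(max_runs):
        # Subsystems evolve separately (serially here, as on the platform).
        membranes = [evolve(m, fitness, rng) for m in membranes]
        # Each subsystem sends its best objects to the surface membrane.
        pool = surface + [obj for m in membranes for obj in m[:n_comm]]
        surface = sorted(pool, key=fitness)[:n_comm]
        # Assumed feedback: the best objects replace each subsystem's worst.
        for i, m in enumerate(membranes):
            membranes[i] = sorted(m[:-n_comm] + [list(o) for o in surface],
                                  key=fitness)
    return surface[0]
```

Because the surface membrane always keeps the best objects it has seen, the quality of the returned object never degrades from one run to the next.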

3.2. Experimental Results and Analysis

3.2.1. Experimental Platform

The experiment was conducted with the PyTorch deep learning framework on the following equipment: an Intel Core i7-12700H CPU, an NVIDIA GeForce RTX 3050Ti GPU with 4 GB of dedicated video memory, 16 GB of RAM, and a 1 TB mechanical hard disk plus a 512 GB SSD. The environment was built on Windows 11 with CUDA 11.1, and training and testing were programmed in Python 3.9.1 with PyTorch 1.9.0 in PyCharm 2020.1.1 x64.

3.2.2. Configuration of Experimental Parameters

To achieve better convergence and detection speed, the Adam optimizer was selected for model training, with the initial learning rate set to 0.0032. The learning rate was adjusted with a periodic cosine annealing strategy; the cosine annealing hyperparameter was set to 0.12, and the learning rate momentum was set to 0.843. To improve recognition accuracy, the training images were normalized and resized to 640*640, and the batch size was set to 32. The maximum number of iterations was 2000, a warm restart was performed every ten epochs, and training stopped when the maximum number of iterations was reached. During testing, the IOU threshold was set to 0.6 and the confidence threshold to 0.35. These values were chosen by debugging the model with confidence thresholds of 0, 0.25, 0.35, and 0.8 and IOU thresholds of 0, 0.35, 0.6, and 1; a confidence threshold of 0.35 with an IOU threshold of 0.6 gave the better result. The training image size of 640*640 was chosen with the capacity of the graphics card in mind. Because the batch size affects model performance and is constrained by the hardware and software of the experimental equipment, batch sizes of 16, 32, and 64 were tested, and 32 was selected. The maximum number of iterations, the epoch interval for warm restarts, and the other parameters were selected based on the actual size of the experimental data set.
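The learning rate schedule can be sketched as a function of the epoch. Two readings of the text are assumed here and are not confirmed by the paper: the cosine annealing hyperparameter 0.12 is treated as the final learning rate expressed as a fraction of the initial one, and the restart every ten epochs is treated as the period of the cosine cycle. The function and parameter names are illustrative.

```python
import math


def learning_rate(epoch, lr0=0.0032, lrf=0.12, period=10):
    """Cosine annealing from lr0 toward lr0 * lrf, restarting every period epochs."""
    # Position inside the current cycle, in the range [0, 1).
    t = (epoch % period) / period
    cosine = 0.5 * (1.0 + math.cos(math.pi * t))
    return lr0 * (lrf + (1.0 - lrf) * cosine)
```

Under these assumptions the rate starts at 0.0032, falls to about 0.0018 halfway through a cycle, and jumps back to 0.0032 at epochs 10, 20, and so on.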

3.2.3. Construction of Improved Algorithm yolov5-RSAN Model

A RepVGG network structure was added to the Backbone of the yolov5 network model. After the slicing operation, the feature map passes through the RepVGG structure, which is structurally reparameterized into a single-path structure for inference. The conv + BN layers are merged to reduce the number of layers and improve speed. The SK attention mechanism was introduced at the back of the Neck module to integrate the feature information collected by the CSP2_X structures, so that different images obtain convolution kernels with different weights. After 3*3, 5*5, and 7*7 convolutions, different feature maps are obtained, and the final feature map contains information from multiple receptive fields. After multiple linear transformations and mappings, an information fusion module with multiple receptive fields is obtained. This strengthens the feature fusion capability of the network and improves its accuracy. In the post-processing module of the network, the original NMS operation was replaced with Adaptive NMS, whose dynamic suppression strategy reduces the false detections that occur at low or high fixed thresholds and the low accuracy of target detection in dense scenes. Finally, membrane computing was adopted to further optimize the performance of the improved model.

3.2.4. Ablation Experiment

To verify the effectiveness of the improved algorithm, ablation experiments were conducted with the same computer configuration and parameter settings [27,28]. We used “y” to indicate that an improvement point was added and “n” to indicate that it was not. Table 3 lists the experimental results.

As can be seen from Table 3, the introduction of the SK attention mechanism significantly improved the accuracy rate, which increased by about 2.7 percentage points. This was because the accuracy rate was greatly improved by increasing the amount of information obtained from the receptive field of the convolutional kernel. The addition of the RepVGG network structure also improved the speed (by about 2.8 percentage points). This was due to the extensive use of conv + BN layers, which reduced inference time and improved speed by merging layers. The improved algorithm recall rate increased by 3.9 percentage points and the mAP improved by 4.3 percentage points. In order to better explain and intuitively observe the performance improvement of the improved algorithm, the following is a diagram of various performance indicators used for the improved algorithm during training (Figure 7).

3.2.5. Performance Comparison of Different Models

To make the experimental results more reliable and to explore the detection performance of different models on rice diseases and pests, Faster R-CNN, yolov1, yolov4, and yolov5 were compared with the yolov5-RSAN algorithm proposed in this paper under the same experimental conditions [29,30,31,32]. Table 4 lists the performance comparison of the test results.

As can be seen from Table 4, the accuracy rate of the yolov5-RSAN algorithm is 1.7, 5.4, 4.6, and 2.7 percentage points higher than that of Faster R-CNN, yolov1, yolov4, and yolov5, respectively, and it is also better than the other models in terms of mAP and fps. The mAP of the yolov5-RSAN algorithm reaches 94.4%, which makes it more suitable for detecting rice diseases and pests. It can detect them effectively, so that prevention and control can be carried out in time and the impact of diseases and pests on rice yield can be reduced to a certain extent. For the major rice pests and diseases considered in this paper, the detection results of the improved algorithm are shown in Figure 8.

The most important and effective way to improve the accuracy of yolov5 target detection is to improve feature extraction. In this paper, the SK attention mechanism is introduced at the back of the Neck module to integrate the feature information collected by each CSP2_X structure, which greatly improves the efficiency of feature extraction and thus accuracy. The accuracy rate was 94.8%. The experimental results show that the accuracy of the improved algorithm is higher than that of the other object detection algorithms compared in this paper. A RepVGG network structure is added to the Backbone; conv + BN layers are merged to reduce the number of layers and improve the speed of the model. According to the ablation results, the fps of model detection increases by about 0.5 f/ms when only the RepVGG network structure is added. In theory, this means that the model can detect an additional 0.5 frames in the same time unit, which indicates that the detection speed is improved. According to the experimental results and the above analysis, the improved yolov5-RSAN algorithm performs better in accuracy, speed, and mAP in particular, and it also reduces the missed and false detections that affect small targets.

4. Conclusions

Rice diseases and pests are complex and diverse, and some of them closely resemble one another, which makes the detection task extremely difficult and leaves detection results short of the expected effect and practical requirements. To address this problem, this paper proposes the yolov5-RSAN algorithm. Based on the yolov5 algorithm, it introduces the RepVGG network structure and adopts layer merging to reduce inference time and improve detection speed. The SK attention mechanism is introduced to enlarge the receptive field of the convolution kernel so as to obtain more information and improve accuracy. The original NMS is replaced with Adaptive NMS, which adopts a dynamic suppression strategy and sets scores for learning density, to reduce the missed and false detection of small targets. Repeated experiments show that, compared with the original yolov5, the improved yolov5-RSAN algorithm raises accuracy from 92.1% to 94.8%, mAP from 90.1% to 94.4%, and fps from 11.0 to 13.8 f/ms (Table 4). These gains are helpful for rice disease and pest detection, prevention, and control, and offer some guidance for research on improving target detection algorithms for crop disease and pest detection. Although the overall performance of the algorithm has improved, its multi-target detection still needs work: detections are occasionally missed when multiple targets are present. This problem can be addressed in subsequent research. Future work will also consider which other methods can further improve performance on the basis of the current results, and how to make the algorithm more lightweight so that it can be embedded in a mobile platform and make detection more convenient.
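To make the dynamic suppression strategy concrete, the following is a minimal sketch of the Adaptive NMS idea: the IoU threshold used around each kept box is raised to that box's density score when the density exceeds the base threshold, so overlapping detections in crowded regions are less likely to be suppressed. This is an illustration under stated assumptions, not the authors' code: the density scores are passed in as plain numbers (in the original Adaptive NMS method they are predicted by a learned sub-network), hard suppression is used for brevity, and the names `iou`, `adaptive_nms`, and `base_thresh` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def adaptive_nms(boxes, scores, densities, base_thresh=0.5):
    """Greedy NMS; the threshold for each kept box M is max(base_thresh, density of M)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        m = order.pop(0)          # highest-scoring remaining box
        keep.append(m)
        thresh = max(base_thresh, densities[m])
        # Drop every remaining box that overlaps M by at least the threshold.
        order = [i for i in order if iou(boxes[m], boxes[i]) < thresh]
    return keep


# Illustrative boxes: the first two overlap heavily (IoU = 2/3), the third is far away.
boxes = [(0, 0, 10, 10), (2, 0, 12, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]

sparse = adaptive_nms(boxes, scores, densities=[0.0, 0.0, 0.0])
crowded = adaptive_nms(boxes, scores, densities=[0.7, 0.7, 0.0])
print(sparse, crowded)
```

With zero density the second box is suppressed as a duplicate, as in ordinary NMS; with a density of 0.7 the threshold rises above the 2/3 overlap and both neighbouring detections are kept.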

Author Contributions

Investigation, H.Z.; Writing—review & editing, H.Y. and D.L.; Supervision, G.Z., H.Z., J.W. and S.Z.; H.Y. and D.L. are responsible for putting forward the concept, putting forward the writing ideas and technical support, and completing the writing of the first draft of the paper. G.Z. and H.Z. are responsible for the format check and modification of the manuscript, the English translation of the manuscript and the collection of related resources. J.W. and S.Z. are responsible for the first and second round of revision of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the School Enterprise Cooperation Project (No. wphu-2021-kj-762 and 1145, No. whpu-2022-kj-1586 and 2153), the Hubei Provincial Teaching and Research Project (No. 2018368), and the Ministry of Education Industry-University Cooperation Collaborative Education Project (No. 220900786024216).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to express our special thanks to Tiancheng Zhang, Jianglin Xiong, Jie Xiao, and Qing Zhao for their help and suggestions in proofreading the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Flowchart of the Faster R-CNN algorithm.

Figure 2. Structure of the yolov1 algorithm.

Figure 3. Structure of the yolov5 algorithm.

Figure 4. conv_3*3 process diagram.

Figure 5. RepVGG network structure.

Figure 6. Schematic diagram of SK attention mechanism.

Figure 7. Performance counters of the yolov5-RSAN algorithm.

Figure 8. Effect of yolov5-RSAN algorithm on rice pest detection.

Table 1. Numbers of various pests and diseases.

Name of Disease and Pest	Number of Pictures Collected in the Experiment (Photos)
Rice blast	365
Sheath blight	402
Malignant seedling disease	380
Smut	410
Borer	350
Rice leaf twister	364
Rice planthopper	403
Oryzacris sinensis	412
Rice leafhopper	375

Table 2. Data set construction division table.

Names of Pests and Diseases	Training Set	Validation Set	Test Set	Total
Rice blast	293	36	36	365
Sheath blight	322	40	40	402
Malignant seedling disease	304	38	38	380
Smut	328	41	41	410
Borer	280	35	35	350
Rice leaf twister	292	36	36	364
Rice planthopper	323	40	40	403
Oryzacris sinensis	330	41	41	412
Rice leafhopper	301	37	37	375
Total	2773	344	344	3461
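The counts in Table 2 are consistent with a roughly 8:1:1 split in which the validation and test sets each receive one tenth of a class's images, rounded down, and the remainder goes to training. This rule is inferred from the numbers in the table, not stated here by the authors. The sketch below applies it to the per-class totals from Table 1; `counts` and `split_counts` are illustrative names.

```python
# Image counts per class, copied from Table 1.
counts = {
    "Rice blast": 365,
    "Sheath blight": 402,
    "Malignant seedling disease": 380,
    "Smut": 410,
    "Borer": 350,
    "Rice leaf twister": 364,
    "Rice planthopper": 403,
    "Oryzacris sinensis": 412,
    "Rice leafhopper": 375,
}


def split_counts(n):
    """Return (train, val, test) with val = test = n // 10 and the rest for training."""
    held_out = n // 10
    return n - 2 * held_out, held_out, held_out


splits = {name: split_counts(n) for name, n in counts.items()}
# Column sums across all classes: (train total, val total, test total).
totals = tuple(sum(column) for column in zip(*splits.values()))

for name, (train, val, test) in splits.items():
    print(f"{name:28s} {train:4d} {val:3d} {test:3d}")
print("Total", totals)
```

The per-class rows and the column totals (2773, 344, 344) match Table 2.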

Table 3. Ablation results.

RepVGG Network Structure	SK Attention Mechanism	Adaptive NMS	Accuracy Rate (%)	Recall Rate (%)	mAP (%)	AP (%)	fps (f/ms)
y	n	n	92.3	89.5	91.2	90.3	11.5
y	y	n	92.6	89.9	91.5	91.3	11.9
y	y	y	94.8	93.1	94.4	96.7	13.8
y	n	y	93.4	92.8	92.3	92.6	12.3
n	y	n	92.9	92.1	92.1	91.5	11.6
n	n	y	92.6	91.6	91.8	90.9	11.8
n	y	y	93.1	91.2	93.5	94.5	13.1
n	n	n	92.1	89.2	90.1	89.3	11.0

Table 4. Comparison of the performance of each algorithm model.

Algorithm Model	Accuracy Rate (%)	mAP (%)	fps (f/ms)
Faster R-CNN	93.1	89.2	8.9
yolov1	89.4	89.5	10.3
yolov4	90.2	90.5	10.5
yolov5	92.1	90.1	11.0
yolov5-RSAN	94.8	94.4	13.8

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
