Automatic Pipeline for Detection and Classification of Phytoplankton Specimens in Digital Microscopy Images of Freshwater Samples

Rivas-Villar, David; Rouco, José; Carballeira, Rafael; Penedo, Manuel G.; Novo, Jorge

doi:10.3390/engproc2021007009

Open Access Proceeding Paper


¹ Centro de Investigación CITIC, Universidade da Coruña, 15071 A Coruña, Spain

² Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, 15006 A Coruña, Spain

³ Centro de Investigacions Científicas Avanzadas (CICA), Facultade de Ciencias, Universidade da Coruña, 15071 A Coruña, Spain

* Author to whom correspondence should be addressed.

† Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021.

Eng. Proc. 2021, 7(1), 9; https://doi.org/10.3390/engproc2021007009

Published: 29 September 2021

(This article belongs to the Proceedings of The 4th XoveTIC Conference)


Abstract

Phytoplankton blooming can compromise the quality and safety of water because of the toxins that some species produce. Therefore, continuous monitoring of water sources is typically required. This task is routinely performed manually by specialists, which limits both the quality and the quantity of these studies. We present an accurate methodology to automate this task using multi-specimen images of phytoplankton acquired with regular microscopes. The presented fully automatic pipeline is capable of detecting and segmenting individual specimens using classic computer vision algorithms. Furthermore, the method can fuse sparse specimens and colonies when needed. Moreover, the system can differentiate genuine phytoplankton from other similar non-phytoplanktonic objects like zooplankton and detritus. These genuine phytoplankton specimens can also be classified into a target set of species, with special focus on the toxin-producing ones. The experiments demonstrate satisfactory and accurate results in each of the steps that compose this pipeline. Thus, this fully automatic system can aid specialists in the routine analysis of water sources.

Keywords: microscope images; phytoplankton detection; colony merging; Gabor filters; deep features; bag of visual words

1. Introduction

Phytoplankton has held scientific attention over the years for several reasons. It is the basis of the food chain in all aquatic environments, produces oxygen through photosynthesis and fixes carbon. Furthermore, several species produce toxins that can contaminate drinking water sources [1]. Thus, continuous monitoring of phytoplankton populations is not only a scientific activity but also a matter of public health. Water sources are currently monitored manually by experts, so automating part of the process is highly desirable. In this work, we present an accurate method based on a systematic microscopic imaging approach that frees experts from operating the microscope [2]. The presented system can segment, identify and classify phytoplankton species, with special focus on the toxin-producing ones [3].

2. Materials and Methods

The presented method is divided into several steps. First, the foreground-background separation stage applies an adaptive Gaussian threshold [4] to each channel of the input image to binarize it. The per-channel results are merged with an OR operator to preserve as much information as possible. Next, to detect every specimen, we employ Suzuki and Abe’s algorithm [5]. In this step, we discard any detection smaller than 5 μm², since objects of that size cannot be phytoplankton. Incomplete specimens cut by the image borders are also discarded. Following this step, we present an algorithm to fuse sparse specimens and colonies whose parts have no evident visual links. We build a Delaunay triangulation [6] linking neighbouring detections and prune the resulting graph according to a colour similarity metric, keeping only the edges between similar neighbours. Finally, neighbouring detections are fused if they are still connected after the pruning step. The output of these first steps is a set of bounding boxes enclosing each specimen.
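The pruning and fusion of neighbouring detections can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the neighbour edges (for instance, those of a Delaunay triangulation) are already available, uses the Euclidean distance between mean RGB colours as a stand-in for the colour similarity metric, and treats `max_colour_dist` and all function names as hypothetical.

```python
def colour_distance(c1, c2):
    """Euclidean distance between two mean-colour triples (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5


def fuse_detections(boxes, colours, edges, max_colour_dist):
    """Fuse detections that remain connected after colour-based pruning.

    boxes:   list of (x_min, y_min, x_max, y_max)
    colours: list of mean (R, G, B) values, one per detection
    edges:   iterable of (i, j) neighbour pairs, e.g. Delaunay edges
    """
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pruning: an edge survives only if both detections have similar colours.
    for i, j in edges:
        if colour_distance(colours[i], colours[j]) <= max_colour_dist:
            parent[find(i)] = find(j)

    # Fusion: each connected component yields one enclosing bounding box.
    groups = {}
    for idx, box in enumerate(boxes):
        root = find(idx)
        if root in groups:
            g = groups[root]
            groups[root] = (min(g[0], box[0]), min(g[1], box[1]),
                            max(g[2], box[2]), max(g[3], box[3]))
        else:
            groups[root] = box
    return sorted(groups.values())


# Hypothetical example: two greenish neighbours and one yellowish object.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (40, 40, 50, 50)]
colours = [(30, 120, 40), (35, 118, 42), (200, 180, 90)]
edges = [(0, 1), (1, 2), (0, 2)]
fused = fuse_detections(boxes, colours, edges, max_colour_dist=20.0)
print(fused)
```

In this toy input the first two detections share a similar colour and are fused into one box, while the third remains separate because both of its edges are pruned.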

Once the specimens are segmented, they must be classified. First, a step separates genuine phytoplankton from non-phytoplanktonic elements. This step is needed because the previous stages are tuned towards recall: they capture most of the phytoplankton but also let through some similar-looking objects. Therefore, the first classification step separates phytoplankton from other similar objects such as zooplankton, mineral particles or organic detritus. After this, a second classification assigns the genuine phytoplankton specimens to a set of relevant species. In this case, the focus is set on two toxin-producing species, Woronichinia naegeliana and Anabaena spiroides, and a harmless but complex one, Dinobryon sociale. Lastly, this classification also includes an "Others" class that covers all the remaining phytoplankton species.

For these classification steps we test several features. To capture texture information, we use Gabor filter banks with a Bag of Visual Words (BoVW). Colour information is also gathered using a BoVW built from each of the RGB channels. Finally, we use Deep Features extracted from a ResNet50 [7] pretrained on ImageNet [8]. Each feature type is tested both masked and unmasked; that is, the features are obtained either from the whole bounding box or only from the area of the specimen, using the segmentation mask. These features are used in combination with Random Forest (RF) and Support Vector Machine (SVM) classifiers.
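To make the masked versus unmasked distinction concrete, the sketch below builds a BoVW histogram from local descriptors, optionally restricted to those that fall inside the specimen mask. It is an illustration under stated assumptions: the descriptors stand in for per-pixel Gabor or colour responses, the codebook is taken as given (in practice it would typically be learned by clustering), and the function name and toy values are hypothetical.

```python
def bovw_histogram(descriptors, codebook, mask=None):
    """Normalized Bag of Visual Words histogram.

    descriptors: list of feature vectors, one per pixel or patch
    codebook:    list of codeword vectors (assumed precomputed)
    mask:        optional list of booleans; when given, only descriptors
                 flagged True (inside the specimen) are counted
    """
    counts = [0] * len(codebook)
    for idx, desc in enumerate(descriptors):
        if mask is not None and not mask[idx]:
            continue
        # Assign the descriptor to its nearest codeword (squared Euclidean).
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(desc, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Hypothetical two-word codebook and four descriptors from one bounding box.
codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.8, 0.9), (0.0, 0.2)]
inside_specimen = [True, True, True, False]

unmasked = bovw_histogram(descriptors, codebook)
masked = bovw_histogram(descriptors, codebook, mask=inside_specimen)
print(unmasked, masked)
```

The unmasked histogram counts background descriptors as well, so the two variants can differ noticeably when the specimen fills only part of its bounding box.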

All the experiments were carried out on the same microscopic image dataset. Contrary to the state of the art, this dataset was captured using fixed focal points and magnification. This greatly complicates the automated task but frees the specialists from operating the microscope, as any technician can follow the systematic approach. The first steps, which segment the specimens, are trained on a random subset of 50 images; the remaining images form the test set used to evaluate the algorithms. The classification steps employ an 80-20% split with 10-fold cross-validation and grid search to determine the best parameters for the features and classifiers.
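The text does not detail how the 80-20% split and the 10-fold cross-validation are combined. One plausible reading, sketched below purely as an assumption, holds out 20% of the samples for testing and runs the 10-fold cross-validation (where a grid search would select parameters) on the remaining 80%. The function name, seed and sample count are hypothetical.

```python
import random


def holdout_then_kfold(n_samples, test_fraction=0.2, n_folds=10, seed=0):
    """Split sample indices into a held-out test set and k CV folds.

    Returns (test_idx, splits), where splits is a list of
    (train_idx, val_idx) pairs drawn from the non-test samples.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    n_test = round(n_samples * test_fraction)
    test_idx, dev_idx = indices[:n_test], indices[n_test:]

    # Interleaved slicing gives folds whose sizes differ by at most one.
    folds = [dev_idx[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        val_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train_idx, val_idx))
    return test_idx, splits


# Hypothetical dataset of 100 specimens.
test_idx, splits = holdout_then_kfold(100)
print(len(test_idx), len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each candidate parameter combination would then be scored on the validation folds, and only the selected configuration evaluated on the held-out indices.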

The ground truth of the dataset consists of bounding boxes containing the phytoplankton specimens, each with an associated label identifying the species as marked by an expert.

3. Results and Conclusions

For the specimen detection and merging steps, we obtain a False Negative Rate (FNR) of 0.4%. We count as positives the cases where bounding boxes enclose at least 50% of the specimens’ area. Overall this step is satisfactory, missing very few specimens.
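The 50% enclosure criterion and the resulting FNR can be illustrated with the sketch below. It assumes, as a simplification, that the specimen's area is approximated by its ground-truth bounding box and that a specimen is detected when any single detection box covers at least half of it; the function names and example boxes are hypothetical.

```python
def enclosed_fraction(gt, det):
    """Fraction of the ground-truth box area that lies inside the detection.

    Boxes are (x_min, y_min, x_max, y_max) with positive area.
    """
    w = min(gt[2], det[2]) - max(gt[0], det[0])
    h = min(gt[3], det[3]) - max(gt[1], det[1])
    if w <= 0 or h <= 0:
        return 0.0
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return (w * h) / gt_area


def false_negative_rate(gt_boxes, det_boxes, min_fraction=0.5):
    """Share of ground-truth specimens not sufficiently enclosed by any detection."""
    missed = sum(
        1
        for gt in gt_boxes
        if not any(enclosed_fraction(gt, det) >= min_fraction for det in det_boxes)
    )
    return missed / len(gt_boxes)


# Hypothetical example: four specimens, three detections.
gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60), (80, 80, 90, 90)]
det_boxes = [(0, 0, 10, 10), (24, 20, 34, 30), (58, 50, 68, 60)]
fnr = false_negative_rate(gt_boxes, det_boxes)
print(fnr)
```

Here the first two specimens are covered at 100% and 60%, the third only at 20%, and the fourth has no detection at all, so half of the specimens count as missed.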

The phytoplankton identification step, which separates phytoplankton from other spurious elements, is evaluated using precision at high levels of recall, such as 90% or 95%. In particular, the best precision at 90% recall is 84.07%, obtained using an SVM that only uses unmasked Deep Features, as adding any other feature reported no benefit. At 95% recall, the best precision is obtained by RF with the combination of all unmasked features. Overall, masking the features did not improve this step; if anything, it had the opposite effect. Despite the complexities due to the heterogeneity of the classes, the first classification step shows accurate results.
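Precision at a fixed recall level can be computed by sweeping the decision threshold over the classifier scores and keeping the best precision among the thresholds that reach the target recall. The sketch below shows one way to do this; the function name, labels and scores are hypothetical and do not come from the paper's data.

```python
def precision_at_recall(labels, scores, target_recall):
    """Best precision among thresholds whose recall is >= target_recall.

    labels: list of 0/1 ground-truth values (1 = phytoplankton)
    scores: list of classifier scores, higher means more likely positive
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    total_pos = sum(labels)
    best = 0.0
    tp = 0
    for rank, (score, label) in enumerate(ranked, start=1):
        tp += label
        # With tied scores, a threshold cannot separate the tied items,
        # so evaluate only at the last item of each tie group.
        if rank < len(ranked) and ranked[rank][0] == score:
            continue
        recall = tp / total_pos
        if recall >= target_recall:
            best = max(best, tp / rank)
    return best


# Hypothetical scores for ten candidate objects, six of them phytoplankton.
scores = [0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0]
p90 = precision_at_recall(labels, scores, 0.90)
print(p90)
```

With six positives, 90% recall requires all six to be retrieved, which first happens at the eighth-ranked object, giving a precision of 6/8.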

Regarding the species classification, the best performance is obtained with masked features, combining Deep Features with colour features. In this case, RF performs better than SVM, obtaining a top result of 87.50% global classification accuracy and an F1-score of 87.99%. In terms of particular results for each species, W. naegeliana obtains an accuracy of 94.53%, A. spiroides 97.66%, D. sociale 94.53% and the "Others" class 88.28%. This step demonstrates satisfactory performance despite the inherent complexities of species classification, such as morphological similarities among different species.
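The per-species accuracies can exceed the global accuracy if they are computed in a one-vs-rest fashion, where true negatives for a class also count as correct. The text does not state how they were computed, so the sketch below is only one possible reading; the confusion matrix is hypothetical and is not the paper's data.

```python
def global_accuracy(cm):
    """Fraction of samples on the diagonal of a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def one_vs_rest_accuracy(cm, k):
    """Accuracy of the binary problem 'class k' versus 'any other class'."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(row[k] for row in cm) - tp
    tn = total - tp - fn - fp
    return (tp + tn) / total


# Hypothetical 4-class confusion matrix (three species plus "Others").
cm = [
    [8, 0, 0, 2],
    [0, 9, 0, 1],
    [1, 0, 7, 2],
    [2, 0, 1, 15],
]
overall = global_accuracy(cm)
per_class = [one_vs_rest_accuracy(cm, k) for k in range(len(cm))]
print(overall, per_class)
```

In this toy matrix the global accuracy is 39/48, while every one-vs-rest accuracy is higher because each binary problem also credits the correctly rejected samples of the other classes.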

Image examples of the results of the classification steps can be seen in Figure 1, which also shows the bounding boxes that the system detects.

Overall, the performance in each of the different steps has been satisfactory, despite the particular complexities that each one of them shows, like similarities among different phytoplankton species or the variations among a single species. Therefore, we can say that the methodology presented in this work can be of notable help to the trained taxonomists that usually carry out potability analysis in water sources.

Author Contributions

Conceptualization, J.R. and J.N.; methodology, D.R.-V.; software, D.R.-V.; validation, D.R.-V.; formal analysis, D.R.-V.; investigation, D.R.-V.; resources, M.G.P., R.C., J.R. and J.N.; data curation, R.C. and D.R.-V.; writing—original draft preparation, D.R.-V.; writing—review and editing, J.R. and J.N.; visualization, D.R.-V.; supervision, J.R. and J.N.; project administration, J.R. and J.N.; funding acquisition, J.R. and J.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Consellería de Cultura, Educación e Universidade, Xunta de Galicia through the predoctoral grant contract ref. ED481A 2021/147 and Grupos de Referencia Competitiva, grant ref. ED431C 2020/24; CITIC, Centro de Investigación de Galicia ref. ED431G 2019/01, receives financial support from Consellería de Educación, Universidade e Formación Profesional, Xunta de Galicia, through the ERDF (80%) and Secretaría Xeral de Universidades (20%).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zamyadi, A.; Choo, F.; Newcombe, G.; Stuetz, R.; Henderson, R.K. A review of monitoring technologies for real-time management of cyanobacteria: Recent advances and future direction. Trends Analyt. Chem. 2016, 85, 83–96.
2. Rivas-Villar, D.; Rouco, J.; Penedo, M.G.; Carballeira, R.; Novo, J. Automatic Detection of Freshwater Phytoplankton Specimens in Conventional Microscopy Images. Sensors 2020, 20, 6704.
3. Rivas-Villar, D.; Rouco, J.; Carballeira, R.; Penedo, M.G.; Novo, J. Fully automatic detection and classification of phytoplankton specimens in digital microscopy images. Comput. Methods Programs Biomed. 2021, 200, 105923.
4. Parker, J.R. Algorithms for Image Processing and Computer Vision, 2nd ed.; Wiley Publishing: Indianapolis, IN, USA, 2010.
5. Suzuki, S.; Abe, K. Topological structural analysis of digitized binary images by border following. Comput. Vis. Image Underst. 1985, 30, 32–46.
6. Delaunay, B. Sur la sphère vide. Bulletin de l’Académie des Sciences de l’URSS, Classe des Sciences Mathématiques et Naturelles 1934, 6, 793–800.
7. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
8. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Li, F.-F. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 248–255.

Figure 1. Examples of phytoplankton detection (left) and species classification (right). In the left image, true positives are represented in green and true negatives in blue. In the right image, W. naegeliana is shown in red, A. spiroides in magenta and D. sociale in green.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
