Convolutional Neural Networks and Transformers-Based Techniques for Underwater Marine Debris Classification: A Comparative Study

IRIS

Marine debris poses a substantial threat to ecosystems and their inhabitants, representing a worldwide issue. It is crucial to understand the nature and extent of this environmental issue, given its widespread distribution and diverse nature. In recent years, aerial and underwater imaging has been extensively employed to monitor this debris. The classification of marine debris facilitates the identification of the specific types involved and is crucial for implementing preventive measures. Nevertheless, monitoring demands substantial human effort, indicating the need for automated and cost-effective methodologies. In this context, artificial intelligence has emerged as a powerful tool for identifying underwater marine debris, offering several advantages over conventional monitoring processes, such as manual visual surveys conducted by divers or the use of remotely operated vehicles. We evaluated deep architectures through transfer learning, utilizing knowledge derived from the ImageNet data set and applying it to a smaller data set, considering the limited availability of underwater marine data. While prior comparative studies focused exclusively on convolutional neural network (CNN)-based architectures, this article additionally incorporates state-of-the-art attention-based deep architectures for classifying underwater marine debris. We utilized multiple training schemes for classifying marine debris, employing a data set sourced from the database of The Japan Agency for Marine-Earth Science and Technology, which comprises a library of deep-sea images. The attention-based shifted window (Swin) architecture demonstrated the highest accuracy at 93.08%, albeit with a relatively higher parameter count, followed by DenseNet121 with an accuracy of 90.99%. MobileNetV2, with the fewest parameters (2.26 M), achieved an accuracy of 84.91%.
The comparative analysis in this study indicates that transformer-based models exhibited relatively higher accuracies than CNNs. However, the selection between CNNs and transformer-based architectures depends on multiple factors, such as data availability, computational resources, and the marine environment.

Bushra Jalil;Luca Valcarenghi;Luca Maggiani

2024-01-01

Publication year

2024

Appears in types:

1.1 Journal Article

Files in this record:

File	Type	License	Access	Size	Format
Convolutional_Neural_Networks_and_Transformers-Based_Techniques_for_Underwater_Marine_Debris_Classification_A_Comparative_Study.pdf	Pre-print/Submitted manuscript	Publisher's copyright	Authorized users only	4.11 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11382/571832
