A hardware accelerator to support deep learning processor units in real-time image processing
Cittadini, Edoardo; Marinoni, Mauro; Buttazzo, Giorgio
2025-01-01
Abstract
Deep neural networks are becoming crucial in many cyber–physical systems involving complex perceptual tasks. For embedded systems that require real-time interaction with dynamic environments, such as autonomous robots and drones, it is of paramount importance that such algorithms execute efficiently onboard, on properly designed hardware accelerators, to meet the required performance specifications. In particular, some neural network architectures for object detection and tracking, such as You Only Look Once (YOLO), include computationally heavy stages that must be executed before and after the model inference. These stages are typically not incorporated in traditional accelerators and are instead executed on general-purpose processors, introducing a bottleneck in the overall processing pipeline. To overcome this problem, this paper presents a general-purpose accelerator on a field-programmable gate array (FPGA) able to run the pre-processing and post-processing operations typically required by vision tasks. The proposed solution was tested in combination with a YOLO object detector accelerated on an Advanced Micro Devices (AMD) Xilinx Kria KR260 board mounting an UltraScale+ multiprocessor system-on-chip, achieving a significant improvement in both timing performance and power consumption and enabling onboard visual processing on drones. The proposed solution boosts the traditional object detection process by a factor of 4.4, allowing the full processing pipeline to run at 60 frames per second (fps), compared with the 13.6 fps achievable without the proposed accelerator. As a result, this work enables the use of high-speed cameras for developing more reactive systems that can respond to incoming events with lower latency.
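To make the offloaded stages concrete, below is a minimal Python/NumPy sketch of the kind of pre-processing (letterbox resize and pixel normalization) and post-processing (greedy non-maximum suppression over decoded boxes) a YOLO pipeline performs on a general-purpose processor. Function names, tensor shapes, and thresholds are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of the CPU-side YOLO pre/post-processing stages that
# the paper offloads to the FPGA fabric; all details here are assumptions.
import numpy as np

def preprocess(frame: np.ndarray, size: int = 416) -> np.ndarray:
    """Letterbox-resize an HxWx3 uint8 frame to size x size, scale to [0, 1]."""
    h, w, _ = frame.shape
    scale = size / max(h, w)
    nh, nw = int(h * scale), int(w * scale)
    # Nearest-neighbour resize kept for self-containment; real pipelines
    # typically use bilinear interpolation.
    ys = (np.arange(nh) / scale).astype(int)
    xs = (np.arange(nw) / scale).astype(int)
    canvas = np.full((size, size, 3), 114, dtype=np.uint8)  # grey padding
    canvas[:nh, :nw] = frame[ys][:, xs]
    return canvas.astype(np.float32) / 255.0

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.45) -> list:
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes."""
    area = lambda b: (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    order = scores.argsort()[::-1]           # indices by descending confidence
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection-over-union of the best box against the remaining ones.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (area(boxes[i:i + 1]) + area(boxes[rest]) - inter + 1e-9)
        order = rest[iou < iou_thr]           # drop heavily overlapping boxes
    return keep

# Illustrative usage with a dummy 640x480 camera frame and random detections.
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
tensor = preprocess(frame)                    # would be fed to the DPU
boxes = np.array([[10, 10, 100, 100], [12, 12, 98, 98], [200, 200, 260, 260]],
                 dtype=np.float32)
print(nms(boxes, np.array([0.9, 0.8, 0.7])))  # -> [0, 2]
```

Running such stages on the embedded CPU is what limits the baseline pipeline to 13.6 fps; moving them into the FPGA fabric alongside the deep learning processor unit yields the reported 60 fps, i.e. a speedup of 60 / 13.6 ≈ 4.4.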
| File | Size | Format |
|---|---|---|
| 1-s2.0-S0952197625001599-main.pdf (open access; type: publisher's PDF; license: Creative Commons) | 3.78 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.