Extending OpenStack Monasca for Predictive Elasticity Control

Lanciano G.; Galli F.; Cucinotta T.; Bacciu D.
2024-01-01

Abstract

Traditional auto-scaling approaches are conceived as reactive automations, typically triggered when resource-consumption metrics breach predefined thresholds. Managing such rules at scale is cumbersome, especially when resources require non-negligible time to be instantiated. This paper introduces an architecture for predictive cloud operations, which enables orchestrators to apply time-series forecasting techniques to estimate the evolution of relevant metrics and make decisions based on the predicted state of the system. In this way, they can anticipate load peaks and trigger appropriate scaling actions in advance, such that new resources are available when needed. The proposed architecture is implemented in OpenStack, extending the monitoring capabilities of Monasca by injecting short-term forecasts of standard metrics. We use our architecture to implement predictive scaling policies leveraging linear regression, autoregressive integrated moving average, feed-forward, and recurrent neural networks (RNN). We then evaluate their performance on a synthetic workload, comparing it to that of a traditional policy. To assess the ability of the different models to generalize to unseen patterns, we also evaluate them on traces from a real content delivery network (CDN) workload. In particular, the RNN model exhibits the best overall performance in terms of prediction error, observed client-side response latency, and forecasting overhead. The implementation of our architecture is open-source.
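As an aside, the core idea of the abstract (evaluating a standard threshold rule on a short-term forecast of a metric, rather than on its current value, so that scale-out starts before the peak arrives) can be illustrated with a minimal, self-contained sketch. The snippet below uses a plain linear-regression forecaster, the simplest of the models compared in the paper; the function names, parameters, and sample data are illustrative assumptions and do not correspond to the actual Monasca integration described in the work.

```python
# Minimal sketch of threshold-based scaling evaluated on a forecasted metric.
# All names and values here are illustrative; the paper's implementation
# injects forecasts into Monasca rather than using standalone code like this.
import numpy as np

def forecast_linear(history, horizon):
    """Forecast `horizon` future samples of a metric via a linear fit
    over the observed window (one of the simplest models considered)."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, deg=1)
    future_t = np.arange(len(history), len(history) + horizon)
    return slope * future_t + intercept

def should_scale_out(history, horizon, threshold):
    """Return True if the forecast breaches the threshold within the
    horizon, i.e. new instances should be spawned now so that they are
    ready when the load peak actually materializes."""
    return bool(np.any(forecast_linear(history, horizon) > threshold))

if __name__ == "__main__":
    # Hypothetical CPU utilization samples (%), one per minute, ramping up.
    cpu = [35, 37, 41, 44, 49, 53, 58, 62, 67, 71]
    # Look 5 minutes ahead: enough lead time to boot a new instance.
    print(should_scale_out(cpu, horizon=5, threshold=80.0))  # -> True
```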
Files in this item:

File: BDMA-2023.pdf
Access: open access
Type: Pre-print/Submitted manuscript
License: Creative Commons
Size: 24.18 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11382/569732