Dual Scaling VMs and Queries: Cost-Effective Latency Curtailment

Juan F. Perez; Robert Birke; Mathias Bjorkqvist; Lydia Y. Chen

doi:10.1109/ICDCS.2017.231

Research output: Chapter in Book/Report › Conference contribution

6 Citations (Scopus)

Abstract

Wimpy virtual instances equipped with small numbers of cores and RAM are popular public and private cloud offerings because of their low cost for hosting applications. The challenge is how to run latency-sensitive applications using such instances, which trade off performance for cost. In this study, we analytically and experimentally show that simultaneously scaling resources at coarse granularity and workloads, i.e., submitting multiple query clones to different servers, at fine granularity can overcome the performance disadvantages of wimpy VM instances and achieve stringent latency targets that are even lower than the average execution times of wimpy servers. To this end, we first derive a closed-form analysis for the latency under any given VM provisioning and query replication level, considering cloning policies that can (not) terminate outstanding clones with (without) an overhead. Validated on trace-driven simulations, our analysis is able to accurately predict the latency and efficiently search for the optimal number of VMs and clones. Secondly, we develop a dual elastic scaler, DuoScale, that dynamically scales VMs and clones according to the workload dynamics so as to achieve the target latency in a cost-effective manner. The effectiveness of DuoScale lies in the observation that the application performance only scales sub-linearly with increasing vertical or horizontal resource provisioning, i.e., resources per VM or number of VMs. We evaluate DuoScale against VM-only scaling strategies via extensive trace-driven simulations as well as experimental results on a cloud test-bed. Our results show that DuoScale is able to achieve the stringent target latency by using clones on wimpy VMs with cost savings up to 50%, compared to scaling brawny VMs that have better performance at a higher unit cost.
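The claim that query cloning can push latency below the mean execution time of a single wimpy server can be illustrated with a small simulation. The sketch below is not the paper's closed-form model: it assumes independent, exponentially distributed execution times, no queueing, and no cancellation overhead, and the function name `mean_latency_with_clones` is hypothetical. Under those assumptions the first of k clones to finish has mean latency equal to the single-server mean divided by k.

```python
import random


def mean_latency_with_clones(num_clones, mean_exec_time, num_queries=20000, seed=0):
    """Estimate mean query latency when each query is sent to `num_clones`
    servers and the first response wins.

    Assumptions (illustrative only, not taken from the paper):
    execution times are i.i.d. exponential, servers are idle on arrival,
    and terminating the losing clones costs nothing.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_exec_time
    total = 0.0
    for _ in range(num_queries):
        # Latency of one query is the fastest of its clones.
        total += min(rng.expovariate(rate) for _ in range(num_clones))
    return total / num_queries


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, round(mean_latency_with_clones(k, mean_exec_time=1.0), 3))
```

In a real deployment the extra clones add load to the servers, which is why the number of VMs and the number of clones have to be chosen together rather than independently.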

Original language	English (US)
Title of host publication	Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	988-998
Number of pages	11
ISBN (electronic)	9781538617915
DOI	https://doi.org/10.1109/ICDCS.2017.231
Status	Published - Jul 13 2017
Event	37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017 - Atlanta, United States. Duration: Jun 5 2017 → Jun 8 2017

Conference

Conference	37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017
Country/Territory	United States
City	Atlanta
Period	6/5/17 → 6/8/17

ASJC Scopus subject areas

Software
Hardware and Architecture
Computer Networks and Communications

Access to document

10.1109/ICDCS.2017.231

Cite this

@inproceedings{2196aaf0d9df4baebc12316034d5ba0f,

title = "Dual Scaling VMs and Queries: Cost-Effective Latency Curtailment",

abstract = "Wimpy virtual instances equipped with small numbers of cores and RAM are popular public and private cloud offerings because of their low cost for hosting applications. The challenge is how to run latency-sensitive applications using such instances, which trade off performance for cost. In this study, we analytically and experimentally show that simultaneously scaling resources at coarse granularity and workloads, i.e., submitting multiple query clones to different servers, at fine granularity can overcome the performance disadvantages of wimpy VM instances and achieve stringent latency targets that are even lower than the average execution times of wimpy servers. To such an end, we first derive a closed-form analysis for the latency under any given VM provisioning and query replication level, considering cloning policies that can (not) terminate outstanding clones with (without) an overhead. Validated on trace-driven simulations, our analysis is able to accurately predict the latency and efficiently search for the optimal number of VMs and clones. Secondly, we develop a dual elastic scaler, DuoScale, that dynamically scales VMs and clones according to the workload dynamics so as to achieve the target latency in a cost-effective manner. The effectiveness of DuoScale lies on the observation that the application performance only scales sub-linearly with increasing vertical or horizontal resource provisioning, i.e., resources per VM or number of VMs. We evaluate DuoScale against VM-only scaling strategies via extensive trace-driven simulations as well as experimental results on a cloud test-bed. Our results show that DuoScale is able to achieve the stringent target latency by using clones on wimpy VMs with cost savings up to 50%, compared to scaling brawny VMs that have better performance at a higher unit cost.",

author = "Perez, {Juan F.} and Robert Birke and Mathias Bjorkqvist and Chen, {Lydia Y.}",

year = "2017",

month = jul,

day = "13",

doi = "10.1109/ICDCS.2017.231",

language = "English (US)",

pages = "988--998",

booktitle = "Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

address = "United States",

note = "37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017 ; Conference date: 05-06-2017 Through 08-06-2017",

}

Perez, JF, Birke, R, Bjorkqvist, M & Chen, LY 2017, Dual Scaling VMs and Queries: Cost-Effective Latency Curtailment. In Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017., 7980040, Institute of Electrical and Electronics Engineers Inc., pp. 988-998, 37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017, Atlanta, United States, 6/5/17. https://doi.org/10.1109/ICDCS.2017.231

Dual Scaling VMs and Queries: Cost-Effective Latency Curtailment. / Perez, Juan F.; Birke, Robert; Bjorkqvist, Mathias et al.
Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017. Institute of Electrical and Electronics Engineers Inc., 2017. p. 988-998 7980040.

TY - GEN

T1 - Dual Scaling VMs and Queries

T2 - 37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017

AU - Perez, Juan F.

AU - Birke, Robert

AU - Bjorkqvist, Mathias

AU - Chen, Lydia Y.

PY - 2017/7/13

Y1 - 2017/7/13

N2 - Wimpy virtual instances equipped with small numbers of cores and RAM are popular public and private cloud offerings because of their low cost for hosting applications. The challenge is how to run latency-sensitive applications using such instances, which trade off performance for cost. In this study, we analytically and experimentally show that simultaneously scaling resources at coarse granularity and workloads, i.e., submitting multiple query clones to different servers, at fine granularity can overcome the performance disadvantages of wimpy VM instances and achieve stringent latency targets that are even lower than the average execution times of wimpy servers. To such an end, we first derive a closed-form analysis for the latency under any given VM provisioning and query replication level, considering cloning policies that can (not) terminate outstanding clones with (without) an overhead. Validated on trace-driven simulations, our analysis is able to accurately predict the latency and efficiently search for the optimal number of VMs and clones. Secondly, we develop a dual elastic scaler, DuoScale, that dynamically scales VMs and clones according to the workload dynamics so as to achieve the target latency in a cost-effective manner. The effectiveness of DuoScale lies on the observation that the application performance only scales sub-linearly with increasing vertical or horizontal resource provisioning, i.e., resources per VM or number of VMs. We evaluate DuoScale against VM-only scaling strategies via extensive trace-driven simulations as well as experimental results on a cloud test-bed. Our results show that DuoScale is able to achieve the stringent target latency by using clones on wimpy VMs with cost savings up to 50%, compared to scaling brawny VMs that have better performance at a higher unit cost.

AB - Wimpy virtual instances equipped with small numbers of cores and RAM are popular public and private cloud offerings because of their low cost for hosting applications. The challenge is how to run latency-sensitive applications using such instances, which trade off performance for cost. In this study, we analytically and experimentally show that simultaneously scaling resources at coarse granularity and workloads, i.e., submitting multiple query clones to different servers, at fine granularity can overcome the performance disadvantages of wimpy VM instances and achieve stringent latency targets that are even lower than the average execution times of wimpy servers. To such an end, we first derive a closed-form analysis for the latency under any given VM provisioning and query replication level, considering cloning policies that can (not) terminate outstanding clones with (without) an overhead. Validated on trace-driven simulations, our analysis is able to accurately predict the latency and efficiently search for the optimal number of VMs and clones. Secondly, we develop a dual elastic scaler, DuoScale, that dynamically scales VMs and clones according to the workload dynamics so as to achieve the target latency in a cost-effective manner. The effectiveness of DuoScale lies on the observation that the application performance only scales sub-linearly with increasing vertical or horizontal resource provisioning, i.e., resources per VM or number of VMs. We evaluate DuoScale against VM-only scaling strategies via extensive trace-driven simulations as well as experimental results on a cloud test-bed. Our results show that DuoScale is able to achieve the stringent target latency by using clones on wimpy VMs with cost savings up to 50%, compared to scaling brawny VMs that have better performance at a higher unit cost.

UR - http://www.scopus.com/inward/record.url?scp=85027252830&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85027252830&partnerID=8YFLogxK

U2 - 10.1109/ICDCS.2017.231

DO - 10.1109/ICDCS.2017.231

M3 - Conference contribution

AN - SCOPUS:85027252830

SP - 988

EP - 998

BT - Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 5 June 2017 through 8 June 2017

ER -
