Variability-aware request replication for latency curtailment


Zhan Qiu, Juan F. Pérez, Peter G. Harrison

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Processing time variability is commonplace in distributed systems, where resources display disparate performance due to, e.g., different workload levels, background processes, and contention in virtualized environments. However, it is paramount for service providers to keep variability in response time under control in order to offer responsive services. We investigate how request replication can be used to exploit processing time variability to reduce response times, considering not only mean values but also the tail of the response time distribution. We focus on the distributed setup, where replication is achieved by running copies of requests on multiple servers that otherwise evolve independently, and waiting for the first replica to complete service. We construct models that capture the evolution of a system with replicated requests using approximate methods and observe that highly variable service times offer the best opportunities for replication, reducing the response time tail in particular. Further, the effect of replication is non-uniform over the response time distribution: gains in one metric, e.g., the mean, can be at the cost of another, e.g., the tail percentiles. This is demonstrated in a wide range of numerical virtual experiments. It can be seen that capturing service time variability is key to the evaluation of latency tolerance strategies and in their design.
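As a rough illustration of the first-replica-wins mechanism described in the abstract, the following Python sketch draws service times for each replica independently and keeps the minimum. It is not the authors' model: it ignores queueing and the extra load that replicas place on servers, so it cannot reproduce the mean-versus-tail trade-off the paper reports. The two service-time distributions (an Erlang-10 and a two-phase hyperexponential, both with mean 1.0), the function names, and all parameter values are assumptions chosen only to contrast low and high variability.

```python
import random
import statistics


def hyperexp(rng, p=0.9, fast_mean=0.5, slow_mean=5.5):
    """High-variability sample: usually fast, occasionally slow (overall mean 1.0)."""
    mean = fast_mean if rng.random() < p else slow_mean
    return rng.expovariate(1.0 / mean)


def erlang(rng, k=10, mean=1.0):
    """Low-variability sample: sum of k exponential phases with the given total mean."""
    return sum(rng.expovariate(k / mean) for _ in range(k))


def replicated_service_time(sampler, rng, replicas):
    """Time until the first of `replicas` independent copies completes."""
    return min(sampler(rng) for _ in range(replicas))


def summarize(sampler, replicas, n=20000, seed=1):
    """Return (mean, empirical 99th percentile) over n simulated requests."""
    rng = random.Random(seed)
    times = sorted(replicated_service_time(sampler, rng, replicas) for _ in range(n))
    return statistics.fmean(times), times[int(0.99 * n)]


if __name__ == "__main__":
    for name, sampler in (("erlang-10", erlang), ("hyperexp", hyperexp)):
        for d in (1, 2, 3):
            mean, p99 = summarize(sampler, d)
            print(f"{name:10s} replicas={d} mean={mean:.3f} p99={p99:.3f}")
```

Under these assumptions, both the mean and the 99th percentile shrink far more for the hyperexponential case than for the Erlang case as replicas are added, which is consistent with the abstract's observation that highly variable service times benefit most. A model that accounts for server load, as the paper does, is needed to see where replication starts to hurt.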
Original language: English (US)
Title of host publication: IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications
Publisher: Institute of Electrical and Electronics Engineers Inc.
Volume: 2016-July
ISBN (Electronic): 9781467399531
DOI: https://doi.org/10.1109/INFOCOM.2016.7524365
State: Published - Jul 27 2016
Externally published
Event: 35th Annual IEEE International Conference on Computer Communications, IEEE INFOCOM 2016 - San Francisco
Duration: Apr 10 2016 to Apr 14 2016

Conference

Conference: 35th Annual IEEE International Conference on Computer Communications, IEEE INFOCOM 2016
Country: United States
City: San Francisco
Period: 4/10/16 to 4/14/16

Fingerprint

Processing
Servers
Experiments

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
  • Electrical and Electronic Engineering

Cite this

Qiu, Z., Pérez, J. F., & Harrison, P. G. (2016). Variability-aware request replication for latency curtailment. In IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications (Vol. 2016-July). [7524365] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/INFOCOM.2016.7524365
Qiu, Zhan ; Pérez, Juan F. ; Harrison, Peter G. / Variability-aware request replication for latency curtailment. IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications. Vol. 2016-July Institute of Electrical and Electronics Engineers Inc., 2016.
@inproceedings{41d357b44ccb42f3aa8ea38a1b1502df,
title = "Variability-aware request replication for latency curtailment",
abstract = "Processing time variability is commonplace in distributed systems, where resources display disparate performance due to, e.g., different workload levels, background processes, and contention in virtualized environments. However, it is paramount for service providers to keep variability in response time under control in order to offer responsive services. We investigate how request replication can be used to exploit processing time variability to reduce response times, considering not only mean values but also the tail of the response time distribution. We focus on the distributed setup, where replication is achieved by running copies of requests on multiple servers that otherwise evolve independently, and waiting for the first replica to complete service. We construct models that capture the evolution of a system with replicated requests using approximate methods and observe that highly variable service times offer the best opportunities for replication - reducing the response time tail in particular. Further, the effect of replication is non-uniform over the response time distribution: gains in one metric, e.g., the mean, can be at the cost of another, e.g., the tail percentiles. This is demonstrated in a wide range of numerical virtual experiments. It can be seen that capturing service time variability is key to the evaluation of latency tolerance strategies and in their design.",
author = "Qiu, Zhan and P{\'e}rez, {Juan F.} and Harrison, {Peter G.}",
year = "2016",
month = "7",
day = "27",
doi = "10.1109/INFOCOM.2016.7524365",
language = "English (US)",
volume = "2016-July",
booktitle = "IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",
}

Qiu, Z, Pérez, JF & Harrison, PG 2016, Variability-aware request replication for latency curtailment. In IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications. vol. 2016-July, 7524365, Institute of Electrical and Electronics Engineers Inc., San Francisco, 4/10/16. https://doi.org/10.1109/INFOCOM.2016.7524365

Variability-aware request replication for latency curtailment. / Qiu, Zhan; Pérez, Juan F.; Harrison, Peter G.

IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications. Vol. 2016-July Institute of Electrical and Electronics Engineers Inc., 2016. 7524365.


TY - GEN

T1 - Variability-aware request replication for latency curtailment

AU - Qiu, Zhan

AU - Pérez, Juan F.

AU - Harrison, Peter G.

PY - 2016/7/27

Y1 - 2016/7/27

N2 - Processing time variability is commonplace in distributed systems, where resources display disparate performance due to, e.g., different workload levels, background processes, and contention in virtualized environments. However, it is paramount for service providers to keep variability in response time under control in order to offer responsive services. We investigate how request replication can be used to exploit processing time variability to reduce response times, considering not only mean values but also the tail of the response time distribution. We focus on the distributed setup, where replication is achieved by running copies of requests on multiple servers that otherwise evolve independently, and waiting for the first replica to complete service. We construct models that capture the evolution of a system with replicated requests using approximate methods and observe that highly variable service times offer the best opportunities for replication - reducing the response time tail in particular. Further, the effect of replication is non-uniform over the response time distribution: gains in one metric, e.g., the mean, can be at the cost of another, e.g., the tail percentiles. This is demonstrated in a wide range of numerical virtual experiments. It can be seen that capturing service time variability is key to the evaluation of latency tolerance strategies and in their design.

AB - Processing time variability is commonplace in distributed systems, where resources display disparate performance due to, e.g., different workload levels, background processes, and contention in virtualized environments. However, it is paramount for service providers to keep variability in response time under control in order to offer responsive services. We investigate how request replication can be used to exploit processing time variability to reduce response times, considering not only mean values but also the tail of the response time distribution. We focus on the distributed setup, where replication is achieved by running copies of requests on multiple servers that otherwise evolve independently, and waiting for the first replica to complete service. We construct models that capture the evolution of a system with replicated requests using approximate methods and observe that highly variable service times offer the best opportunities for replication - reducing the response time tail in particular. Further, the effect of replication is non-uniform over the response time distribution: gains in one metric, e.g., the mean, can be at the cost of another, e.g., the tail percentiles. This is demonstrated in a wide range of numerical virtual experiments. It can be seen that capturing service time variability is key to the evaluation of latency tolerance strategies and in their design.

UR - http://www.scopus.com/inward/record.url?scp=84983362827&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84983362827&partnerID=8YFLogxK

U2 - 10.1109/INFOCOM.2016.7524365

DO - 10.1109/INFOCOM.2016.7524365

M3 - Conference contribution

VL - 2016-July

BT - IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Qiu Z, Pérez JF, Harrison PG. Variability-aware request replication for latency curtailment. In IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications. Vol. 2016-July. Institute of Electrical and Electronics Engineers Inc. 2016. 7524365 https://doi.org/10.1109/INFOCOM.2016.7524365