Cutting Latency Tail: Analyzing and Validating Replication without Canceling

Zhan Qiu; Juan F. Perez; Robert Birke; Lydia Chen; Peter G. Harrison

doi:10.1109/TPDS.2017.2706268

Translated title of the contribution (Spanish): Latencia de corte de la cola: Análisis y validación de la reproducción sin cancelar

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)

Abstract

Response time variability in software applications can severely degrade the quality of the user experience. To reduce this variability, request replication emerges as an effective solution by spawning multiple copies of each request and using the result of the first one to complete. Most previous studies have mainly focused on the mean latency for systems implementing replica cancellation, i.e., all replicas of a request are canceled once the first one finishes. Instead, we develop models to obtain the response-time distribution for systems where replica cancellation may be too expensive or infeasible to implement, as in "fast" systems, such as web services, or in legacy systems. Furthermore, we introduce a novel service model to explicitly consider correlation in the processing times of the request replicas, and design an efficient algorithm to parameterize the model from real data. Extensive evaluations on a MATLAB benchmark and a three-tier web application (MediaWiki) show remarkable accuracy, e.g., 7 percent (4 percent) average error on the 99th percentile response time for the benchmark (respectively, MediaWiki), the requests of which execute in the order of seconds (respectively, milliseconds). Insights into optimal replication levels are thereby gained from this precise quantitative analysis, under a wide variety of system scenarios.
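As a rough illustration of the mechanism the abstract describes, the following minimal Python sketch simulates replication without canceling: every request is copied to several servers, the first copy to finish determines the response time, and the remaining copies still run to completion and occupy their servers. It is not the authors' model. The function names, the Poisson arrivals, the first-come-first-served servers, and the independent exponential service times are all assumptions made for this example; the paper itself models correlated replica processing times.

```python
import random


def simulate(num_servers, replicas, arrival_rate, mean_service,
             num_requests, seed=0):
    """Simulate FCFS servers with request replication and no cancellation.

    Assumptions (illustrative only): Poisson arrivals, independent
    exponential service times, replicas sent to distinct random servers.
    Requires replicas <= num_servers.
    """
    rng = random.Random(seed)
    free_at = [0.0] * num_servers  # time at which each server clears its queue
    now = 0.0
    response_times = []
    for _ in range(num_requests):
        now += rng.expovariate(arrival_rate)  # next arrival instant
        finishes = []
        for server in rng.sample(range(num_servers), replicas):
            start = max(now, free_at[server])  # wait behind queued work
            free_at[server] = start + rng.expovariate(1.0 / mean_service)
            finishes.append(free_at[server])
        # The first replica to finish answers the request; the others are
        # NOT canceled, so their work stays accounted for in free_at.
        response_times.append(min(finishes) - now)
    return response_times


def percentile(values, q):
    """Simple rank-based percentile for 0 <= q <= 1."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    for d in (1, 2, 3):
        times = simulate(num_servers=10, replicas=d, arrival_rate=2.0,
                         mean_service=1.0, num_requests=50000)
        print(d, percentile(times, 0.99))
```

Running the script prints the simulated 99th-percentile response time for one, two, and three replicas at a single arbitrarily chosen load. Because uncanceled replicas add load to every server they visit, which replication level gives the shortest tail depends on the arrival rate and the service-time distribution, which is the kind of trade-off the paper analyzes quantitatively.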

Original language	English (US)
Article number	7932099
Pages (from-to)	3128-3141
Number of pages	14
Journal	IEEE Transactions on Parallel and Distributed Systems
Volume	28
Issue number	11
DOI	https://doi.org/10.1109/TPDS.2017.2706268
Status	Published - Nov 1 2017
Externally published	Yes

ASJC Scopus subject areas

Signal Processing
Hardware and Architecture
Computational Theory and Mathematics

Cite this

@article{d704b62d690b4ad9b85bff436aa24344,

title = "Cutting Latency Tail: Analyzing and Validating Replication without Canceling",

abstract = "Response time variability in software applications can severely degrade the quality of the user experience. To reduce this variability, request replication emerges as an effective solution by spawning multiple copies of each request and using the result of the first one to complete. Most previous studies have mainly focused on the mean latency for systems implementing replica cancellation, i.e., all replicas of a request are canceled once the first one finishes. Instead, we develop models to obtain the response-time distribution for systems where replica cancellation may be too expensive or infeasible to implement, as in 'fast' systems, such as web services, or in legacy systems. Furthermore, we introduce a novel service model to explicitly consider correlation in the processing times of the request replicas, and design an efficient algorithm to parameterize the model from real data. Extensive evaluations on a MATLAB benchmark and a three-tier web application (MediaWiki) show remarkable accuracy, e.g., 7 percent (4 percent) average error on the 99th percentile response time for the benchmark (respectively, MediaWiki), the requests of which execute in the order of seconds (respectively, milliseconds). Insights into optimal replication levels are thereby gained from this precise quantitative analysis, under a wide variety of system scenarios.",

author = "Zhan Qiu and Perez, {Juan F.} and Robert Birke and Lydia Chen and Harrison, {Peter G.}",

year = "2017",

month = nov,

day = "1",

doi = "10.1109/TPDS.2017.2706268",

language = "English (US)",

volume = "28",

pages = "3128--3141",

journal = "IEEE Transactions on Parallel and Distributed Systems",

issn = "1045-9219",

publisher = "IEEE Computer Society",

number = "11",

}

TY - JOUR

T1 - Cutting Latency Tail

T2 - Analyzing and Validating Replication without Canceling

AU - Qiu, Zhan

AU - Perez, Juan F.

AU - Birke, Robert

AU - Chen, Lydia

AU - Harrison, Peter G.

PY - 2017/11/1

Y1 - 2017/11/1

AB - Response time variability in software applications can severely degrade the quality of the user experience. To reduce this variability, request replication emerges as an effective solution by spawning multiple copies of each request and using the result of the first one to complete. Most previous studies have mainly focused on the mean latency for systems implementing replica cancellation, i.e., all replicas of a request are canceled once the first one finishes. Instead, we develop models to obtain the response-time distribution for systems where replica cancellation may be too expensive or infeasible to implement, as in 'fast' systems, such as web services, or in legacy systems. Furthermore, we introduce a novel service model to explicitly consider correlation in the processing times of the request replicas, and design an efficient algorithm to parameterize the model from real data. Extensive evaluations on a MATLAB benchmark and a three-tier web application (MediaWiki) show remarkable accuracy, e.g., 7 percent (4 percent) average error on the 99th percentile response time for the benchmark (respectively, MediaWiki), the requests of which execute in the order of seconds (respectively, milliseconds). Insights into optimal replication levels are thereby gained from this precise quantitative analysis, under a wide variety of system scenarios.

UR - http://www.scopus.com/inward/record.url?scp=85032457020&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85032457020&partnerID=8YFLogxK

U2 - 10.1109/TPDS.2017.2706268

DO - 10.1109/TPDS.2017.2706268

M3 - Article

AN - SCOPUS:85032457020

SN - 1045-9219

VL - 28

SP - 3128

EP - 3141

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

IS - 11

M1 - 7932099

ER -
