Tackling latency via replication in distributed systems

Translated title of the contribution: Tratamiento de la latencia mediante replicación en sistemas distribuidos

Zhan Qiu, Juan F. Pérez, Peter G. Harrison

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

6 Citations (Scopus)

Abstract

Consistently high reliability and low latency are twin requirements common to many forms of distributed processing; for example, server farms and mirrored storage access. To address them, we consider replication of requests with canceling – i.e. initiate multiple concurrent replicas of a request and use the first successful result returned, canceling all outstanding replicas. This scheme has been studied recently, but mostly for systems with a single central queue, while server farms exploit distributed resources for scalability and robustness. We develop an approximate stochastic model to determine the response time distribution in a system with distributed queues, and compare its performance against its centralized counterpart. Validation against simulation indicates that our model is accurate for not only the mean response time but also its quantiles, which are particularly relevant for deadline-driven applications. Further, we show that in the distributed setup, replication with canceling has the potential to reduce response times, even at relatively high utilization. We also find that it offers response times close to those of the centralized system, especially at medium-to-high request reliability. These findings support the use of replication with canceling as an effective mechanism for both fault- and delay-tolerance.
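
The replication-with-canceling scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' model or implementation; it only shows the request-level behavior (launch several replicas, take the first successful result, cancel the rest). The `replica` coroutine, its `delay` and `fails` parameters, and the function name `replicate_with_canceling` are all hypothetical, introduced here for illustration.

```python
import asyncio


async def replica(server_id: int, delay: float, fails: bool) -> str:
    """Hypothetical replica: waits `delay` seconds, then succeeds or fails."""
    await asyncio.sleep(delay)
    if fails:
        raise RuntimeError(f"replica on server {server_id} failed")
    return f"result from server {server_id}"


async def replicate_with_canceling(coros):
    """Run all replicas concurrently, return the first successful result,
    and cancel every replica that is still outstanding."""
    pending = {asyncio.create_task(c) for c in coros}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            # Inspect every finished replica; failed ones are simply skipped.
            winners = [t for t in done if t.exception() is None]
            if winners:
                return winners[0].result()
        raise RuntimeError("all replicas failed")
    finally:
        # Cancel whatever is still running and wait for the cancellations.
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)


async def main() -> str:
    replicas = [
        replica(0, delay=0.05, fails=True),   # fastest, but unsuccessful
        replica(1, delay=0.10, fails=False),  # first successful result
        replica(2, delay=0.50, fails=False),  # still running, gets canceled
    ]
    return await replicate_with_canceling(replicas)


if __name__ == "__main__":
    print(asyncio.run(main()))  # prints "result from server 1"
```

In this sketch a failed replica does not end the request; the caller keeps waiting until some replica succeeds or all have failed, which is one plausible reading of "the first successful result returned".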
Original language: American English
Title of host publication: ICPE 2016 - Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering
Publisher: Association for Computing Machinery
Pages: 197-208
Number of pages: 12
ISBN (electronic): 9781450340809
State: Published - Mar 12 2016
Externally published
Event: 7th ACM/SPEC International Conference on Performance Engineering, ICPE 2016 - Delft, Netherlands
Duration: Mar 12 2016 to Mar 16 2016

Conference

Conference: 7th ACM/SPEC International Conference on Performance Engineering, ICPE 2016
Country/Territory: Netherlands
City: Delft
Period: 3/12/16 to 3/16/16

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Hardware and Architecture
