SmallTail: Scaling cores and probabilistic cloning requests for web systems

Ewnetu Bayuh Lakew; Robert Birke; Juan F. Pérez; Erik Elmroth; Lydia Y. Chen

doi:10.1109/ICAC.2018.00013

Research output: Chapter in Book/Report › Conference contribution

2 Scopus citations

Abstract

Users' quality of experience on web systems is largely determined by tail latency, e.g., the 95th percentile. Scaling resources, e.g., the number of virtual cores per VM, has been shown to be effective at meeting average latency targets but falls short in taming the latency tail in the cloud, where performance variability is higher. Prior work highlights increasing request redundancy as a way to curtail latency, but either in an offline setting or without scaling in the cores of virtual machines. In this paper, we propose an opportunistic scaler, termed SmallTail, which aims to achieve stringent tail latency targets while provisioning a minimum amount of resources and keeping them well utilized. Under dynamic workloads, SmallTail simultaneously adjusts the core provisioning per VM and probabilistically replicates requests so as to achieve the tail latency target. The core of SmallTail is a two-level controller, where the outer loop controls the core provisioning of the distributed VMs and the inner loop controls the clones at a finer granularity. We also provide a theoretical analysis of the steady-state latency for a given probabilistic replication that clones one out of N arriving requests. We extensively evaluate SmallTail on three different web systems, namely web commerce, web search, and a web bulletin board. Our testbed results show that SmallTail can keep the 95th percentile latency below 1000 ms using up to 53% fewer cores than the strategy of constant cloning, whereas a core-scaling-only solution exceeds the latency target by up to 70%.
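
The effect of probabilistic cloning on the latency tail can be illustrated with a toy simulation. The sketch below is not the authors' controller or their steady-state analysis. It assumes exponentially distributed latencies with a 300 ms mean, models "one out of N" as cloning each request independently with probability 1/N, and ignores queueing and the extra load that clones create. The function name `simulate_p95` and all parameter values are hypothetical.

```python
import random


def simulate_p95(clone_every_n, num_requests=20000, mean_ms=300.0, seed=0):
    """Empirical 95th-percentile latency (ms) under probabilistic cloning.

    Assumption: each request is cloned with probability 1/clone_every_n and
    completes when the faster of the two copies finishes. Passing None
    disables cloning. Load added by clones is deliberately ignored.
    """
    rng = random.Random(seed)
    clone_prob = 0.0 if clone_every_n is None else 1.0 / clone_every_n
    latencies = []
    for _ in range(num_requests):
        latency = rng.expovariate(1.0 / mean_ms)
        if rng.random() < clone_prob:
            # The request finishes when the first copy returns.
            latency = min(latency, rng.expovariate(1.0 / mean_ms))
        latencies.append(latency)
    latencies.sort()
    return latencies[int(0.95 * len(latencies)) - 1]


if __name__ == "__main__":
    for n in (None, 4, 2, 1):
        label = "no cloning" if n is None else f"clone 1 in {n}"
        print(f"{label:>14}: p95 = {simulate_p95(n):.0f} ms")
```

Under these assumptions the 95th percentile shrinks as more requests are cloned. In a real system each clone also consumes capacity, which is the trade-off that motivates adjusting cloning together with core provisioning.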

Original language	English (US)
Title of host publication	Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	31-40
Number of pages	10
ISBN (Electronic)	9781538651391
DOIs	https://doi.org/10.1109/ICAC.2018.00013
State	Published - Oct 18 2018
Event	15th IEEE International Conference on Autonomic Computing, ICAC 2018 - Trento, Italy Duration: Sep 3 2018 → Sep 7 2018

Publication series

Name	Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018

Conference

Conference	15th IEEE International Conference on Autonomic Computing, ICAC 2018
Country/Territory	Italy
City	Trento
Period	9/3/18 → 9/7/18

All Science Journal Classification (ASJC) codes

Computer Networks and Communications
Computer Science Applications
Safety, Risk, Reliability and Quality
Control and Optimization

Access to Document

10.1109/ICAC.2018.00013

Cite this

Lakew, E. B., Birke, R., Pérez, J. F., Elmroth, E., & Chen, L. Y. (2018). SmallTail: Scaling cores and probabilistic cloning requests for web systems. In Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018 (pp. 31-40). Article 8498124 (Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICAC.2018.00013

Lakew, Ewnetu Bayuh ; Birke, Robert ; Pérez, Juan F. et al. / SmallTail : Scaling cores and probabilistic cloning requests for web systems. Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 31-40 (Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018).

@inproceedings{f68444786b13405c9adc1d52866d21c1,

title = "SmallTail: Scaling cores and probabilistic cloning requests for web systems",

abstract = "Users' quality of experience on web systems is largely determined by tail latency, e.g., the 95th percentile. Scaling resources, e.g., the number of virtual cores per VM, has been shown to be effective at meeting average latency targets but falls short in taming the latency tail in the cloud, where performance variability is higher. Prior work highlights increasing request redundancy as a way to curtail latency, but either in an offline setting or without scaling in the cores of virtual machines. In this paper, we propose an opportunistic scaler, termed SmallTail, which aims to achieve stringent tail latency targets while provisioning a minimum amount of resources and keeping them well utilized. Under dynamic workloads, SmallTail simultaneously adjusts the core provisioning per VM and probabilistically replicates requests so as to achieve the tail latency target. The core of SmallTail is a two-level controller, where the outer loop controls the core provisioning of the distributed VMs and the inner loop controls the clones at a finer granularity. We also provide a theoretical analysis of the steady-state latency for a given probabilistic replication that clones one out of N arriving requests. We extensively evaluate SmallTail on three different web systems, namely web commerce, web search, and a web bulletin board. Our testbed results show that SmallTail can keep the 95th percentile latency below 1000 ms using up to 53% fewer cores than the strategy of constant cloning, whereas a core-scaling-only solution exceeds the latency target by up to 70%.",

author = "Lakew, {Ewnetu Bayuh} and Robert Birke and P{\'e}rez, {Juan F.} and Erik Elmroth and Chen, {Lydia Y.}",

year = "2018",

month = oct,

day = "18",

doi = "10.1109/ICAC.2018.00013",

language = "English (US)",

series = "Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "31--40",

booktitle = "Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018",

address = "United States",

note = "15th IEEE International Conference on Autonomic Computing, ICAC 2018 ; Conference date: 03-09-2018 Through 07-09-2018",

}

Lakew, EB, Birke, R, Pérez, JF, Elmroth, E & Chen, LY 2018, SmallTail: Scaling cores and probabilistic cloning requests for web systems. in Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018., 8498124, Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018, Institute of Electrical and Electronics Engineers Inc., pp. 31-40, 15th IEEE International Conference on Autonomic Computing, ICAC 2018, Trento, Italy, 9/3/18. https://doi.org/10.1109/ICAC.2018.00013

SmallTail: Scaling cores and probabilistic cloning requests for web systems. / Lakew, Ewnetu Bayuh; Birke, Robert; Pérez, Juan F. et al.
Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018. Institute of Electrical and Electronics Engineers Inc., 2018. p. 31-40 8498124 (Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018).

TY - GEN

T1 - SmallTail

T2 - 15th IEEE International Conference on Autonomic Computing, ICAC 2018

AU - Lakew, Ewnetu Bayuh

AU - Birke, Robert

AU - Pérez, Juan F.

AU - Elmroth, Erik

AU - Chen, Lydia Y.

PY - 2018/10/18

Y1 - 2018/10/18

AB - Users' quality of experience on web systems is largely determined by tail latency, e.g., the 95th percentile. Scaling resources, e.g., the number of virtual cores per VM, has been shown to be effective at meeting average latency targets but falls short in taming the latency tail in the cloud, where performance variability is higher. Prior work highlights increasing request redundancy as a way to curtail latency, but either in an offline setting or without scaling in the cores of virtual machines. In this paper, we propose an opportunistic scaler, termed SmallTail, which aims to achieve stringent tail latency targets while provisioning a minimum amount of resources and keeping them well utilized. Under dynamic workloads, SmallTail simultaneously adjusts the core provisioning per VM and probabilistically replicates requests so as to achieve the tail latency target. The core of SmallTail is a two-level controller, where the outer loop controls the core provisioning of the distributed VMs and the inner loop controls the clones at a finer granularity. We also provide a theoretical analysis of the steady-state latency for a given probabilistic replication that clones one out of N arriving requests. We extensively evaluate SmallTail on three different web systems, namely web commerce, web search, and a web bulletin board. Our testbed results show that SmallTail can keep the 95th percentile latency below 1000 ms using up to 53% fewer cores than the strategy of constant cloning, whereas a core-scaling-only solution exceeds the latency target by up to 70%.

UR - http://www.scopus.com/inward/record.url?scp=85061316455&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061316455&partnerID=8YFLogxK

U2 - 10.1109/ICAC.2018.00013

DO - 10.1109/ICAC.2018.00013

M3 - Conference contribution

AN - SCOPUS:85061316455

T3 - Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018

SP - 31

EP - 40

BT - Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 3 September 2018 through 7 September 2018

ER -

Lakew EB, Birke R, Pérez JF, Elmroth E, Chen LY. SmallTail: Scaling cores and probabilistic cloning requests for web systems. In Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018. Institute of Electrical and Electronics Engineers Inc. 2018. p. 31-40. 8498124. (Proceedings - 15th IEEE International Conference on Autonomic Computing, ICAC 2018). doi: 10.1109/ICAC.2018.00013
