Delay and Stability Tradeoffs in Distributed Service Systems with Redundancy

Wednesday, November 28, 2018 - 3:00pm to Thursday, November 29, 2018 - 3:55pm

Event Calendar Category

LIDS & Stats Tea

Speaker Name

Martín Zubeldía

Affiliation

LIDS

Building and Room Number

LIDS Lounge

We consider a large distributed service system with N servers that have random service rates, where there is a stream of incoming jobs that consist of K<<N identical tasks that can be executed in parallel. We assume that the K tasks from the same job can be encoded into R>K tasks of the same size (by introducing redundancy) so that the job is considered to be completed when *any* K out of the R tasks finish their service.

From the point of view of a job, using more redundant tasks (i.e., using a larger R) can decrease the job's delay because it only has to wait for the tasks in the K fastest servers (out of the R) to finish their service. However, too many redundant tasks from all incoming jobs increase the overall congestion of the system, increasing the delay of all jobs (we can have a tragedy of the commons). Our goal is to characterize the minimum delay achievable in these systems and to design simple policies that achieve it.

Martín Zubeldía is a 5th year PhD student at LIDS, co-advised by Profs. John N. Tsitsiklis and David Gamarnik. His research interests lie in the field of applied probability theory, seeking to understand fundamental properties and design principles of large-scale stochastic systems, with applications in computer networks and other service systems.