
FlowTime: Dynamic Scheduling of Deadline-Aware Workflows and Ad-hoc Jobs

Zhiming Hu, Baochun Li

Department of Electrical and Computer Engineering, University of Toronto
Email: zhiming@ece.utoronto.ca, bli@ece.toronto.edu

Chen Chen, Xiaodi Ke

Huawei Canada Research Center, Toronto, Canada
Email: {chen.cc, xiaodi.ke}@huawei.com

Abstract—With rapidly increasing volumes of data to be processed in modern data analytics, it is commonplace to run multiple data processing jobs with inter-job dependencies in a datacenter cluster, typically as recurring data processing workloads. Such a group of inter-dependent data analytic jobs is referred to as a workflow, and may have a deadline due to its mission-critical nature. In contrast, non-recurring ad-hoc jobs are typically best-effort in nature, and rather than meeting deadlines, it is desirable to minimize their average job turnaround time. The state-of-the-art scheduling mechanisms focused on meeting deadlines for individual jobs only, and are oblivious to workflow deadlines. In this paper, we present FlowTime, a new system framework designed to make scheduling decisions for workflows so that their deadlines are met, while simultaneously optimizing the performance of ad-hoc jobs. To achieve this objective, we first adopt a divide-and-conquer strategy to transform the problem of workflow scheduling to a deadline-aware job scheduling problem, and then design an efficient algorithm that tackles the scheduling problem with both deadline-aware jobs and ad-hoc jobs by solving its corresponding optimization problem directly using a linear program solver. Our experimental results have clearly demonstrated that FlowTime achieves the lowest deadline-miss rates for deadline-aware workflows and 2-10 times shorter average job turnaround time, as compared to the state-of-the-art scheduling algorithms.

Index Terms—workflow scheduling, big data processing, deadline-aware scheduling

I. INTRODUCTION

Due to their growing complexities, modern commercial applications are commonly represented as a group of inter-dependent Hadoop [1] or Spark [2] jobs [3]. Such a group of inter-dependent jobs is referred to as a workflow, and may be associated with a deadline due to the mission-critical nature of these commercial applications. These deadline-aware workflows are typically recurring, running on a daily, weekly or monthly basis. As a consequence, we have rather complete knowledge of each workflow, including its directed acyclic graph (DAG) that represents inter-job dependencies, the resource demand for each job in the workflow, as well as the estimated running time of tasks in each job. Such information will be instrumental for designing new workflow-aware scheduling algorithms that seek to meet workflow deadlines.

It is common for these mission-critical workflows to share the datacenter cluster with ad-hoc jobs, which are best-effort and non-recurring in nature [4], [5], [6], without any a priori knowledge of resource demands or running time estimates. We still wish to minimize their job turnaround time, defined as the time of completion minus the time of submission, while we meet deadlines for recurring workflows.

Existing work in the literature does not consider dependencies across jobs [4], [5], [6] or the performance of ad-hoc jobs [3]. Rayon [4], for example, assumed that the deadline for each job is known, which is not the case when deadlines are associated with workflows rather than individual jobs. To resolve this issue, Morpheus [5] proposed to infer the deadlines of jobs from prior runs of workflows. However, it does not utilize global information about the entire workflow, such as how jobs depend upon each other. Li et al. [3] ignored ad-hoc jobs, which can be severely delayed by deadline-aware workflows.

In this paper, we argue that deadline-aware workflows and latency-sensitive ad-hoc jobs should be jointly optimized. We present the design and implementation of a new system framework, FlowTime, to meet as many deadlines for deadline-aware workflows as possible, and to optimize the average job turnaround time of ad-hoc jobs at the same time. To achieve this objective, FlowTime first decomposes the deadlines of workflows into estimated deadlines of their constituent jobs, based on the directed acyclic graph (DAG) within each workflow that represents inter-job dependencies.

After deadlines for individual jobs in a workflow have been estimated, FlowTime is designed to solve an optimization problem that is specifically formulated to meet workflow deadlines and optimize the average job turnaround time at the same time. Just like typical optimization problems related to resource scheduling [4], [6], our optimization problem is in the category of integer programming problems. The highlight of FlowTime is that our specific problem is formulated in such a way that it can be directly solved using a linear program (LP) solver, which we are able to prove theoretically. This way, FlowTime schedules resources in a theoretically sound and optimal fashion, a property that traditional resource scheduling heuristics [4], [5] may not enjoy.

To demonstrate its performance and resource efficiency, we have deployed a real-world implementation of FlowTime in YARN, and conducted extensive experiments with standard benchmarks and trace-driven simulations. Our experimental results have clearly shown that FlowTime achieves the lowest deadline miss rates while reducing the average job turnaround time of ad-hoc jobs by 2-10 times at the same time.

II. BACKGROUND AND MOTIVATION

In this section, we briefly describe the system model and the motivation behind our system.

Fig. 1. A motivating example. (a) Earliest Deadline First (EDF); (b) our approach.

A. System Model

We consider two kinds of workloads: deadline-aware workflows and latency-sensitive ad-hoc jobs. The deadline-aware workflows can be represented as W = {W_1, W_2, ..., W_i, ..., W_l}, where l is the total number of workflows. Each workflow has its own starting time and deadline. We use ws_i and wd_i to represent the starting time and deadline of the i-th workflow. Note that workflows are recurring; therefore, we know the whole directed acyclic graph (DAG) of the workflow and all the job information in the DAG within a specified time window. Without loss of generality, the i-th workflow has m interdependent jobs, which can be represented as Q_i = {u_1, ..., u_m}. All the jobs that depend on the j-th job in the i-th workflow can be denoted as the set P_i^j. The dependencies among jobs in the i-th workflow can thus be represented as P_i = {P_i^1, ..., P_i^m}. Therefore, each workflow can be denoted as W_i = {Q_i, ws_i, wd_i, P_i}.

Besides the deadline-aware workflows, latency-sensitive ad-hoc jobs also share the cluster [5]. These ad-hoc jobs can be submitted to the system at any time. More importantly, they do not have deadlines; instead, minimizing the average job turnaround time, calculated as the time of completion minus the time of submission, is their ultimate goal. As the ad-hoc jobs are new to the system, their job size information is not available at the time of scheduling. Therefore, the optimization goal of FlowTime is to meet as many deadlines of workflows as possible while minimizing the average job turnaround time of the ad-hoc jobs at the same time.
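For concreteness, the workflow representation above can be captured with a small data structure like the following sketch (our own illustration; field names such as tasks, task_runtime and demand are assumptions, not part of the paper's implementation):

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Job:
    name: str
    tasks: int                   # number of tasks in the job
    task_runtime: float          # estimated per-task running time (seconds)
    demand: Dict[str, float]     # per-task resource demand, e.g. {"cpu": 2, "mem": 4096}

@dataclass
class Workflow:
    jobs: List[Job]                   # Q_i = {u_1, ..., u_m}
    start: float                      # ws_i, the starting time
    deadline: float                   # wd_i, the deadline
    children: Dict[str, List[str]]    # P_i: for each job, the jobs that depend on it

# A toy two-job workflow with a 24-hour deadline.
wf = Workflow(
    jobs=[Job("j1", 10, 30.0, {"cpu": 1, "mem": 2048}),
          Job("j2", 20, 60.0, {"cpu": 2, "mem": 4096})],
    start=0.0,
    deadline=24 * 3600.0,
    children={"j1": ["j2"], "j2": []},
)
```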

B. Motivation

Earliest deadline first (EDF) is a naive approach to the problem. However, this approach may block the ad-hoc jobs as long as there are deadline-aware workflows in the cluster, even though their deadlines may be far away. As a result, it will incur a high job turnaround time for ad-hoc jobs. To better illustrate this case, we show a motivating example in Fig. 1. In this example, there is one workflow W1, which consists of two interdependent jobs. Along with the workflow, there are two ad-hoc jobs A1 and A2. The arrival times of W1,

A1 and A2 are 0, 0, and 100, respectively. The deadline of W1 is 200. With the earliest deadline first (EDF) approach, W1 will be scheduled first, followed by A1 and A2. As we can see in Fig. 1(a), A1 is delayed by 100 time units before it can start. The situation could be even worse if there are more deadline-aware workflows in the cluster.

The key reason for the inferior performance of EDF is that it tries to complete the workflows as soon as possible even though their deadlines are quite loose, which is very common based on our study of the traces. For instance, in one of our traces, the deadline for the workflow is 24 hours, set by its business logic, yet the workflow can complete in only around 2 hours.

To avoid this disadvantage, we schedule the deadline-aware workflows while minimally impacting the performance of ad-hoc jobs. In other words, we schedule the deadline-aware workflows to meet their deadlines while minimizing the maximum resource consumption in the cluster across time. After that, the remaining resources can be used by the ad-hoc jobs that may arrive at any time. The scheduling results of our approach are shown in Fig. 1(b). As we can see, we can meet the deadline of W1 and leave the remaining resources to ad-hoc jobs, so that the ad-hoc jobs can also be scheduled as early as possible. The average job turnaround time of the ad-hoc jobs is reduced from 150 (= (200+100)/2) to 100 (= (100+100)/2).

III. SYSTEM DESIGN

In this section, we present an overview of the design of FlowTime.

A. Desired Features

Before we introduce the design, we first describe the desired properties of our scheduling system, which are considered throughout our design.

Scheduling quality: The quality of the scheduling is always our top concern. In our case, quality means that more deadlines of workflows are met or a lower average job turnaround time of ad-hoc jobs is achieved.

Scheduling efficiency: Besides the scheduling quality, scheduling efficiency is another critical factor to be considered. As the job scheduling system runs in a dynamic environment, an ideal scheduler should be able to react to task/job completion events efficiently. In this sense, our scheduler should be efficient, scale with the number of jobs in the system, and deliver the scheduling results within a limited time.

Robustness to estimation errors: Even though the deadline-aware workflows (jobs) are recurring, the input data or the code may have changed between different runs of the same jobs, which leads to estimation errors, as the estimates are based on data from prior runs of the jobs. Both underestimation and overestimation are possible in practice. The scheduling system should be able to handle both types of estimation errors.

Fig. 3. The state of the art [7] and our approach. (a) State-of-the-art approach; (b) our approach.

together. As a result, we can group the nodes that do not have dependencies among them without violating the dependencies. We adopt Kahn's algorithm [8] to obtain the topological order. For instance, the original output for the topological order of the nodes in the DAG in Fig. 3 is {1, 2, ..., n, n+1}, while the output of our algorithm will be {1, {2, 3, ..., n}, n+1}.
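A minimal sketch of this grouped ordering, using the level-by-level variant of Kahn's algorithm (our own illustration of the idea; the paper's exact implementation may differ):

```python
def grouped_topological_order(jobs, deps):
    """Group the jobs of a workflow DAG into node sets: each set contains
    jobs whose predecessors all appear in earlier sets, so the jobs within
    a set have no dependencies among them. deps maps a job to the jobs
    that depend on it."""
    indegree = {j: 0 for j in jobs}
    for u in jobs:
        for v in deps.get(u, []):
            indegree[v] += 1
    frontier = [j for j in jobs if indegree[j] == 0]
    levels = []
    while frontier:
        levels.append(frontier)
        nxt = []
        for u in frontier:
            for v in deps.get(u, []):
                indegree[v] -= 1
                if indegree[v] == 0:
                    nxt.append(v)
        frontier = nxt
    return levels

# The DAG of Fig. 3 with n = 5: job 1 precedes jobs 2..n, which precede job n+1.
n = 5
deps = {1: list(range(2, n + 1))}
deps.update({j: [n + 1] for j in range(2, n + 1)})
print(grouped_topological_order(list(range(1, n + 2)), deps))
# -> [[1], [2, 3, 4, 5], [6]], i.e. {1, {2, ..., n}, n+1}
```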

B. Resource Demand Based Deadline Decomposition

After we have a sequence of node sets, what we need to decide are the deadlines for all the node sets. First of all, we need to guarantee the minimum runtime for each node set, which is determined by the largest minimum runtime among all the jobs in the same node set. Therefore, we first calculate the minimum runtime for all the node sets and allocate that minimum duration to those sets. But the question is, how should we allocate the remaining time to those node sets^1?

A naive approach could be to assign the remaining time to the node sets in proportion to their minimum runtimes. However, this approach does not consider the amount of resources that are needed by each node set. Therefore, if there are too many parallel jobs in a node set assigned the same deadline, they may not be able to meet their deadlines, as cluster resources are limited. Instead, we argue that the deadlines distributed to different node sets should take the resource demands of the whole set into consideration. To this end, we propose to allocate the remaining time to the node sets based on the total resource demand of each

Footnote 1: In some cases, the remaining time is negative. In that case, we use the critical-path-based approach in [7] to decompose the deadlines of workflows instead.

TABLE I
NOTATIONS USED IN THIS PAPER.

Symbol    Meaning
x_it^r    The amount of type-r resource allocated to job i at time slot t
z_t^r     The total amount of type-r resource allocated at time slot t
C_t^r     The total amount of type-r resource at time slot t in the cluster
a_i       The arrival time of job i
d_i       The deadline of job i
s_i^r     The amount of type-r resource needed by job i

node set, where the resource demands are calculated according to the number of tasks, the task running time and the resource requirement of each task.

We show a simple example in Fig. 3. In this figure, there are (n+1) jobs in total and we assume the starting time of job 1 is 0. In this case, we also assume that all these jobs have the same running time and the same resource demands. As the jobs from job 2 to job n are parallel jobs, they will be assigned the same arrival time and deadline. The traditional approach [7] first finds a critical path based on the running time of the jobs in the graph and decomposes the deadline based on the runtime of the jobs along the critical path. For instance, in this case, 1 -> 2 -> (n+1) is the critical path, and job 2 will get 1/3 of the total deadline, which is also the deadline for all the parallel jobs in the middle of the graph. However, the traditional approach ignores the resource demands of the jobs, which is problematic as the cluster resources are limited and there may not be enough resources to host all those parallel jobs if n is very large. For instance, if the amount of resources in the cluster cannot support all the parallel jobs from job 2 to job n within the same duration, it will be infeasible to place all those jobs in this allocated duration. Instead, in our approach, we consider both the job running time and the resource demand of the jobs in each node set when decomposing the deadlines. More specifically, we take all the (n-1) jobs in the middle into consideration at the same time, as they need more time to complete in a resource-limited cluster. In other words, we consider all the parallel jobs together when allocating the deadlines. Therefore, the deadline assigned to jobs 2 to n will be (n-1)/(n+1) of the total deadline in our approach, instead of 1/3 of the deadline in the traditional approach (a short sketch of this decomposition follows the overview below).

V. DYNAMIC SCHEDULING WITH AN LP

After the deadline decomposition, we have now transformed the problem from scheduling deadline-aware workflows to scheduling deadline-aware jobs and latency-sensitive ad-hoc jobs. In this section, we will show how we schedule these two types of jobs efficiently while achieving their separate goals.
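The following is a minimal sketch of the demand-proportional decomposition described in Sec. IV-B, under the assumption that each job exposes an estimated runtime and an aggregate resource demand (the field names are ours, not the paper's):

```python
def decompose_deadline(node_sets, workflow_start, workflow_deadline):
    """Assign a deadline to each node set of a workflow.
    node_sets: ordered list of sets; each job is a dict with an estimated
    'runtime' and an aggregate 'demand' (e.g. tasks x per-task demand x runtime)."""
    min_runtime = [max(j["runtime"] for j in s) for s in node_sets]
    remaining = (workflow_deadline - workflow_start) - sum(min_runtime)
    if remaining < 0:
        raise ValueError("no slack left: fall back to the critical-path decomposition of [7]")
    demand = [sum(j["demand"] for j in s) for s in node_sets]
    shares = [remaining * d / sum(demand) for d in demand]
    deadlines, t = [], workflow_start
    for base, extra in zip(min_runtime, shares):
        t += base + extra   # cumulative time allotted so far = this set's deadline
        deadlines.append(t)
    return deadlines
```

In the simplified Fig. 3 example, where all jobs have equal runtime and demand and the minimum-runtime reservation is negligible, a purely demand-proportional split gives the middle node set the (n-1)/(n+1) share of the total deadline quoted above.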

A. The Original Formulation

After the deadlines of jobs are decided, we are now ready to allocate the resources for the jobs, including deadline-aware jobs and ad-hoc jobs. More specifically, we propose to allocate

the resources to the deadline-aware jobs and guarantee their deadlines while minimally impacting the performance of ad-hoc jobs. We adopt a slot-based formulation^2 and the problem can be formulated as below. The meanings of the notations are shown in Table I.

$$\operatorname{lexmin}_{x,z}\ \max_{t \in T,\, r \in R}\ z_t^r / C_t^r \qquad (1)$$

s.t.

$$\sum_{t=a_i}^{d_i} x_{it}^r = s_i^r, \quad \forall i \in N, \forall r \in R \qquad (2)$$

$$\sum_{i=1}^{n} x_{it}^r = z_t^r, \quad \forall t \in T, \forall r \in R \qquad (3)$$

$$z_t^r \le C_t^r, \quad \forall t \in T, \forall r \in R \qquad (4)$$

$$x_{it}^r \in \mathbb{N}_0, \quad \forall i \in N, \forall t \in T, \forall r \in R \qquad (5)$$

This formulation only includes deadline-aware jobs, and the goal is to minimize the maximum resource usage after placing the deadline-aware jobs, so that the ad-hoc jobs can be scheduled as early as possible to reduce the average job turnaround time. Therefore, we can meet the deadlines of deadline-aware jobs and reduce the average job turnaround time of ad-hoc jobs at the same time.

In this formulation, x_it^r denotes the amount of type-r resource that will be allocated to the i-th deadline-aware job at the t-th time slot. We first accumulate the total amount of resource allocated to all the n deadline-aware jobs and normalize it by the total amount of type-r resource in the cluster at time slot t. The normalization makes different types of resources comparable. After normalization, we aim to obtain the lexicographically minimal vector over different r and t; in other words, we prefer more balanced allocations across time slots and resource types.

The first constraint in Eq. (2) states that we need to satisfy the resource requirements of each job for all resource types from its arrival time to its deadline. Here s_i^r represents the amount of type-r resource needed by the i-th job, and a_i and d_i are the arrival time and deadline of the i-th job. The second constraint in Eq. (3) states that the total amount of resource used by all the n deadline-aware jobs is denoted by z_t^r for every time slot and resource type. The third constraint in Eq. (4) means that the total amount of resources allocated to all the deadline-aware jobs should not exceed the total amount of resources available in the cluster, where C_t^r is the resource cap for the type-r resource at the t-th time slot. In our case, the resource cap can vary over time to provide more flexibility in different situations. In the last constraint, x_it^r can only take nonnegative integer values. This is because for some resources we can only use integers to represent the amount; for instance, in YARN [9], the number of CPU cores allocated to an application has to be an integer. Therefore, we use integers to represent the resource allocations. Similar settings are also used in [10].

Footnote 2: The duration of one slot is discussed in Sec. VI.
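As an illustration only, the formulation in Eqs. (1)-(5) can be written down with an off-the-shelf modeling library; the sketch below uses PuLP with made-up job data and capacities, and it simplifies the lexicographic min-max objective of Eq. (1) to a plain min-max via an auxiliary variable (the paper instead handles the exact objective through the transformation in Sec. V-B):

```python
import pulp

T, R = range(10), ["cpu", "mem"]                       # time slots and resource types
jobs = {0: {"a": 0, "d": 4, "s": {"cpu": 8, "mem": 16}},
        1: {"a": 2, "d": 9, "s": {"cpu": 6, "mem": 12}}}
C = {(t, r): 10 for t in T for r in R}                 # per-slot capacity C_t^r

prob = pulp.LpProblem("flowtime_sketch", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (jobs, T, R), lowBound=0, cat="Integer")
z = pulp.LpVariable.dicts("z", (T, R), lowBound=0)
m = pulp.LpVariable("max_normalized_usage", lowBound=0)

prob += m                                              # min-max surrogate of Eq. (1)
for i, j in jobs.items():
    for r in R:
        # Eq. (2): the demand s_i^r is met between arrival a_i and deadline d_i ...
        prob += pulp.lpSum(x[i][t][r] for t in T if j["a"] <= t <= j["d"]) == j["s"][r]
        # ... and nothing is allocated outside that window.
        prob += pulp.lpSum(x[i][t][r] for t in T if t < j["a"] or t > j["d"]) == 0
for t in T:
    for r in R:
        prob += pulp.lpSum(x[i][t][r] for i in jobs) == z[t][r]   # Eq. (3)
        prob += z[t][r] <= C[(t, r)]                              # Eq. (4)
        prob += z[t][r] <= m * C[(t, r)]                          # z_t^r / C_t^r <= m

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], pulp.value(m))
```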

As we can see, the original problem can be formulated as an integer linear programming (ILP) problem, which normally cannot be solved efficiently. Fortunately, we find that we can transform the original ILP problem and reduce it to a linear programming (LP) problem, as shown in the following section.

B. The Equivalent LP

The class of integer linear programming (ILP) problems that can be transformed into linear programming problems should meet two conditions [11]. First, the objective function should be a separable convex objective function, which means that the objective function is separable and each part is convex. The other condition is that the constraint matrix formed by the coefficients of the constraints should be a totally unimodular matrix.

Therefore, we first prove that the lexicographically minimal vector can be achieved by minimizing a scalar instead, in Lemma 1, where u ⪯ v means that vector u is lexicographically no greater than v. We use

$$g(u) = \sum_{i=1}^{k} k^{u_i}, \qquad (6)$$

to transform the vector into a scalar, where k is the dimension of the vector; therefore, k = |T||R| in our case.

Lemma 1. For u, v ∈ Z^k, g(u) ≤ g(v) ⟺ u ⪯ v.

Given Lemma 1, we can transform the original objective function into the new objective function shown below. We can clearly see that the new objective function can be separated into multiple convex functions, which meets the first condition.

$$\operatorname{lexmin}_{x,z}\ \max_{t,r}\ z_t^r / C_t^r \;=\; \min \sum_{t \in T} \sum_{r \in R} k^{\,z_t^r / C_t^r} \qquad (7)$$
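As a quick sanity check of the intuition behind Lemma 1 (this brute-force test is ours, not the paper's proof, and it only covers vectors of nonnegative integers, whereas Eq. (7) applies the exponentiation to the normalized usages z_t^r / C_t^r):

```python
from itertools import product

def g(u, k):
    return sum(k ** ui for ui in u)          # the scalarization of Eq. (6)

def sorted_desc(u):
    return tuple(sorted(u, reverse=True))    # what the lexmin-max objective compares

k, max_val = 3, 4
vectors = list(product(range(max_val + 1), repeat=k))
for u in vectors:
    for v in vectors:
        # g orders vectors exactly like lexicographic comparison of their
        # entries sorted in decreasing order.
        assert (g(u, k) <= g(v, k)) == (sorted_desc(u) <= sorted_desc(v))
print("checked", len(vectors) ** 2, "pairs")
```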

We further prove that the coefficients in the constraint matrix form a totally unimodular matrix in Lemma 2, which is an important indicator of whether an LP has integer solutions. This is because if the constraint matrix is totally unimodular, then the feasible region is an integral polyhedron and only has integral extreme points. LP algorithms such as Simplex search for the optimal solution by moving from one extreme point to another. Consequently, the solutions are guaranteed to be integral, and our problem formulation also meets the second condition.

Lemma 2. The coefficients in the constraints (2), (3), (4) and (5) form a totally unimodular matrix.

After the transformation, the problem is equivalent to one that can be solved by efficient LP solvers. We can now obtain the equivalent LP using the λ-representation as shown below:

$$f(y) = \sum_{j \in D} f(j)\,\lambda_j, \qquad \sum_{j \in D} j\,\lambda_j = y, \qquad (8)$$

$$\sum_{j \in D} \lambda_j = 1, \qquad \lambda_j \in \mathbb{R}_+, \ \forall j \in D. \qquad (9)$$
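To make the λ-representation concrete, consider a single variable y with domain D = {0, 1, 2} and the convex term f(j) = 2^j (a worked illustration of ours, not an example from the paper). The term f(y) is replaced by

$$2^0\lambda_0 + 2^1\lambda_1 + 2^2\lambda_2, \qquad 0\cdot\lambda_0 + 1\cdot\lambda_1 + 2\cdot\lambda_2 = y, \qquad \lambda_0 + \lambda_1 + \lambda_2 = 1, \ \lambda_j \ge 0.$$

Minimizing this linearized term at an integer point such as y = 1 gives λ_1 = 1 and the exact value f(1) = 2, while a fractional y = 1.5 would give the piecewise-linear interpolation 0.5·2 + 0.5·4 = 3; by Lemma 2, however, the LP optimum sits at integral extreme points, so such fractional values do not arise in the solution.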

word processing related applications, which are InvertedIndex, Sequence-Count and WordCount. SelfJoin takes the generated datasets as its input. The input sizes of jobs are at least 10 GB, and we process more than 1 TB of data in each round of experiments. Each round takes more than 3 hours to complete in our cluster.

Baselines: We compare our approach, named FlowTime, with Morpheus [5], CORA [10], FAIR, FIFO and earliest deadline first (EDF). All these baselines only care about job-level performance. To achieve a fair comparison, we consider two types of jobs in CORA, which are deadline-critical jobs and deadline-sensitive jobs. The default utility functions are used for these two types of jobs.

Metrics: Our metrics are the number of jobs/workflows that can meet their deadlines and the average job turnaround time of ad-hoc jobs. Besides these metrics, we also evaluate the scalability of our approach for both the deadline decomposition algorithm and the LP-based scheduler.

B. Experimental Results

In the experiments, we want to answer the following questions. 1) What is the performance of the scheduler regarding meeting the deadlines of jobs and workflows? 2) What is the performance of ad-hoc jobs when they coexist with deadline-aware workflows and jobs? 3) Is the solution scalable? What is the running time of the algorithms in the system? Below, we answer these questions one by one.

1) Deadline-Aware Jobs/Workflows and Ad-hoc Jobs: In Fig. 4, we show the differences between the completion times and the deadlines, the number of jobs that missed their deadlines, and the average job turnaround time of ad-hoc jobs. More specifically, in Fig. 4(a), we can see that FlowTime performs the best and all the jobs finish before their deadlines. However, for the other four algorithms, quite a few jobs missed their deadlines. In particular, Fair and FIFO perform the worst because they do not take deadlines into account. EDF is the best among the baselines, as it sacrifices the performance of ad-hoc jobs and always schedules the deadline-aware jobs first. However, the performance of EDF is still worse than our algorithm, because EDF schedules the jobs one by one, which cannot fully utilize the cluster. CORA considers the performance of both kinds of jobs. However, its ultimate objective is to minimize the maximum utilities of jobs instead of maximizing the number of jobs that meet their deadlines or minimizing the average job turnaround time of ad-hoc jobs. Therefore, CORA only obtains moderate performance among the baselines.

We can further validate the performance of the algorithms in Fig. 4(b). In this figure, we can see that all 90 deadline-aware jobs meet their deadlines under our approach. The numbers of jobs that miss their deadlines under the baselines are 10, 5, 8, and 13, respectively. The reasons for this performance are as explained above.

Besides the performance of deadline-aware jobs, we also present the results of ad-hoc jobs in Fig. 4(c), where our

algorithm greatly outperforms the other four algorithms. More specifically, Fair performs the best among the baselines, and we reduce the average job turnaround time by 36% relative to it. Compared with the other algorithms, our performance is substantially better: for instance, the average job turnaround time is 1/2 of CORA's, 1/3 of FIFO's and 1/10 of EDF's. Again, we can see that EDF trades the performance of ad-hoc jobs for better performance of deadline-aware jobs. Therefore, it obtains good performance for deadline-aware jobs while receiving poor performance for ad-hoc jobs. Our solution, however, jointly optimizes these two kinds of jobs and obtains good performance for both. The benefits originate from the fact that we always schedule the deadline-aware jobs while minimally impacting the performance of ad-hoc jobs.

Besides the number of jobs that meet their deadlines, the number of workflows that can meet their deadlines is also an important metric for our scheduling system. We run a total of 90 deadline-aware jobs, which form only 5 workflows, with each workflow consisting of 18 jobs. The numbers of workflows that can meet their deadlines are similar: for instance, in the case shown in Fig. 4, the numbers of workflows that meet their deadlines are both 5. Even though the performance is similar by these numbers, our algorithm is more predictable, as it meets all the deadlines of the jobs, which are milestones inside the workflows, and is also friendlier to ad-hoc jobs that may arrive at any time with any size.

2) The Effectiveness of Deadline Slack: Deadline slack is a very important feature in our system. As we discussed, if we directly use the deadlines of jobs in the formulation as they are, there might be cases where jobs are allocated resources at the very last minute, which can cause deadline misses. Therefore, here we compare the performance with and without deadline slack in Fig. 5, where FlowTime_no_ds denotes the FlowTime algorithm without the deadline slack feature.

In Fig. 5(a), we can see that FlowTime meets the deadlines for all the deadline-aware jobs, while FlowTime_no_ds misses some deadlines. We can further verify the results in Fig. 5(b), where 5 jobs miss their deadlines under FlowTime_no_ds. With deadline slack, we reduce the chance of missing deadlines, as we try to meet the resource demands of jobs slightly before their deadlines. In all figures, unless stated otherwise, the deadline slack is set to 60 seconds^3. Using deadline slack, we may allocate resources to the jobs earlier than the real deadline, which may affect the performance of ad-hoc jobs. Consequently, we also show the results for the average job turnaround time of ad-hoc jobs in Fig. 5(c). In this figure, we can see that the average job turnaround time is not affected, as we only use a small amount of time for the deadline slack.
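A sketch of how such a slack could be applied before solving the formulation in Eqs. (1)-(5): the upper summation limit in Eq. (2) simply uses the tightened deadline instead of d_i (the 60-second value mirrors the setting above; the helper below is ours, not the paper's code):

```python
SLACK = 60  # seconds, as in the experiments above

def tightened_deadline(a_i, d_i, slack=SLACK):
    # Meet the job's demand slightly before its real deadline,
    # but never before its arrival time.
    return max(a_i, d_i - slack)
```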
3) The Scalability of the Deadline Decomposition Algorithm: We also record the running time of the deadline decomposition

Footnote 3: The deadline slack is set empirically. Optimal settings of the deadline slack for different workloads are left for future work.

Fig. 4. The performance of the algorithms regarding meeting the deadlines and average job turnaround time. (a) ∆ (completion time - deadline); (b) the number of jobs that miss their deadlines; (c) the average job turnaround time of ad-hoc jobs.

Fig. 5. The effects of deadline slack. (a) ∆ (completion time - deadline); (b) the number of jobs that miss their deadlines; (c) the average job turnaround time of ad-hoc jobs.

algorithm with different numbers of nodes and edges, where the runtime is taken as the average over 1000 runs of the deadline decomposition after 100 warmup runs. The number of nodes ranges from 10 to 200, and we record 5 data points with similar numbers of edges for each number of nodes. The measurements are conducted on a laptop with an Intel Core i7-3630QM 2.4 GHz 4-core processor and 8 GB of main memory.

The results are shown in Fig. 6. In this figure, we can see that we can efficiently decompose the deadlines of workflows into the deadlines of jobs. The runtime of the algorithm grows slowly with the number of edges and nodes. Even in the case with 200 nodes and 6000 edges, we can still return the results within 3 seconds. Note that each node is a job; therefore a workflow with 200 jobs is actually a very large workflow. In reality, we also do not have thousands of edges, which denote dependencies among jobs, in most cases.

4) The Solver Latency: There are mainly two parts in the system: deadline decomposition and the LP-based scheduler. Hence, besides the efficiency of the deadline decomposition algorithm, the LP-based scheduler should also be efficient, as it is triggered whenever a task/job completes. Otherwise, it would incur high scheduling latency and harm the performance of jobs.

Fig. 6. The runtime of our deadline decomposition algorithm (runtime in ms versus the number of nodes and the number of edges).

We show the running time of the LP-based algorithm in Fig. 7 as a function of the number of deadline-aware jobs. The capacity of the cluster is 500 CPU cores and 1 TB of main memory. The number of time slots is set to 100, which corresponds to a time span of 1000 seconds, as the duration of one slot is 10 seconds. We use the CPLEX solver on a MacBook

scheduling performance in heterogeneous environments. All these works only focused on job-level performance, such as how many jobs meet their deadlines, which leaves potential performance improvements at the workflow level.

Besides the scheduling systems for big data analytics workloads, scientific workflow scheduling has been investigated for a long time in grid computing [7] and public clouds [18]. The work in [7] proposed to minimize the execution cost of workflows with deadlines; a deadline decomposition algorithm for simple DAGs was also proposed in that paper. The algorithm in [18], instead, was designed for public clouds, where on-demand resource provisioning and the pay-as-you-go pricing model are considered for calculating the costs. However, in both of the above-mentioned papers, the workflows only contain tasks, which have a known running time on a specific type of computing resource. In our case, each node in the workflow is a job and the computation time of the node is undetermined. Moreover, we aim to improve the scheduling performance with respect to meeting the deadlines of workflows and reducing the average job turnaround time of ad-hoc jobs, instead of minimizing the overall monetary costs.

Job scheduling in geo-distributed big data processing systems was studied in [12], [19], [20], [21], [22], [23], [24], [25], [26]. In [26], the authors proposed a data placement and a reduce task scheduling algorithm to reduce the job completion time of Spark applications across geo-distributed data centers. The ultimate goal in [22], in contrast, is to reduce data transfers across geo-distributed data centers; the authors proposed a variant of the min-k-cut algorithm to cut the graph and decide task placements across several data centers. Instead of optimizing job completion time or data transfers, the bandwidth costs incurred by data transferred across geo-distributed data centers were optimized in [25] for SQL-query-related applications. The above-mentioned approaches only focused on the performance of a single job. The case of multiple jobs was discussed in [21], where an efficient heuristic approach was proposed to reduce the average job completion time.

IX. CONCLUSION

In this paper, we have proposed FlowTime to meet the deadlines of workflows and to minimize the average job turnaround time of ad-hoc jobs at the same time. We first design a deadline decomposition algorithm that can efficiently decompose the deadlines of workflows into the deadlines of jobs. We then jointly optimize the performance of deadline-aware jobs and latency-sensitive ad-hoc jobs by scheduling the deadline-aware jobs while minimally impacting the performance of ad-hoc jobs. To achieve this, we have proposed to transform the original integer linear programming (ILP) problem into an equivalent linear programming (LP) problem that can be efficiently solved by standard LP solvers. The experimental and trace-driven simulation results strongly confirm the effectiveness and efficiency of our system.

REFERENCES

[1] “Hadoop,” https://hadoop.apache.org/, accessed: 2017-10-16.

[2] M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica, “Resilient Distributed Datasets: A Fault-tolerant Abstraction for In-memory Cluster Computing,” in Proc. USENIX NSDI, 2012.
[3] S. Li, S. Hu, S. Wang, L. Su, T. Abdelzaher, I. Gupta, and R. Pace, “Woha: Deadline-Aware Map-Reduce Workflow Scheduling Framework over Hadoop Clusters,” in Proc. IEEE ICDCS, 2014.
[4] C. Curino, D. E. Difallah, C. Douglas, S. Krishnan, R. Ramakrishnan, and S. Rao, “Reservation-based Scheduling: If You’re Late Don’t Blame Us!” in Proc. ACM SoCC, 2014.
[5] S. A. Jyothi, C. Curino, I. Menache, S. M. Narayanamurthy, A. Tumanov, J. Yaniv, R. Mavlyutov, Í. Goiri, S. Krishnan, J. Kulkarni et al., “Morpheus: Towards Automated SLOs for Enterprise Clusters,” in Proc. USENIX OSDI, 2016.
[6] A. Tumanov, T. Zhu, J. W. Park, M. A. Kozuch, M. Harchol-Balter, and G. R. Ganger, “TetriSched: Global Rescheduling with Adaptive Plan-ahead in Dynamic Heterogeneous Clusters,” in Proc. ACM EuroSys, 2016.

[7] J. Yu, R. Buyya, and C. K. Tham, “Cost-Based Scheduling of Scientific Workflow Applications on Utility Grids,” in Proc. IEEE e-Science and Grid Computing, 2005.
[8] A. B. Kahn, “Topological Sorting of Large Networks,” Communications of the ACM, vol. 5, no. 11, pp. 558–562, 1962.
[9] V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth et al., “Apache Hadoop Yarn: Yet Another Resource Negotiator,” in Proc. ACM SoCC, 2013.
[10] Z. Huang, B. Balasubramanian, M. Wang, T. Lan, M. Chiang, and D. H. Tsang, “Need for Speed: Cora Scheduler for Optimizing Completion-Times in the Cloud,” in Proc. IEEE INFOCOM, 2015.
[11] R. Meyer, “A Class of Nonlinear Integer Programs Solvable by a Single Linear Program,” SIAM Journal on Control and Optimization, vol. 15, no. 6, pp. 935–946, 1977.
[12] Z. Hu, B. Li, and J. Luo, “Flutter: Scheduling Tasks Closer to Data across Geo-distributed Datacenters,” in Proc. IEEE INFOCOM, 2016.
[13] ——, “Time- and Cost-Efficient Task Scheduling Across Geo-Distributed Data Centers,” IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 29, no. 3, pp. 705–718, 2018.
[14] “IBM ILOG CPLEX Optimizer,” https://goo.gl/jyvDuV, accessed: 2017-10-16.
[15] “Capacity Scheduler,” https://goo.gl/c9GS2p, accessed: 2017-10-16.
[16] S. Bharathi, A. Chervenak, E. Deelman, G. Mehta, M.-H. Su, and K. Vahi, “Characterization of Scientific Workflows,” in IEEE Workshop on Support of Large-Scale Science, 2008.
[17] F. Ahmad, S. Lee, M. Thottethodi, and T. Vijaykumar, “Puma: Purdue MapReduce Benchmarks Suite,” 2012. [Online]. Available: https://goo.gl/ccv2tK
[18] S. Abrishami, M. Naghibzadeh, and D. H. Epema, “Deadline-Constrained Workflow Scheduling Algorithms for Infrastructure as a Service Clouds,” Elsevier Future Generation Computer Systems, vol. 29, no. 1, pp. 158–169, 2013.
[19] I. Cano, M. Weimer, D. Mahajan, C. Curino, and G. M. Fumarola, “Towards Geo-Distributed Machine Learning,” arXiv preprint arXiv:1603.09035, 2016.
[20] K. Hsieh, A. Harlap, N. Vijaykumar, D. Konomis, G. R. Ganger, P. B. Gibbons, and O. Mutlu, “Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds,” in Proc. USENIX NSDI, 2017.
[21] C.-C. Hung, L. Golubchik, and M. Yu, “Scheduling Jobs Across Geo-distributed Datacenters,” in Proc. ACM SoCC, 2015.
[22] K. Kloudas, M. Mamede, N. Preguiça, and R. Rodrigues, “Pixida: Optimizing Data Parallel Jobs in Bandwidth-Skewed Environments,” in Proc. VLDB, 2015.
[23] R. Viswanathan, G. Ananthanarayanan, and A. Akella, “CLARINET: WAN-Aware Optimization for Analytics Queries,” in Proc. USENIX OSDI, 2016.
[24] A. Vulimiri, C. Curino, B. Godfrey, K. Karanasos, and G. Varghese, “WANalytics: Analytics for a Geo-distributed Data-intensive World,” in Proc. CIDR, 2015.
[25] A. Vulimiri, C. Curino, B. Godfrey, J. Padhye, and G. Varghese, “Global Analytics in the Face of Bandwidth and Regulatory Constraints,” in Proc. USENIX NSDI, 2015.
[26] Q. Pu, G. Ananthanarayanan, P. Bodik, S. Kandula, A. Akella, V. Bahl, and I. Stoica, “Low Latency Geo-Distributed Data Analytics,” in Proc. ACM SIGCOMM, 2015.