Home | Industry Information | Business News | Browse by Publication | I | IIE Transactions

Dynamic allocation of communicating tasks in computational grids.

Publication: IIE Transactions
Publication Date: 01-NOV-04
Format: Online
Delivery: Immediate Online Access

Article Excerpt
1. Introduction

Computational resources are commonly used in solving a number of engineering and scientific problems. However, the capacity/speed of affordable computational resources remains inadequate for a variety of large-scale problems, such as aircraft design optimization, weather a...

View more below

Read this article now - Try Goliath Business News - FREE!   
You can view this article PLUS...

  • Over 5 million business articles
  • Hundreds of the most trusted magazines, newswires, and journals (see list)
  • Premium business information that is timely and relevant
  • Unlimited Access

Now for a Limited Time, try Goliath Business News - Free for 7 Days!
Tell Me More   Terms and Conditions

Purchase this article for $4.95

Already a subscriber? Log in to view full article

...pattern simulation and analysis of sky surveys. There are two ways of providing the necessary computational power for large-scale problems. One way is to build faster and bigger computers. The power of these resources are limited by the physics governing the semiconductors. The other way is to couple geographically distributed resources into larger virtual resource that can provide the same computational power at a significantly lower cost. Computational clusters have been developed with the same processing power as a super-computer at 1/15th of the cost (Dietz and Mattox, 2000). Recently, significant research has been conducted in integrating remote computing resources into a computational grid (Foster and Kesselman, 1998; Buyya and Baker, 2000). Figure 1 shows the generic architecture of a computational grid in which the resources/nodes can be accessed by the users through a common interface. For the user, the entire grid acts as a single resource with unified capabilities of all the nodes in the grid. The grid nodes find and communicate with each other through an information server. In this paper, we use the terms agents, resources and nodes interchangeably for the computers in the grid.

A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities (Foster and Kesselman, 1998). In this framework, a wide variety of resources such as super-computers, clusters, PCs and workstations are coupled together to act as a single integrated resource for the users. One such grid is the SETI@Home project that uses thousands of PCs to help in the search for extraterrestrial intelligence, which has the computational power equivalent to 10 times that of the largest super-computer in existence (Korpela et al., 2001). The users can utilize the resources in the grid by interacting with a resource broker. Based on the user requirements, the resource broker finds the resources that user can access through a grid information server. The broker also assigns the tasks to the resources by negotiating with the resources through middleware services. Finally, the broker is responsible for collecting the results, monitoring the progress of the application execution, managing changes in the grid infrastructure and resource failures.

[FIGURE 1 OMITTED]

Recently, agencies such as NASA, NSF, DOE and the European Union have launched a number of research initiatives to integrate resources from multiple organizations into a computational grid (Johnston et al., 1999; Song et al., 2000; Nowinski et al., 2001). Table 1 shows a number of applications of the grid computing. Foster and Kesselman (1998) provide an extended list of applications of the computational grids.

In order to utilize the distributed resources in a computational grid, the user has to decompose the overall computational task into a number of smaller interdependent tasks. The agents in the computational grid, which often possess heterogeneous capabilities, collectively complete the set of interdependent tasks. An essential characteristic of the grids is that interdependent tasks enter the system in batches and the individual tasks may be processed by different agents in the grid. The agents communicate with each other to coordinate the processing of the interdependent tasks.

We now describe the task allocation problem in a grid. Consider an infinite time horizon over which a group of tasks enter the system at each time period, as shown in Fig. 2. The characteristics of tasks entering at a given time follows a general distribution. Each task is considered to be indivisible. The tasks can be classified into several categories based on processing requirements. Additionally, we assume tasks of the same category to be equivalent in processing complexity. In a typical computational grid, several alternative resources might exist for processing the same task, each with a different processing cost. Depending on the network connectivity of the grid, the communication cost between any two agents might differ. Hence, processing any group of interdependent tasks incurs a certain operational cost, due to the processing and communication costs, which depends on the task allocation procedures used by the resource broker. We consider that each group of tasks enters the system with a certain payment association with it. The computational grid collectively receives the payment for processing the tasks.

The tasks in the buffer of an agent are processed according to the first-come first-serve (FCFS) principle. The throughput of an agent depends on its processing speed, which follows a general distribution. The processing speeds of agents often differ from between each other. Each agent has a finite buffer capacity and a task can be assigned to an agent only when there is enough buffer space.

[FIGURE 2 OMITTED]

Often, the processing times of different tasks overlap resulting in an overhead cost for the agents. To represent this, we consider the processing cost to be a convex function of the number of tasks an agent is processing. Thus, the allocation of tasks of the current task group affects the allocation costs of a future task in two ways: (i) higher processing cost; and (ii) as a result of the finite buffer capacity of each node or agent.

The performance of the grid is measured by the discounted sum of the expected rewards accrued over an infinite time horizon. Our research focuses on finding an optimal task allocation policy that maximizes the performance of the computational grid. Here, we assume that there is a common social utility for the grid. This model differs from the past literature by considering communication cost between heterogeneous resources with finite buffers in a dynamic and uncertain environment, where the task arrival and task processing are based on general probability distributions. We model the task allocation problem as an infinite horizon Markov Decision Process (MDP).

In this paper, we present an action elimination procedure for the MDP. Moreover, we investigate certain structural properties of the optimal policy. Due to the buffer limitations and general distributions for task arrival and processing speeds, monotonic properties across all the states cannot be identified. However, certain states are found to be exhibiting monotonic conditions. These properties are used to develop a heuristic policy for task allocation that can be computed in real-time. We also present computational results, comparing the performance of the proposed heuristic policy with the optimal policy.

The task allocation problem addressed in this paper can be found in a number of application domains where resources are required to coordinate with each other in order to solve the overall problem. One such application is distributed design optimization where a number of analyses are conducted in a coordinated manner in order to obtain optimal design parameters (Deshmukh et al., 2001). The performance of this design framework is affected by the task allocation policies employed when multiple products are required to be designed. Similarly, this task allocation problem is commonly found in the supplier coalitions that require coordination between the suppliers to satisfy the customer demand. Another application of this research is in project management where organizations are faced with managing multiple projects over time with each project consisting of several interdependent tasks. Even though this paper addresses the task allocation problem in computational grids, this research can be applied to several manufacturing and service problems.

The remainder of this paper is organized as follows. Background research is reviewed in Section 2 and the task allocation problem is formulated in Section 3. In Section 4, the task allocation problem is formulated as a MDP and it is shown to have a stationary optimal policy. Section 5 discusses various structural properties of the optimal task allocation policy and presents a real-time heuristic policy. Computational results are presented in Section 6. Finally, the research findings in this paper are summarized in Section 7.

2. Background

A number of resource allocation mechanisms have been developed to address the task assignment in computational grids. Condor is a distributed resource management system that allows users to describe arbitrary resource requests and resource owners to describe their resources (Litzkow et al., 1988). Utopia is a load sharing system for large clusters of nodes which periodically collects statistics about the participating workstations to determine the resource allocation for a new task (Zhou et al., 1993). Portable Batch System is another scheduling system in which performance gains are achieved by moving away from FCFS scheduling (Henderson, 1995). NOW is a software system designed to run parallel programs easily and efficiently on a network of UNIX workstations (Anderson et al., 1995). Similarly, Nimrod-G allows the users to specify budget and time constraints that are used in the task allocation algorithms for a grid (Abramson et al., 2002). Even though these schedulers are known to function well in practice, theoretical baselines do not exist for comparing the performance of these schedulers. This paper attempts to formalize the task allocation framework allowing us to compare the performance of different schedulers to a baseline. Moreover, this paper explicitly considers the communication cost between different resources in the task allocation policy.

The task assignment in grids can be primarily categorized into centralized and distributed methods. In this paper, we focus only on grids with a single resource broker that assigns tasks to the agents according to a task allocation policy. Centralized task allocation may be difficult to apply for large-scale systems. However, it can be used as a baseline for evaluating the performance of the distributed task allocation methods using multiple resource brokers.

Task allocation in computational grids is closely related to the research conducted in distributed computing (Norman and Thanisch, 1993), multi-processor scheduling, and distributed problem solving (Durfee et al., 1987; Durfee, 1999). The task allocation in distributed and parallel computing focuses on assigning static and communicating tasks to a processor network such that the assignment maximizes some performance measure (Stone, 1977; Bokhari, 1981a; Sinclair, 1987). This problem is known to be NP-complete (Bokhari, 1981b). Due to the complexity of exact solution approaches to the static task allocation problem, a number of heuristics have been proposed (Lo, 1988; Bollinger and Midkiff, 1991). Our research considers the dynamic nature of the tasks, in addition to the explicit consideration of the communication costs between the resources.

Most of the research conducted in distributed problem solving focuses on distributed task allocation. The literature in distributed task allocation methods falls into three categories: (i) general market equilibrium (Wellman, 1993; Ygge and Akkermans, 1995; Cheng and Wellman, 1998); (ii) auctions (McAfee and McMillan, 1987; Walsh et al., 1998; Klemperer, 1999); and (iii) contract nets (Sandholm, 1993; Schwartz and Kraus, 1997; Krothapalli and Deshmukh, 1999). We are not aware of any research in distributed task allocation that explicitly considers communication costs between the resources and dynamic arrival of tasks.

Multi-processor scheduling is a well researched topic in the scheduling literature (McNaughton, 1959; Graham, 1966; Hochbaum and Shmoys, 1988). The static multiprocessor scheduling problem is well known to be NP-hard (Garey and Johnson, 1979). Queueing systems are commonly used for the dynamic assignment of tasks to multiple servers. Winston (1977) proves that the shortest-line rule is optimal for parallel servers with the exponential service times and a Poisson arrival process. Most of the literature in scheduling and queuing systems ignores the communication between resources.

We model...

NOTE: All illustrations and photos have been removed from this article.



More articles from IIE Transactions
A flow metamodel for delivery agents over a spatial area., November 01, 2004
A scale model for fitting object shapes from fixed location data., November 01, 2004
Improving scheduling robustness via preprocessing and dynamic adaptati..., November 01, 2004
A Benders-based heuristic for the robust capacitated international sou..., November 01, 2004
Accounting for input-model and input-parameter uncertainties in simula..., November 01, 2004

Looking for additional articles?
Search our database of over 3 million articles.

Looking for more in-depth information on this industry?
Search our complete database of Industry & Market reports by text, subject, publication name or publication date.

About Goliath
Whether you're looking for sales prospects, competitive information, company analysis or best practices in managing your organization, Goliath can help you meet your business needs.

Our extensive business information databases empower business professionals with both the breadth and depth of credible, authoritative information they need to support their business goals. Whether it be strategic planning, sales prospecting, company research or defining management best practices - Goliath is your leading source for accurate information.