1. Introduction
Make-to-Order (MTO) systems are successful business strategies for managing responsive supply chains that are characterized by high product variety, highly variable customer demand and short product life cycles. Because of mass customization and competition on product variety, many firms adopt an MTO strategy to offer a variety of products and deal with product proliferation. Dell's manufacturing and distribution of Personal Computers (PCs) is an excellent example of an MTO supply chain (Margretta, 1998; Dell, 2000). Dell typically offers several product lines, each allowing at least dozens of "features" from which customers can select when placing an order: different combinations of CPU, hard drive, memory and other peripherals. In Dell's supply chain, multiple components are procured and kept in inventory at various assembly facilities, from which they are assembled into a wide variety of finished products in response to customer orders. Whereas each of these components takes a substantial lead time to manufacture, the time to assemble all these components into a PC is short, provided there is sufficient assembly capacity and the components are available. In traditional Make-To-Stock (MTS) supply chains, customer orders are met from inventories of finished products that are kept at various points of the network. This is done to reduce the delay in fulfilling customer orders, increase sales and avoid stockouts. However, the problems associated with holding inventory of finished products may outweigh the benefits, especially when those products become obsolete as technology advances or fashion changes. While an MTO strategy eliminates finished goods inventories and reduces a firm's exposure to the risk of obsolescence, it usually entails a long customer response time (Gupta and Benjaafar, 2004).
In order to reconcile the dual needs of a quick response time and high product variety, many firms such as General Electric, American Standard, Compaq, IBM, BMW and National Bicycle use a hybrid strategy (i.e., a mix of MTO and MTS) called the Assemble-To-Order (ATO) strategy, in which a subassembly, or a number of common subassemblies used in several products, is assembled and placed in inventory until an order is received for the finished product (Song and Zipkin, 2003). This allows the firm to customize orders as in the MTO strategy, while taking advantage of the economies of scale offered by the MTS strategy. Also, the investment in semi-finished product inventory is smaller than that required to maintain a similar amount of finished goods inventory. Furthermore, demand pooling benefits can be realized. Although maintaining a semi-finished product inventory in ATO systems lowers the customer response time compared to a pure MTO system, the response time can be further reduced by minimizing congestion at the point of differentiation. Naturally, the response time to deliver the product is critical and forms the basis for competition. Consumers' willingness to pay a premium for a shorter response time provides further incentives for firms to reduce response time in MTO and ATO supply chains.
Although various integrated models of supply chain design have been proposed in recent years to support lead time reduction, these models have continued to be largely guided by more traditional concerns of efficiency and cost in MTS settings, where the primary focus is on minimizing the fixed cost of facility location and the variable transportation cost under fairly stable and deterministic customer demand settings. This approach is exemplified by the work of Dogan and Goetschalckx (1999), Vidal and Goetschalckx (2000), Teo and Shu (2004), Shen (2005), Eskigun et al. (2005) and Elhedhli and Gzara (2008). For example, Vidal and Goetschalckx (2000) present a model that captures the effect of changes in transportation lead time and demand on the optimal configuration of the global supply chain network, assuming that the demand is deterministic. Eskigun et al. (2005) incorporate delivery lead time and the choice of transportation mode in the design of a supply chain under a deterministic demand setting. These models tend to ignore congestion at the facilities and its effect on response time. Their solutions prescribe locating facilities whose capacity utilization is very high, resulting in an excessively long response time when subjected to variability in service times and randomness in customer orders. Reviews by Vidal and Goetschalckx (1997), Erenguc et al. (1999) and Sarmiento and Nagi (1999) also point out that most of the existing supply chain design models do not consider measures of customer service such as response time in making location/allocation decisions. See also the recent review by Klose and Drexl (2005). This is not surprising given the complexity of the model and the interplay of locational and queueing aspects of the problem. To the best of our knowledge, Huang et al. (2005) are among the first to model the effect of congestion in the design of distribution networks.
They model capacity using the mean and variance of the Distribution Centers (DCs) as continuous variables, whereas our model considers capacity as a set of discrete options with known means and variances. They propose solution procedures based on outer approximation and Lagrangean relaxation, and test them on small instances of the problem.
Another growing body of literature that is related to our work, and that accounts for congestion and its effect on response time in strategic planning, comprises models for facility location with immobile servers, stochastic demand and congestion (such as location of emergency medical facilities, fire stations, telecommunication network design, automated teller machines or internet mirror site location). For an extensive review, refer to Berman and Krass (2002). Due to the complexity of the underlying problem, most papers in this area make very strong assumptions: (i) either the number or capacity of the facilities (or both) are assumed to be fixed; (ii) the facilities are assumed to be identical; (iii) the demand arrival process is assumed to be Poisson; and (iv) the service process is usually assumed to be exponential (see Amiri (1997), Marianov and Serra (2002), Wang et al. (2003) and Elhedhli (2006) and references therein). Even so, most of the techniques proposed to date to solve these problems, with the exception of Elhedhli (2006), are either approximate or heuristic based. Our work is also similar in spirit to models for capacity planning with congestion effects, for which only heuristic solution procedures have been reported; see Rajagopalan and Yu (2001) and references therein.
The objective of this paper is to model the effect of congestion on the response time and analyze the tradeoff among response time costs, facility location and capacity acquisition costs, and outbound transportation costs in the design of supply chain networks. More specifically, we present a model to determine the configuration of an MTO supply chain, where the emphasis is on minimizing the customer response time through the acquisition of sufficient assembly capacity and the optimal allocation of workload to the assembly facilities (DCs) under stochastic customer demand settings. The DCs are modeled as spatially distributed queues with Poisson arrivals and general service times to capture the dynamics of the response time. The model is formulated as a non-linear Mixed-Integer Programming (MIP) problem and is linearized using piecewise linear functions. We present a cutting plane algorithm that provides the optimal solution to the problem. Furthermore, we present a Lagrangean relaxation heuristic procedure for solving large-scale instances of such integrated models. Then, we present a model for the two-echelon ATO supply chain design problem, where a set of plants and DCs are to be established to distribute various finished products to a set of customers with stochastic demand. DCs act as assembly facilities, where semi-finished products, procured from plants, are held in inventories, from which they are assembled into a wide variety of finished products in response to customer demands. We propose a Lagrangean relaxation heuristic that exploits the echelon structure of the problem and uses the solution methodology proposed above for the MTO problem. Explicit consideration of congestion effects and their impact on response time in making location, capacity and allocation decisions in supply chains distinguishes this work from most other supply chain design models.
The rest of the paper is organized as follows. Section 2 provides a non-linear MIP formulation of the MTO supply chain design problem, a piecewise linearization and an exact solution approach based on the cutting plane method. The simplifications resulting from assuming exponentially distributed service times (M/M/1 case) and deterministic service times (M/D/1 case) are also explicitly described. In Section 3, we present the formulation of the two-echelon ATO supply chain design problem and a Lagrangean heuristic. Computational results and managerial insights are reported in Section 4. Finally, Section 5 concludes with some directions for future research.
2. MTO supply chain design
Consider the problem of designing an MTO supply chain, where a set of DCs are to be established and equipped with sufficient capacity to serve a set of customers. Sufficient capacity here implies being able to obtain service without waiting for an excessively long time after the order is placed. The DCs maintain inventory of multiple components and facilitate the assembly and shipment of a wide variety of finished products in a timely fashion without carrying expensive finished-goods inventory and incurring a long response time. Response time refers to the interval between the placing of an order and receipt of the ordered product. In MTO supply chains, because a customer order triggers the assembly of finished product from components, the response time consists of the assembly lead time and the delivery lead time. The delivery time between individual DCs and customers is relatively constant compared to the order fulfilment time at DCs in such settings. Moreover, it can further be reduced (using alternative transportation modes or expedited delivery services) to respond quickly to customer orders on a short-term basis. However, the assembly lead time is highly dependent on the DC capacity and the allocated workload and is difficult to change (on a short-term basis) once the DC is established.
2.1. Model formulation
We consider the setting depicted in Fig. 1. We assume that the demand for each product from each customer is independent and occurs according to a Poisson process. Once the demand for a product is realized at the customers' end, the order is placed at the DCs. DCs will act as assembly facilities and the customers' orders arriving at the DCs are met on a First-Come First-Serve (FCFS) basis. We assume that each DC operates as a single flexible-capacity server with infinite buffers to accommodate customer orders waiting for service.
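The queueing assumptions above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the function name `simulate_fcfs_dc`, the rates of 8 and 10 orders per period, and the sample size are hypothetical values chosen for illustration. It uses the Lindley recursion (the next order's queueing delay equals the current delay plus the current service time minus the interarrival time, floored at zero) to estimate the mean response time at a single DC with Poisson arrivals, FCFS service and an unlimited buffer.

```python
import random


def simulate_fcfs_dc(arrival_rate, draw_service_time, n_orders=200_000, seed=0):
    """Estimate the mean response time (queueing delay + service) at one DC.

    Orders arrive as a Poisson process and are served FCFS by a single
    server with an unlimited buffer. `draw_service_time` is any callable
    that takes a random.Random instance and returns one service time.
    """
    rng = random.Random(seed)
    wait = 0.0    # queueing delay of the current order (system starts empty)
    total = 0.0   # running sum of response times
    for _ in range(n_orders):
        service = draw_service_time(rng)
        total += wait + service
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: the next order waits for the work still
        # unfinished when it arrives, if any.
        wait = max(0.0, wait + service - interarrival)
    return total / n_orders


# Hypothetical DC: 8 orders/period arrive, capacity is 10 orders/period.
exp_service = lambda rng: rng.expovariate(10.0)  # exponential service times
det_service = lambda rng: 0.1                    # deterministic service times

mean_exp = simulate_fcfs_dc(8.0, exp_service)
mean_det = simulate_fcfs_dc(8.0, det_service)
print(f"exponential service:   {mean_exp:.3f} periods")
print(f"deterministic service: {mean_det:.3f} periods")
```

Both runs use the same utilization, so any gap between the two estimates comes from service-time variability alone, which is the effect a general service-time distribution is meant to capture.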
[FIGURE 1 OMITTED]
Under these assumptions, the MTO supply chain is modeled as a network of independent M/G/1 queues in which the DCs are treated as servers with service rates proportional to their capacity levels, where the capacity levels are discrete. We also assume that there is an unlimited supply of components and that their inventory holding costs at the DCs are insignificant. Hence, the model formulated below simultaneously determines the location and capacity of DCs and the assignment of customers to DCs by minimizing the response time costs in addition to the fixed location and capacity acquisition costs and the assembly and transportation costs from DCs to customers. Besides capacity restrictions (steady-state conditions) at the DCs and the demand requirements, there are constraints which ensure that at most one capacity level is selected at each DC. To model this problem, we define the following notation.
Indices and parameters:
i = index for customers, i = 1, 2, ..., I;
j = index for potential DCs, j = 1, 2, ..., J;
k = index for potential capacity level at DCs, k = 1, 2, ..., K;
\(f_{jk}\) = fixed cost of opening DC j and acquiring capacity level k ($/period);
\(c_{ij}\) = unit cost of serving customer i from DC j ($/unit);
t = mean response time cost per unit time per customer ($/period/customer);
\(\lambda_i\) = mean demand rate for the product from customer i (units/period);
\(\mu_{jk}\) = mean service rate at DC j, if it is allocated capacity level k (units/period);
\(\sigma_{jk}^2\) = variance of service times at DC j, if it is allocated capacity level k.
Decision variables:
\(x_{ij}\) = fraction of customer i's demand served by DC j (\(0 \le x_{ij} \le 1\));
\(y_{jk}\) = 1 if DC j is opened with capacity level k, 0 otherwise.
Let the demand for the product at customer location i be an independent random variable that follows a Poisson process with mean \(\lambda_i\). If \(x_{ij}\) is the fraction of customer i's demand served by DC j, then the aggregate demand arrival rate at DC j is also a random variable that follows a Poisson process with mean \(\lambda_j = \sum_{i=1}^{I} \lambda_i x_{ij}\), due to the superposition of Poisson processes. If the service times at each DC follow a general distribution and each DC is modeled as an M/G/1 queue, then the mean service rate of DC j, if it is allocated capacity level k, is given by \(\mu_j = \sum_{k=1}^{K} \mu_{jk} y_{jk}\) and the variance of service times is \(\sigma_j^2 = \sum_{k=1}^{K} \sigma_{jk}^2 y_{jk}\). This service rate reflects the server capacity or essentially the number of MTO products a DC can assemble and ship in a given time period. Let \(\tau_j\) represent the mean service time at DC j (\(\tau_j = 1/\mu_j\)), \(\rho_j\) be the utilization of DC j (\(\rho_j = \lambda_j/\mu_j\)) and \(CV_j^2\) be the squared coefficient of variation of service times (\(CV_j^2 = \sigma_j^2/\tau_j^2\)). Under steady-state conditions (\(\lambda_j < \mu_j\)) and FCFS queuing discipline, the expected average waiting time (including the service time) at DC j is given by the Pollaczek-Khintchine (PK) formula:
\[ E[W_j(M/G/1)] = \left(\frac{1 + CV_j^2}{2}\right)\frac{\tau_j \rho_j}{1 - \rho_j} + \tau_j = \left(\frac{1 + CV_j^2}{2}\right)\frac{\lambda_j}{\mu_j(\mu_j - \lambda_j)} + \frac{1}{\mu_j} \quad \forall j, \]
and the...
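The quantities entering the PK formula above can be computed directly from the notation of Section 2.1. The sketch below is illustrative only: the function name `dc_response_time` and all numerical inputs are hypothetical, and it assumes that one capacity level has been selected at the DC in question.

```python
def dc_response_time(demand_rates, fractions, mu_levels, var_levels, chosen):
    """Expected response time (queueing delay + service) at one DC, M/G/1.

    demand_rates[i] : lambda_i, mean demand rate of customer i
    fractions[i]    : x_ij, share of customer i's demand served by this DC
    mu_levels[k]    : mu_jk, service rate under capacity level k
    var_levels[k]   : sigma_jk^2, service-time variance under level k
    chosen[k]       : y_jk, 1 for the selected capacity level, else 0
    """
    lam = sum(l * x for l, x in zip(demand_rates, fractions))  # lambda_j
    mu = sum(m * y for m, y in zip(mu_levels, chosen))         # mu_j
    var = sum(v * y for v, y in zip(var_levels, chosen))       # sigma_j^2
    if mu <= 0 or lam >= mu:
        raise ValueError("steady state requires lambda_j < mu_j")
    tau = 1.0 / mu        # mean service time
    rho = lam / mu        # utilization
    cv2 = var / tau ** 2  # squared coefficient of variation
    return 0.5 * (1.0 + cv2) * tau * rho / (1.0 - rho) + tau


# Hypothetical data: three customers, three capacity levels, level 2 chosen.
demand = [30.0, 50.0, 40.0]   # lambda_i (units/period)
x = [1.0, 0.5, 0.0]           # x_ij, so lambda_j = 55
mu_k = [60.0, 80.0, 100.0]    # mu_jk, so mu_j = 80
y = [0, 1, 0]                 # y_jk

# Exponential service times: variance equals 1 / mu_jk^2, hence CV^2 = 1.
w_exponential = dc_response_time(demand, x, mu_k, [1.0 / m ** 2 for m in mu_k], y)
# Deterministic service times: zero variance, hence CV^2 = 0.
w_deterministic = dc_response_time(demand, x, mu_k, [0.0, 0.0, 0.0], y)
```

With exponential service times the expression collapses to \(1/(\mu_j - \lambda_j)\), the M/M/1 case, and with zero variance it reduces to \(\lambda_j/(2\mu_j(\mu_j - \lambda_j)) + 1/\mu_j\), the M/D/1 case.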