1. Introduction
The motion picture industry is a major industry worldwide, and many new products are developed and launched in it each year. More than 4,000 movies are produced worldwide each year (MPAA 2004). In the United States alone, around $9 billion was spent on theatre tickets in 2004 (Eliashberg et al. 2006). Although many movies are financed and produced by major Hollywood studios, recently "a new wave of outsiders rushing to finance movies are to some extent changing the way films are produced" (Mehta 2006). The new players include wealthy financiers, private equity firms, hedge funds, and other institutions that invest in the early stages of movie production. Their metric is return on investment (ROI).
Despite the market size and investment interest, new movie production is a risky venture. Profitability varies greatly across movies. Although producers sometimes make large profits from blockbusters, they also lose millions of dollars on movies that end up in oblivion. For example, the movie Gigli cost approximately $54 million to produce, but its box-office revenue was only around $6 million. Considering that studios generally receive a share of around 55% of the gross box-office revenue for their production (Vogel 2004; also at http://www.factbook.net/wbglobal_rev.htm), Gigli generated an ROI (1) of -96.7% for the studio. At the other end of the spectrum, although the movie In the Bedroom cost only $1.7 million to produce, it generated more than $35 million in box-office revenues and thus an ROI of +667%. Across a sample of 281 movies produced between 2001 and 2004, the studio's ROI ranges from -96.7% to over 677%, with a median of -27.2%. Because of this huge variance in ROI, the selection of movies to be produced is critical to the profitability of a movie studio. (2)
However, deciding which scripts to produce is a dauntingly difficult task, as the number of submissions always greatly exceeds the number of movies that can be made. It has been estimated that each year more than 15,000 screenplays are registered with the Writers Guild of America, whereas only around 700 movies are made in the United States (Eliashberg et al. 2006). Thus, studios need a reliable approach to guide the green-lighting process.
Currently, major studios still employ an age-old, labor-intensive methodology: They hire so-called readers to assist them in evaluating screenplays. Typically, three to four readers are assigned to read each script. After reading a script, a reader writes a synopsis of the story line and makes an initial recommendation on whether the screenplay should be produced as a movie and on what changes, if any, are needed before actual production. These recommendations are largely subjective and based on experience. This means that the success of a movie production depends on the quality of the available readers and their acumen in picking out promising scripts. This approach becomes especially problematic when readers disagree. Indeed, according to some industry insiders we talked to, on top of the disagreements among readers, studio executives and managers at different levels in the development process also frequently disagree for many reasons, making green-lighting an uncertain or sometimes even arbitrary process. Not surprisingly, the result of this process is highly unpredictable. Even the scripts for successful movies, such as Star Wars and Raiders of the Lost Ark, were initially bounced around several studios before Twentieth Century Fox and Paramount, respectively, agreed to green-light them (Vogel 2004). For that reason, studios and movie financiers can potentially benefit from a more objective tool to aid their green-lighting processes and to provide a reliable "second opinion" about the potential success or failure of adopting a script.
No such tools, as far as we know, are currently available to aid screenplay screening. The main obstacle in developing such a tool has been the lack of reliable predictors for the financial success of a movie at the green-lighting stage: There are simply too few tangible determinants for the success of a movie before it is produced. In this paper, we propose a new and rigorous approach that can potentially help studios and movie investors screen scripts and make more profitable production decisions. To ensure that our approach can help green-lighting decisions, we exclude factors such as promotion and advertising, number of screens, and competitors (even though they play a pivotal role in the success or failure of a movie) because this information becomes available only at a later stage. Our tool forecasts ROI based on the story line only. We extract textual information from story lines using domain knowledge from screenwriting and the bag-of-words model developed in natural-language processing. Once the model is calibrated, this textual information is used to predict the ROI of a movie using the bootstrap aggregated classification and regression tree (Bag-CART) methodology developed in statistics (Breiman 1996, Breiman et al. 1984).
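As a rough illustration of the two ingredients just named, and not the authors' actual implementation, the sketch below turns story-line text into bag-of-words counts and then averages the predictions of small regression trees, each fitted to a bootstrap resample of the training data. Everything in it is an assumption made for illustration: the function names, the four-word vocabulary, the toy story lines, and the ROI values are all invented, and the tree is a simplified CART-style least-squares regressor without pruning.

```python
import random
from collections import Counter


def bag_of_words(text, vocabulary):
    """Count vocabulary words in a text (whitespace tokens, lowercased)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, depth=2):
    """Fit a small regression tree by greedy least-squares splitting."""
    if depth == 0 or len(set(y)) == 1:
        return mean(y)  # leaf: predict the mean response
    best = None
    for j in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest,
        # so that both sides of the split are non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None:  # no feature varies, so no split is possible
        return mean(y)
    _, j, t, left, right = best
    return (j, t,
            fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1),
            fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1))


def predict_tree(tree, x):
    """Walk from the root to a leaf; internal nodes are tuples."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree


def fit_bagged_trees(X, y, n_trees=25, seed=0):
    """Bootstrap aggregation: one tree per resample drawn with replacement."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        trees.append(fit_tree([X[i] for i in idx], [y[i] for i in idx]))
    return trees


def predict_bagged(trees, x):
    """Average the individual tree predictions."""
    return mean([predict_tree(tree, x) for tree in trees])


# Toy data, invented purely for illustration (not from the paper).
vocabulary = ["hero", "chase", "love", "funeral"]
texts = [
    "the hero survives a chase and another chase",
    "a love story that ends at a funeral",
    "hero meets love after a long chase",
    "funeral after funeral and no hero",
    "love and love and a brave hero",
    "a slow funeral with no chase",
]
rois = [0.8, -0.5, 1.2, -0.9, 0.4, -0.7]  # made-up ROI values

X = [bag_of_words(text, vocabulary) for text in texts]
trees = fit_bagged_trees(X, rois)
new_text = "a hero in love during a chase"
prediction = predict_bagged(trees, bag_of_words(new_text, vocabulary))
print(f"Predicted ROI for the new story line: {prediction:.2f}")
```

Averaging over bootstrap resamples is what distinguishes bagging from a single tree: individual trees are unstable under small changes in the data, and the average smooths out that instability. A realistic application would need far more movies, a much richer set of textual variables, and proper out-of-sample validation.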
The rationale for our approach is simple. As industry insiders acknowledge, a good story line is the foundation for a successful movie production. Sir Ridley Scott, a famous director of motion pictures, once pointed out, "any great film is always driven by script, script, script" (Silver-Lasky 2003, p. 108). Of course, what is in the script is the story. Peter Guber, the producer of Batman, suggested the same: "At the end of the day when you get done with all the fancy production design ... what's up on the screen is the script. Plain, old-fashioned words. It all starts there and it all ends there" (Silver-Lasky 2003, p. 108). To the extent that the success of a movie depends on the story told in a script, a sophisticated textual analysis of scripts, or their proxies, that have already been made into movies will help us identify the hidden "structures" in the texts, which essentially capture what the story is, how it is told, etc. Then, by relating those structures to the subsequent financial return from the movie, we can learn what kinds of stories may resonate with audiences and what elements in a story drive ROI performance. Once those structures or determinants are identified, we can analyze a new script in the same way and predict its financial return were it to be made into a movie.
Our approach is developed with movie scripts in mind. Ideally, we would like to implement our approach with movie scripts in electronic form. However, as most movie shooting scripts are not publicly available in electronic form and we cannot collect a sufficient number of them, (3) we restrict our attention to so-called spoilers in implementing our proposed approach. A spoiler is an extensive summary of the story line of a movie written by movie viewers after they have watched the movie. Each spoiler is typically around 4-20 pages long. It is essentially a blow-by-blow description of a movie, so that its readers do not have to go to a movie theatre to know the story told in the movie, hence the name "spoilers." Examples of spoilers can be found at http://www.themoviespoiler.com. In developing our prediction tool, we view spoilers as a proxy for the actual shooting script for three reasons.
First, by limiting ourselves to texts that contain less information, we essentially stack the deck against ourselves in predicting movie successes. Even so, we find that the performance of our approach is very encouraging, showing a great deal of promise for its practical applications. Indeed, with actual scripts in digital form, which are available to studios and contain the descriptive information found in spoilers, the performance of our approach is expected to improve. (4) Second, spoilers in digital form are easily available to us on the Internet, whereas scripts are not. It would be a truly monumental task to produce digital scripts for most of the movies in our sample. To the extent that spoilers tell the stories in actual scripts in a descriptive, detailed way, using spoilers is not only expedient, but also quite reasonable. Indeed, when we test the similarity between spoilers and scripts (when both are available) along the semantic and textual variables we consider, we find a positive and significant correlation between a spoiler and a script in all the variables we extract. (5) Third, the main purpose of our paper is to illustrate and expound a new approach. As will soon become clear, our...