
Yahoo! for Amazon: sentiment extraction from small talk on the Web.

Publication: Management Science
Publication Date: 01-SEP-07

Article Excerpt
1. Introduction



Language is itself the collective art of expression, a summary of thousands upon thousands of individual intuitions. The individual gets lost in the collective creation, but his personal expression has left some trace in a certain give and flexibility that are inherent in all collective works of the human spirit.
-- Edward Sapir, cited in Society of Mind by Minsky (1985, p. 270).

We develop hybrid methods for extracting opinions in an automated manner from discussions on stock message boards, and analyze the performance of various algorithms in this task, including that of a widely used classifier available in the public domain. The algorithms are used to generate a sentiment index and we analyze the relationship of this index to stock values. As we will see, this analysis is efficient, and useful relationships are detected.

The volume of information flow on the Web has accelerated. For example, in the case of Amazon Inc., there were cumulatively 70,000 messages by the end of 1998 on Yahoo's message board, and this had grown to about 900,000 messages by the end of 2005. There are almost 8,000 stocks for which message board activity exists, across a handful of message board providers. The message flow comprises valuable insights, market sentiment, manipulative behavior, and reactions to other sources of news. Message boards have attracted the attention of investors, corporate management, and of course, regulators. (1)

In this paper, "sentiment" takes on a specific meaning, that is, the net of positive and negative opinion expressed about a stock on its message board. Hence, we specifically delineate our measure from other market conventions of sentiment such as deviations from the expected put-call ratio. Our measure is noisy because it comprises information, sentiment, noise, and estimation error.

Large institutions express their views on stocks via published analyst forecasts. The advent of stock chat and message boards enables small investors to express their views too, frequently and forcefully. We show that it is possible to capture this sentiment using statistical language techniques. Our algorithms are validated against revealed sentiment on message boards, and by the statistical relationship between sentiment and stock returns, which track each other.

Posted messages offer opinions that are bullish or bearish, and many that are confused, vitriolic, rumor, or spam (null messages). Some are very clear in their bullishness, as is the following message on Amazon's board (Msg 195006):

The fact is.... The value of the company increases because the leader (Bezos) is identified as a commodity with a vision for what the future may hold. He will now be a public figure until the day he dies. That is value.

In sharp contrast, this message was followed by one that was strongly bearish (Msg 195007):

Is it famous on infamous? A commodity dumped below cost without profit, I agree. Bezos had a chance to make a profit without sales tax and couldn't do it. The future looks grim here.

These (often ungrammatical) opinions provide a basis for extracting small investor sentiment from discussions on stock message boards.

While financial markets are just one case in point, the Web has been used as a medium for information extraction in fields such as voting behavior, consumer purchases, political views, quality of information equilibria, etc. (see Godes and Mayzlin 2004, Lam and Myers 2001, Wakefield 2001, Admati and Pfleiderer 2000 for examples). In contrast to older approaches such as investor questionnaires, sentiment extraction from Web postings is relatively new. It constitutes a real-time approach to sentiment polling, as opposed to traditional point-in-time methods.

We use statistical and natural language processing techniques to elicit emotive sentiment from a posted message; we implement five different algorithms, some language dependent, others not, using varied parsing and statistical approaches. The methodology used here has antecedents in the text classification literature (see Koller and Sahami 1997, Chakrabarti et al. 1998). These papers classify textual content into natural hierarchies, a popular approach employed by Web search engines.

Extracting the emotive content of text, rather than factual content, is a complex problem. Not all messages are unambiguously bullish or bearish. Some require context, which a human reader is more likely to have, making it even harder for a computer algorithm with limited background knowledge. For example, consider the following from Amazon's board (Msg 195016):

You're missing this Sonny, the same way the cynics pronounced that "Gone with the Wind" would be a total bust.

Simple, somewhat ambiguous messages like this also often lead to incorrect classification even by human subjects. We analyze the performance of various algorithms in the presence of ambiguity, and explore approaches to minimizing its impact.

The technical contribution of this paper lies in the coupling of various classification algorithms into a system that compares favorably with standard Bayesian approaches, popularized by the phenomenal recent success of spam-filtering algorithms. We develop metrics to assess algorithm performance that are well suited to the finance focus of this work. There are unique contributions within the specific algorithms used as well as accuracy improvements overall, most noticeably in the reduction of false positives in sentiment classification. An approach for filtering ambiguity in known message types is also devised and shown to be useful in characterizing algorithm performance.

Recent evidence suggests a link between small investor behavior and stock market activity. Noticeably, day-trading volume has spurted. (2) Choi et al. (2002) analyze the impact of a Web-based trading channel on the trading activity in corporate 401(k) plans, and find that the "Web effect" is very large--trading frequency doubles, and portfolio turnover rises by over 50%, when investors are permitted to use the Web as an information and transaction channel. Wysocki (1998), using pure message counts, reports that variation in daily message posting volume is related to news and earnings announcements. Lavrenko et al. (2000) use computer algorithms to identify news stories that influence markets, and then trade successfully on this information. Bagnoli et al. (1999) examine the predictive validity of whisper forecasts, and find them to be superior to those of First Call (Wall Street) analysts. (3) Antweiler and Frank (2004) examine the bullishness of messages, and find that while Web talk does not predict stock movements, it is predictive of volatility. Tumarkin and Whitelaw (2001) also find similar results using self-reported sentiment (not message content) on the Raging Bull message board. Antweiler and Frank (2002) argue that message posting volume is a priced factor, and higher posting activity presages high volatility and poor returns. Tetlock (2005) and Tetlock et al. (2006) show that negative sentiment from these boards may be predictive of future downward moves in firm values. These results suggest the need for algorithms that can rapidly access and classify messages with a view to extracting sentiment--the goal of this paper. (4) The illustrative analyses presented in this paper confirm many of these prior empirical findings, and extend them as well.

[FIGURE 1 OMITTED]

Overall, this paper comprises two parts: (i) methodology and validation, in Section 2, which presents the algorithms used and their comparative performance, and (ii) the empirical relationship of market activity and sentiment, in Section 3. Section 4 contains discussion and conclusions.

2. Methodology

2.1. Overview

The first part of the paper covers the extraction of opinions from message board postings to build a sentiment index. Messages are classified by our algorithms into one of three types: bullish (optimistic), bearish (pessimistic), and neutral (comprising either spam or messages that are neither bullish nor bearish). We use five algorithms, each with different conceptual underpinnings, to classify each message. These comprise a blend of language features such as part-of-speech tagging, and more traditional statistical methods. (5) Before initiating classification, the algorithms are tuned on a training corpus, i.e., a small subset of preclassified messages. (6) The algorithms "learn" sentiment classification rules from the preclassified data set, and then apply these rules out-of-sample. A simple majority across the five algorithms is required before a message is finally classified, or else it is discarded. This voting approach results in a better signal-to-noise ratio for extracting sentiment.
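The voting rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the label strings, and the use of a plain count difference to aggregate classified messages into a net sentiment figure are assumptions made for illustration only.

```python
from collections import Counter
from typing import Iterable, Optional, Sequence


def majority_vote(votes: Sequence[str]) -> Optional[str]:
    """Return the label backed by a simple majority of classifiers.

    Returns None when no label reaches a majority, which corresponds
    to discarding the message.
    """
    if not votes:
        return None
    required = len(votes) // 2 + 1  # e.g. 3 of 5 classifiers
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= required else None


def net_sentiment(labels: Iterable[Optional[str]]) -> int:
    """Net of bullish and bearish messages (assumed aggregation rule).

    Neutral and discarded (None) messages contribute nothing.
    """
    labels = list(labels)
    bullish = sum(1 for label in labels if label == "bullish")
    bearish = sum(1 for label in labels if label == "bearish")
    return bullish - bearish


if __name__ == "__main__":
    # Hypothetical outputs of five classifiers for two messages.
    msg_a = ["bullish", "bullish", "bullish", "bearish", "neutral"]
    msg_b = ["bullish", "bullish", "bearish", "bearish", "neutral"]
    final = [majority_vote(msg_a), majority_vote(msg_b)]
    print(final, net_sentiment(final))
```

In this sketch the second message is discarded because no label collects three of the five votes, so only the first message contributes to the net figure.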

Figure 1 presents the flowchart for the methodology and online Appendix A (provided in the e-companion) (7) contains technical details. The sequence of tasks is as follows. We use a "Web-scraper" program to download messages from the Internet, which are fed to the five classification algorithms to categorize them as buy, sell, or null types. Three supplementary databases support the classification algorithms.

* First, an electronic English "dictionary," which provides base language data. This comes in handy when determining the nature of a word, i.e., noun, adjective, adverb, etc.

* Second, a "lexicon" which is a hand-picked collection of finance words (such as bull, bear, uptick, value, buy, pressure, etc.). These words form the variables for statistical inference undertaken by the algorithms. For example, when we count positive and negative words in a message, we will use only words that appear in the lexicon, where they have been pre-tagged for sign.

* Third, the "grammar" or the preclassified training corpus. It forms the base set of messages for further use in classification algorithms. These preclassified messages provide the in-sample statistical information for use on the out-of-sample messages.
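The lexicon-based word count mentioned in the second item can be sketched as follows. This is only an illustration under stated assumptions: the tiny lexicon, the signs attached to each word, the added words "sell" and "grim", and the tokenization rule are all hypothetical and are not taken from the paper's actual lexicon.

```python
import re
from typing import Dict, Tuple

# Hypothetical miniature lexicon: word -> sign (+1 positive, -1 negative).
# The real lexicon is hand-picked by the authors; these entries and
# their signs are illustrative assumptions.
LEXICON: Dict[str, int] = {
    "bull": 1,
    "buy": 1,
    "uptick": 1,
    "bear": -1,
    "sell": -1,
    "grim": -1,
}


def count_signed_words(message: str, lexicon: Dict[str, int] = LEXICON) -> Tuple[int, int]:
    """Count positive and negative lexicon words in a message.

    Words that do not appear in the lexicon are ignored, so only
    pre-tagged words act as variables.
    """
    tokens = re.findall(r"[a-z']+", message.lower())
    positive = sum(1 for token in tokens if lexicon.get(token, 0) > 0)
    negative = sum(1 for token in tokens if lexicon.get(token, 0) < 0)
    return positive, negative


if __name__ == "__main__":
    print(count_signed_words("The future looks grim here."))
```

Restricting the count to lexicon entries keeps the feature set small and sign-tagged, at the cost of ignoring any sentiment-bearing word that the lexicon does not list.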

These three databases (described in the online appendix) are used by...


