Kang Zhao, University of Iowa – Predicting Future Box Office Success

Show me the money.

Kang Zhao, assistant professor of Management Sciences at the University of Iowa, details how his research team came up with a system to determine whether a movie will be a hit…or a flop.

Kang Zhao is an assistant professor at the Tippie College of Business, The University of Iowa. His research focuses on business analytics and social computing, and has been featured in media from more than 20 countries. He received his PhD from Penn State University.

Predicting Future Box Office Success


The average investment in a movie released in North America is 65 million US dollars. However, among movies produced in the U.S., only 1 out of 3 had box office revenues higher than their production budgets. Although many researchers have attempted to predict box office revenues, few have predicted whether a movie will be profitable, or provided such predictions at the early stage of movie production, when investors need to make important decisions based on profitability.

Our research team developed such an early-stage profitability prediction system for movies. This automated system uses machine learning algorithms to learn from large-scale, openly available historical movie data, and predicts whether a new movie will make a profit. The system examines the Who, What, and When factors for each movie. The Who factors are about those involved in making a movie, such as actors and directors, their track records, and their previous collaborations; the What factors reflect what the movie is about, such as its genre, rating, and plot synopsis; and the When factors capture the timing of a movie’s planned release. We also matched these factors with each other, and derived additional factors such as whether the cast of a movie has enough experience with the movie’s genre, and whether certain types of movies are becoming more popular in the market.
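To make the idea of matched factors concrete, here is a minimal Python sketch of how two such derived factors might be computed from historical records. It is an illustration only, not our system's actual code: the Movie record, the function names cast_genre_experience and genre_share_change, and the toy data are all hypothetical, and the exact formulas are assumptions chosen for simplicity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Movie:
    # Hypothetical record; real data would carry many more fields.
    title: str
    genre: str            # a "What" factor
    cast: list[str]       # a "Who" factor
    release_year: int     # a "When" factor


def cast_genre_experience(new_movie: Movie, history: list[Movie]) -> float:
    """Fraction of the new movie's cast who appeared in an earlier
    movie of the same genre (Who matched with What)."""
    cast = set(new_movie.cast)
    if not cast:
        return 0.0
    experienced = set()
    for past in history:
        # Use only movies released before the new one, in the same genre.
        if past.release_year < new_movie.release_year and past.genre == new_movie.genre:
            experienced |= cast & set(past.cast)
    return len(experienced) / len(cast)


def genre_share(history: list[Movie], genre: str, year: int) -> float:
    """Share of that year's releases belonging to the given genre."""
    released = [m for m in history if m.release_year == year]
    if not released:
        return 0.0
    return sum(m.genre == genre for m in released) / len(released)


def genre_share_change(new_movie: Movie, history: list[Movie]) -> float:
    """Change in the genre's market share over the two years before
    release (What matched with When). Positive means growing."""
    y = new_movie.release_year
    return genre_share(history, new_movie.genre, y - 1) - genre_share(
        history, new_movie.genre, y - 2
    )


# Toy data, invented purely for illustration.
history = [
    Movie("A", "action", ["Ann", "Bob"], 2010),
    Movie("B", "comedy", ["Cid"], 2011),
    Movie("C", "action", ["Dee"], 2011),
]
new_movie = Movie("N", "action", ["Ann", "Cid", "Dee", "Eve"], 2012)

print(cast_genre_experience(new_movie, history))
print(genre_share_change(new_movie, history))
```

Values like these would sit alongside the raw Who, What, and When factors as inputs to a classifier that outputs a profitable or not-profitable prediction.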

Overall, based on two different definitions of profitability, our system can achieve accuracy rates close to 90%. We also found some interesting patterns. For example, having the right team of actors and directors emerges as the top predictor of profitability. By contrast, an R rating and plot synopses related to “war, mission, government” are top indicators of negative profits. These findings highlight the value of large-scale data, and the power of analyzing such data with machine learning techniques.