Currently, machine learning (ML) is not widely applied in behavioral economics or social science, largely because researchers are unfamiliar with this approach. Over time, however, the generation of scientists growing up with machine learning will move into social science and make an impact comparable to the one these techniques have made in computer vision.
The interaction between machine learning and behavioral economics can be mutually beneficial. On the one hand, ML can mine broad datasets to find the behavioral variables that contribute to the emergence of different behaviors. On the other hand, ML algorithms designed to identify biases and flawed assumptions would reach higher performance. This article aims to give a sense of the first application: identifying the variables that are important in shaping people's behavior. More specifically, it offers a simple explanation of how ML techniques such as Random Forest and Gradient Boosting Machine can help in research in fields such as behavioral economics and social psychology.
Behavioral economics is a field of study that examines the effects of psychological, cognitive, emotional, cultural, and social factors on human decisions, and how these decisions deviate from those implied by rational thinking. In other words, humans are not treated as fully rational agents; psychological variables and context are emphasized as essential determinants of human decisions. This approach helps to predict human behavior better and, in situations where there are directional deviations from rational thinking, allows for designing nudge policies to improve people's decisions.
As Daniel Kahneman states in his excellent book Thinking, Fast and Slow: "We (humans) are pattern seekers." To understand these patterns, behavioral economists consider the psychological characteristics of individuals and the context when analyzing people's behavior. However, this individual- and context-based approach is difficult to apply on a massive scale. Machine learning can largely resolve this challenge by detecting patterns and searching large datasets for the variables that shape those patterns.
Machine learning is largely about pattern recognition: it automatically detects patterns in data and then uses those patterns to predict future actions. However, most of the attention on ML has gone to its predictive power, and less to its interpretive power. For example, although it is very beneficial that ML allows cancer to be diagnosed more accurately and earlier, the more essential thing ML can provide is insight into which variables carry the most weight in increasing cancer risk. That knowledge helps researchers and policymakers control those variables and reduce cancer risk. Put differently, ML gives us not only the power to predict a target variable but also the knowledge of which input variables matter most in predicting it.
To get a sense of how this application of ML to behavioral economics works, consider decision tree models, a family of machine learning algorithms that go from observations about an item to conclusions about the item's target value. When the target variable takes a discrete set of values, the model is called a classification tree: branches represent conjunctions of input variables leading to leaves, which represent the class labels of the target variable. When the target variable takes continuous values, the model is a regression tree. Two well-known models that use an ensemble of decision trees to reach high predictive accuracy are Random Forest and Gradient Boosting Machine. The goal of these tree-based algorithms is to build a model that predicts the value of a target variable from several input variables. A vital aspect of these models is that there is no limit on the number of input variables; in other words, they are less subject to the curse of dimensionality.
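As a minimal sketch of the models just described, the snippet below fits a single classification tree, a Random Forest, and a Gradient Boosting Machine with scikit-learn. The data is entirely synthetic and hypothetical: the "behavior" is driven by two of ten made-up input variables, standing in for the individual and context variables a real study would collect.

```python
# Minimal sketch: tree-based classifiers on synthetic "behavioral" data.
# All data and variable roles here are hypothetical illustrations.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

rng = np.random.default_rng(0)
# 500 hypothetical individuals, 10 input variables (traits, context, ...)
X = rng.normal(size=(500, 10))
# Hypothetical binary behavior driven mainly by variables 0 and 3
y = ((X[:, 0] + 0.5 * X[:, 3]) > 0).astype(int)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)          # single tree
forest = RandomForestClassifier(n_estimators=100,
                                random_state=0).fit(X, y)     # bagged ensemble
gbm = GradientBoostingClassifier(random_state=0).fit(X, y)    # boosted ensemble

print(tree.predict(X[:5]), forest.predict(X[:5]), gbm.predict(X[:5]))
```

The same three classes have regression counterparts (`DecisionTreeRegressor`, `RandomForestRegressor`, `GradientBoostingRegressor`) for continuous target variables.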
The general steps for building a model and interpreting it with these techniques are as follows. First, the data is divided into two sets: a training set, typically the larger portion, on which the model is built, and a test set, on which the model is validated. After selecting the most effective model based on the training and test data, we arrive at a model that can predict the observed behavior in the data. This model then lets us identify the variables that contribute most to the prediction of the target variable, a process referred to in data analysis as feature importance. For example, if we have hundreds of input variables across a massive number of observations, including individual- and context-related variables, this methodology can reveal which variables are critical in leading people to behave differently.
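The steps above can be sketched as follows, again on hypothetical synthetic data in which the behavior is actually driven by variables 2 and 5; the point is that the fitted model recovers this from the data alone via its feature importances.

```python
# Sketch of the workflow: train/test split, model fit, feature importance.
# Data and the "true" driving variables (2 and 5) are hypothetical.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 8))
y = ((2 * X[:, 2] - X[:, 5]) > 0).astype(int)

# Step 1: split into a (larger) training set and a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1)

# Step 2: build the model on the training data
model = RandomForestClassifier(n_estimators=200,
                               random_state=1).fit(X_train, y_train)

# Step 3: validate on the held-out test data
print("test accuracy:", model.score(X_test, y_test))

# Step 4: rank input variables by their contribution to the prediction
ranked = np.argsort(model.feature_importances_)[::-1]
print("most important variables:", ranked[:3])
```

Here the top-ranked variables should correspond to the ones that actually generated the behavior, which is exactly the kind of insight a behavioral researcher would be after.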
Figure 1: Decision tree path (source: http://blog.datadive.net/interpreting-random-forests/)
These models also allow local interpretation: we can identify the input variables that matter most in each individual prediction. For example, two people may show the same behavior for entirely different reasons. Looking at each individual prediction for a specific behavior, and grouping the top variables that explain it, therefore helps us understand which features most strongly drive that behavior. This is possible because, in decision-tree-based models, each observation follows a specific path from the branches to a leaf representing its target value (figure 1). These powerful abilities of machine learning thus not only resolve the challenge of data scale but also let us match the right policy to the right person in a population of thousands or millions of people. In other words, machine learning makes it possible to target the right nudges to the right people, in the right context.
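The per-observation path in figure 1 can be recovered directly from a fitted tree. The sketch below, on hypothetical synthetic data, asks which variables were tested along one individual's path through a single tree; in practice, packages such as treeinterpreter or SHAP automate this kind of local decomposition for whole ensembles.

```python
# Sketch of local interpretation: for one individual, recover the specific
# sequence of splitting variables that produced the prediction.
# Data is synthetic and hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
y = (X[:, 1] - X[:, 4] > 0).astype(int)

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

sample = X[:1]                                 # one individual
node_ids = clf.decision_path(sample).indices   # nodes visited by this sample
features_used = [
    clf.tree_.feature[n] for n in node_ids
    if clf.tree_.feature[n] >= 0               # leaves are marked with -2
]
print("variables used on this individual's path:", features_used)
```

Grouping these per-individual variable lists across everyone who shows a given behavior is one simple way to see whether the same behavior arises from different underlying drivers.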