Whatever happened to “majority rule”?
A continual source of democratic frustration today is that public opinion does not seem to directly translate into public policy. For example, a large majority of Americans want to see campaign finance reform, background checks for gun ownership, and reductions in fossil fuel consumption. Yet, while overwhelming public support has fostered some powerful grassroots movements and even some local-level action for these causes, we have yet to see the nationwide public policy action we would expect to be achieved by such large majorities.
This conundrum is a puzzle with many jigsaw pieces. Here, I am not going to try and solve the entire puzzle, but rather propose a specific jigsaw piece that could help other pieces click into place. I call this particular jigsaw piece “issue crowding”: the idea that individual voters usually get represented well for most but not every issue, and the rate at which voters do not get represented well differs on an issue-by-issue basis, especially when there are multiple issues that a single representative might have to represent.
This idea that issues crowd out other issues in representative democracy should not be too surprising — it seems to me to be a cousin of other ideas in social choice theory like the “discursive dilemma” and the “Condorcet paradox”. Rather than trying to argue for this on a theoretical basis, I want to offer instead empirical evidence from machine learning methods in data analysis, suggesting that this is a phenomenon that is measurable and quantifiable.
I begin with two key premises:
- First, opportunities for political representation are limited: there are only so many people running for elected office and, outside of that, only so many advocacy and civil society opportunities available for people to influence the policymaking process beyond Election Day.
- Second, because of the nature of this political economy, representatives tend to adopt platforms that reflect “centers of gravity” in the electorate’s political preferences. Furthermore, these representatives will try to pull an individual voter’s political preferences closer to their own platform, creating a positive feedback loop.
In other words, whether formally represented (by an elected official) or informally represented (by civil society), when voters express their political preferences through a representative they get lumped in with other voters whose political preferences are largely-but-not-exactly like their own. This, in turn, can create a situation where a majority of public opinion on one issue is not enough to guarantee the majority’s preferred public policy outcome.
Consider the following animated graph. Each dot represents one voter, and it might help to imagine the horizontal axis as “economic issues” and the vertical axis as “social issues”.
The difference between individual voters and their clustered representation. Informed observers of politics might see a similarity between this relatively benign phenomenon and the more insidious practice of geographic gerrymandering, the key similarity being that concentrations of support lead to disadvantages in representation. Later on I use gerrymandered representation as a concrete illustration of what is at stake here.
Here, although the orange dots are the majority (seventeen to thirteen), when representation is stratified along the horizontal axis, representatives of majority-blue “blobs” are the majority (three to two). The blue dots win despite being the minority, and not necessarily because of anything nefarious: they are simply the lucky winners of a mathematical accident.
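To make the arithmetic concrete, here is a minimal sketch of how a seventeen-to-thirteen majority can end up with two of five representatives. The composition of each "blob" is hypothetical (the graph itself is not reproduced here); only the totals match the description above.

```python
from collections import Counter

# Hypothetical composition of five "blobs" (clusters of voters).
# These counts are illustrative only; they are chosen so that the totals
# match the 17 orange and 13 blue dots described in the text.
blobs = [
    {"orange": 6, "blue": 0},
    {"orange": 5, "blue": 1},
    {"orange": 2, "blue": 4},
    {"orange": 2, "blue": 4},
    {"orange": 2, "blue": 4},
]

# Popular vote: add up every dot regardless of which blob it sits in.
popular = Counter()
for blob in blobs:
    popular.update(blob)

# Representation: each blob sends one representative for its local majority.
seats = Counter(max(blob, key=blob.get) for blob in blobs)

print(dict(popular))  # {'orange': 17, 'blue': 13}
print(dict(seats))    # {'orange': 2, 'blue': 3}
```

Orange support is piled up in two blobs where it wins by far more than it needs, while blue support is spread just thickly enough to carry the other three.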
In what follows, I propose that this mathematical accident, which I am calling “Issue Crowding”, can be measured in a meaningful way thanks to the machine learning technique of cluster analysis. I first give a brief overview of the technique that allows me to calculate what I call “Cluster Advantage”, describe how to calculate and interpret Cluster Advantage scores, and then take a shot at a descriptive analysis of my results that suggests how this measurement may be of value to policy entrepreneurs and political scientists alike.
Cluster analysis — the process of grouping together many objects on the basis of similarity across many different variables — is a computationally intensive process that generally falls under the umbrellas of “Artificial Intelligence” (AI) and “Machine Learning” (ML).
It is worth pausing here for a moment and recognizing that, for the most part, AI and ML are not necessarily a good thing for democracy (see, for example, here, here, here, and here). I believe one reason for this is that AI and ML are often used for prediction, which takes decision-making power away from the people and puts it in the “black box” of a computer (and, of course, it can be very difficult for humans to debate with a computer!). Here, I focus the power of cluster analysis on description: using a computer’s power to create a new map of reality that detects patterns which humans would otherwise not be able to observe, and perhaps even to point out associations that challenge the assumptions and stereotypes that have been ingrained through culture. This computer-generated description is meant to be an aid to good human decision making, not a substitute for it.
To this end, I use a subset of data from the Views of the Electorate Research Survey (VOTERS) that measures an American voter’s concern for a wide range of issues on a 1–4, or “very important” to “unimportant”, scale (note that this question was asked in both December 2016 and July 2017; when a respondent gave an answer in both years I used the more recent July 2017 response). The particular algorithm I ended up using was Ward’s method to generate an agglomerative hierarchical clustering. If that last sentence means anything to you and you would like to poke around more, feel free to explore this project’s GitHub repository that gets into the more technical aspects of the methodology.
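The rule for respondents who answered in both survey waves can be sketched as follows. The function name and the sample data are hypothetical, and missing answers are assumed to be represented as `None`; the actual survey files may encode them differently.

```python
from typing import Optional

def latest_response(dec_2016: Optional[int], jul_2017: Optional[int]) -> Optional[int]:
    """Prefer the July 2017 answer; fall back to December 2016 if it is missing."""
    return jul_2017 if jul_2017 is not None else dec_2016

# Hypothetical respondents: (December 2016 answer, July 2017 answer),
# on the survey's 1 ("very important") to 4 ("unimportant") scale.
responses = [(2, 1), (3, None), (None, 4), (None, None)]
merged = [latest_response(a, b) for a, b in responses]
print(merged)  # [1, 3, 4, None]
```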
After experimenting across a variety of parameters, I found that the first divide of the data generates two clusters that are usually associated with “Left and Center” (roughly two-thirds of the sample) and “Right” (the remaining one-third), which is unsurprising given contemporary American polarization. But because I use a hierarchical clustering method, the computer can dig deeper and create many more clusters within these.
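For readers who want a feel for what Ward's method does, here is a deliberately naive pure-Python sketch. It is not the code used for this project: a real analysis would more likely rely on a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"`. The respondents and their ratings are hypothetical.

```python
from itertools import combinations

def centroid(points):
    """Mean position of a group of respondents, one coordinate per issue."""
    return [sum(col) / len(points) for col in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(points, k):
    """Start with one cluster per respondent and greedily merge the
    cheapest pair (by Ward's criterion) until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] += clusters[j]  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters

# Hypothetical respondents rating three issues from 1 ("very important")
# to 4 ("unimportant").
respondents = [(1, 1, 4), (1, 2, 4), (2, 1, 4), (4, 4, 1), (4, 3, 1), (3, 4, 2)]
two = ward_clusters(respondents, 2)
three = ward_clusters(respondents, 3)
print(two)
print(three)
```

Asking for three clusters instead of two splits one of the two top-level clusters rather than reshuffling everyone, which is the sense in which a hierarchical method can "dig deeper".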
Displayed below is a graphical interpretation of one of the many clusters generated by this process (when we tell the computer to generate eleven clusters total):
This graph is only one of many being considered simultaneously (that is, number five in a set of eleven). Don’t try to make too much sense of it, other than trying to understand what a single cluster represents! If you really want to go deep, I have a document that visualizes a whole set of clusters along with their family tree (click here).
Green bars (to the right) indicate that, for that particular cluster as a whole, that issue is considered more important than the average for all Americans. Red bars (to the left) indicate that the issue is considered less important than average.
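One detail is easy to trip over: on the survey's scale, 1 means “very important” and 4 means “unimportant”, so a cluster that cares more about an issue has a lower average rating. The sketch below assumes each bar is a simple difference of mean ratings, which may not match exactly how the graphs were produced; the data and the function name are hypothetical.

```python
from statistics import mean

# Hypothetical ratings: 1 = "very important", 4 = "unimportant".
issues = ["Crime", "Climate change"]
everyone = [(1, 4), (2, 3), (1, 4), (4, 1), (3, 1), (4, 2)]
cluster = everyone[:3]  # pretend the first three respondents form one cluster

def importance_gap(cluster_rows, all_rows, issue_index):
    """Positive when the cluster rates the issue as MORE important than average.

    Because 1 means "very important", a lower mean rating means more
    importance, so the gap is the overall mean minus the cluster mean.
    """
    overall = mean(row[issue_index] for row in all_rows)
    local = mean(row[issue_index] for row in cluster_rows)
    return overall - local

for i, name in enumerate(issues):
    gap = importance_gap(cluster, everyone, i)
    bar = "green (right)" if gap > 0 else "red (left)"
    print(f"{name}: {gap:+.2f} -> {bar}")
```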
A casual look at this cluster, representative of about 3% of the American population, suggests that it is a sub-cluster of the American political right. In many ways, it fulfills our common stereotypes (i.e., less concerned with gay rights and climate change, more concerned with the size of government and crime). But there is at least one notable exception to our expectations: Religious liberty.
If we look closer still, we see another important dynamic of clustering. Even for issues that trend “very important” in that cluster, e.g., Money in politics, there is still a minority of respondents who consider that issue as “unimportant”. Similarly, for issues that trend “unimportant” in that cluster, e.g., Family and medical leave, there is a minority of respondents who consider that issue “very important”. If we were to create more clusters, it is possible that these outliers would form a new cluster; similarly, this cluster likely represents many of the outliers in the overarching “Right” cluster who consider Religious liberty to be “unimportant”.
Once we have our clusters, we can now calculate the “Cluster Advantage” score.
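Cluster Advantage, quite simply, is the combined percentage size of clusters where a particular issue trends more important than the average (i.e., represented by a green bar to the right in the graphical representation above). We then subtract 50 to make it easier to interpret. Mathematically:

$$\text{Cluster Advantage}(i) = \sum_{c \,:\, i \text{ trends above average in } c} \text{size}_c \; - \; 50$$

where $\text{size}_c$ is the share of the population, in percent, that falls in cluster $c$.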
Imagine if we had four clusters, at sizes 10%, 20%, 30%, and 40% of the population, respectively. For a particular issue, let’s say Health care, the 10% and 30% clusters trend negative on the issue and the 20% and 40% clusters trend positive on the issue. The Cluster Advantage score would then be 10 (that is, 20 + 40 − 50).
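The same calculation as a short sketch; the function name and the input format are my own choices for illustration.

```python
def cluster_advantage(clusters):
    """clusters: list of (size_percent, trends_more_important) pairs."""
    favorable = sum(size for size, above_average in clusters if above_average)
    return favorable - 50

# The four-cluster Health care example from the text.
health_care = [(10, False), (20, True), (30, False), (40, True)]
print(cluster_advantage(health_care))  # 10
```

A score of +50 would mean the issue trends important in every cluster, and a score of −50 would mean it does so in none.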
How might we then interpret a Cluster Advantage score? A positive Cluster Advantage score suggests that support for an issue is well-distributed, while apathy or hostility toward it is highly concentrated (that is, the detractors are more likely to be idiosyncratic compared to their best option for democratic representation). A negative Cluster Advantage score suggests the opposite.
To imagine what sort of consequence this might have on the democratic process, consider the U.S. House of Representatives elections in 2016 and 2018. It is pretty well-established that gerrymandering has rigged the U.S. House mostly in favor of the Republican Party — that is, in our terms, Republicans have drawn the maps (to distribute their voters out strategically and concentrate the opposition into specific districts) in order to give themselves a positive representation advantage.
Note that some numbers might not add up as expected due to rounding and third-party popular votes.
We see that in 2016, the Republicans had a 1% lead in the popular vote but got an 11% advantage in terms of seats (an 11 to 1 ratio!). Meanwhile, in 2018, when the Democrats had a 9% lead in the popular vote, they only got an 8% advantage in terms of seats (an 8 to 9 ratio!).
The “bowtie plot” from Samuel Wang, “Three Tests for Practical Evaluation of Partisan Gerrymandering”, Stanford Law Review: Volume 68. June 2016. Page 1286. In the terms of our argument, this graph represents a Cluster Advantage of 0: a 50%-50% vote share results in 50%-50% representation; if there was a positive Cluster Advantage, then the intersection point would shift to the left. Notice how, after factoring out third party votes, the 2018 and 1964 U.S. House election results are within the reasonable “eu-proportional” zone, but the 2016 U.S. House election results are in the “dysproportional” zone.
Someone looking merely at the 2018 results might think that this seems fair and even slightly favorable to the Democrats: 53% of votes results in 54% of representation. What is missed, and what the comparison with 2016 makes clear, is that the Democrats had to trounce the Republicans by a 9% margin to get this one-to-one parity of votes-to-representation! In “first-past-the-post” district voting, such as the election of U.S. House representatives or, say, the Canadian federal election, advantages in the popular vote get something of a multiplier bonus in terms of representation. One clear example is the 1964 U.S. House election, where 57% of the popular vote resulted in winning 68% of seats for the Democrats!
This illustration, lifted from the black-and-white (or perhaps red-and-blue?) world of partisan politics, helps us better understand what is at stake with regard to our question, where multiple issues are being considered simultaneously. My wager is that the Cluster Advantage metric helps us estimate at what point this representation multiplier starts to kick in. If there were a Cluster Advantage of 0, then when a popular vote is split 50%-50%, representation should be similarly split 50%-50%. But if there is a Cluster Advantage of +10, we might expect a 50%-50% popular vote to result in something closer to a 55%-45% representation (that is, a 10% difference); or, vice versa, the representation multiplier might start to kick in for any popular vote percentage above 40% (10% lower than 50%), and therefore it might only take 45% of the popular vote to achieve a 1-to-1 ratio between popular vote percentage and representation percentage.
Disclaimer: it would require a mathematician much more talented than myself to better quantify and perhaps even prove this relationship between popular vote, Cluster Advantage, and percentage of representation. In lieu of such expertise, I offer these “rule-of-thumb” estimation guidelines for interpreting Cluster Advantage strictly from my own intuition. If you have the mathematical talent to help verify or prove otherwise, I welcome your feedback!
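For what it is worth, the rule of thumb can be written down as three lines of arithmetic. This is only a restatement of the intuition above, not a verified model, and the function and key names are hypothetical.

```python
def rule_of_thumb(cluster_advantage):
    """Restate the essay's unverified rule of thumb for a given score.

    All values are percentages of the popular vote or of representation.
    """
    return {
        # Expected share of representation when the popular vote is 50%-50%.
        "representation_at_even_vote": 50 + cluster_advantage / 2,
        # Popular vote share above which the multiplier might start to kick in.
        "multiplier_starts_above": 50 - cluster_advantage,
        # Popular vote share at which votes and representation might match 1-to-1.
        "vote_needed_for_parity": 50 - cluster_advantage / 2,
    }

print(rule_of_thumb(10))
# {'representation_at_even_vote': 55.0, 'multiplier_starts_above': 40, 'vote_needed_for_parity': 45.0}
```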
I used party-line geographic gerrymandering as a concrete comparison to the phenomenon of “issue crowding” that I am trying to describe using the “cluster advantage” metric. There are some important differences, however. First, whereas gerrymandering is a foul attempt to rig the system in a party’s favor, issue crowding is just a mathematical accident. Second, while gerrymandering can be measured essentially by looking at a map and doing some fancy geometry, issue crowding is a much more multidimensional and subjective phenomenon.
To overcome these and other statistical challenges, I had the computer generate upwards of 100 clustering schemes each with slightly different parameters. I then aggregated the 20 most informative sets of Cluster Advantage scores, getting the results below.
Issues highlighted in green are those where a positive Cluster Advantage score exceeds the margin of error, whereas issues highlighted in red are those where a negative Cluster Advantage (or “Cluster Disadvantage”) score exceeds the margin of error. Remember, these numbers do not signal overall public support (if so, the top issues in this list would be “The economy”, “Health care”, and “Jobs”) but rather these numbers give us a way of nuancing how public opinion polls might predict policy outcomes!
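The highlighting rule can be sketched as below. How the margin of error is computed is an assumption on my part for this sketch: I use one sample standard deviation across clustering schemes, which may not be the exact calculation behind the numbers reported in this essay. The per-scheme scores are hypothetical.

```python
from statistics import mean, stdev

def summarize(scores):
    """Return (average score, margin) across clustering schemes.

    ASSUMPTION: the margin is one sample standard deviation; this is a
    stand-in, not necessarily the margin of error used in the essay.
    """
    return mean(scores), stdev(scores)

def highlight(score, margin):
    """Green or red only when the score's magnitude exceeds its margin."""
    if abs(score) <= margin:
        return "none"
    return "green" if score > 0 else "red"

# Hypothetical Cluster Advantage scores for one issue across five schemes.
avg, margin = summarize([18.0, 25.0, 21.0, 27.0, 19.0])
print(highlight(avg, margin))  # green
```

Applied to figures quoted elsewhere in this essay, `highlight(-14.1, 8.4)` gives `"red"` for “Immigration”, while `highlight(2.5, 21.4)` gives `"none"` for “Abortion”.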
Statistics, of course, are not self-interpreting. Below, I give some of these numbers more context to better illustrate what the value of cluster analysis might be for policy entrepreneurs and political scientists alike. But do feel free to look over the numbers above to make sense of them for yourself — my speculation is far from the final word on what these numbers mean!
Topping the chart with a whopping +22.5 Cluster Advantage score, Racial equality is an issue that is particularly well-positioned in the electorate. It is possible that this is partially due to progressive discourses on “intersectionality”, aligning activism around racial concerns with other well-positioned issues (e.g., gender equality, gay rights, the environment, and poverty all have positive cluster advantage scores, albeit to varying degrees of significance). But it is easier to see the consequences of this cluster advantage on the political right, where many Republican representatives often find themselves playing whack-a-mole against the openly racist elements within their own party (e.g., removing Rep. Steve King from committee assignments after a white supremacy remark), and where blatantly racist platforms generally remain a losing electoral strategy despite current White House rhetoric (note that, given the double-punch of modern primaries and the Electoral College, the American Presidency is not necessarily a representative office for which this type of cluster analysis makes much sense).
Family and Medical Leave
It is not immediately clear to me why this set of issues was able to rack up a +15.9 Cluster Advantage score. There have been some signs of bipartisan consensus, including the advocacy of Ivanka Trump, which perhaps reinforces that this is not necessarily an issue that polarizes along typical party lines. However, especially because this has not been a “hot issue” dominating the news cycle in recent years, my best guess is that the favorable cluster advantage score is simply due to chance. This randomness does not, however, change the fact that this number is good news for family and medical leave advocates, who can claim that the wind is at their backs in their issue campaigning, arguing to their supporters that now is as good a time as any to make a concerted push to turn public opinion into public policy.
Environment and Climate Change
These closely related issues, unsurprisingly, track closely together with Cluster Advantage scores of +12.8 and +12.1 respectively. There are a couple of hypotheses that can be gleaned here:
- Opposition to environmental action is highly concentrated and idiosyncratic. I think of climate denial campaigns, which probably find receptive audiences for their falsehoods in “clumps” rather than across the population as a whole.
- A positive cluster advantage alone won’t predict public policy outcomes. When climate denial campaigns reach a saturation point in the population, they still have the option to influence public policy through institutional plays such as lobbying and offering other non-democratic incentives.
- But a positive cluster advantage does suggest an expansive window of support beyond typical partisan lines. The recent announcement of a bipartisan climate change caucus in the Senate suggests that at least some Republican senators feel that the patterns of public opinion are enough in their favor to step out on what is typically seen as a Democratic issue and survive the electoral consequences of doing so.
Size of Government
We now turn our focus from the issues with cluster advantages to the issues with cluster disadvantages. Although the cluster disadvantage for “Size of government” (−10.4) is smaller than the cluster disadvantage for “Infrastructure investment” (−11.6), I highlight only the former because, of the two, only its cluster disadvantage is greater than its margin of error.
While concern about “Size of government” already does not poll all that well among survey respondents (it is the lowest-ranked stereotypically conservative issue in the set), the cluster disadvantage suggests that those who hold this view are further concentrated unstrategically. This could help explain why small government advocates have been largely ineffective the last many years (it has been nearly two decades since the United States recorded a budget surplus), and it might even help explain why these same conservatives have lost control of the Republican Party they supposedly once steered.
At first glance, the fact that “Jobs” is an issue with a cluster disadvantage of −14.8 suggests that there is something generally idiosyncratic about respondents concerned about jobs. Maybe the sort of people who have idiosyncratic political beliefs are also difficult in the workplace and more likely to be out of a job, and therefore this group of outliers is as a whole more concerned about the impact of public policy on job creation? This interpretation, admittedly, is a bit of a stretch.
When I looked at the numbers more deeply, I saw that “Jobs” is a top-ranked issue in the data as a whole, behind only “Economy” and “Health care”. Notably, although only significant in the case of “Jobs”, all three of these top-ranked issues also had negative Cluster Advantage scores. What this might reflect is less a political reality and more of a mathematical one: when the computer forms clusters, it divides people by their specific passions, and this can chip away at the edge that the overall top issues would otherwise have if the entire dataset were considered simply as one giant cluster.
Money in politics
One of the premises behind the idea of issue crowding is that opportunities for democratic representation are limited and that this can be a frustrating experience for those voters who have more idiosyncratic policy preferences and therefore weaker representation. If so, the fact that “Money in politics” has the strongest cluster disadvantage might actually confirm this premise!
As the only issue that is directly about politics, the data on “Money in politics” might be picking up on voter frustration (large political contributions, of course, being one of the many things that can get blamed for a democracy that doesn’t feel representative). Insofar as voter frustration is linked to voter idiosyncrasy, and insofar as supporter idiosyncrasy is linked to increasing a cluster disadvantage for that particular issue, it seems that we have a reasonable explanation for why “Money in politics” has the highest disadvantage, and perhaps also a word of caution for political reformers of any sort (say, for example, national popular vote advocates) to take favorable public opinion polling with a grain of salt, as their causes might too suffer from a similar cluster disadvantage.
Immigration & Abortion
This might seem an odd pairing, not only because the underlying issues themselves are radically different, but also because one has a significant cluster disadvantage (“Immigration”, −14.1, ±8.4) and the other is clearly more evenly distributed (“Abortion”, +2.5, ±21.4). The reason I consider these two together is that they both point to a problem with the underlying survey data: are these numbers picking up on a respondent’s desire for rights, or restrictions? Is this answer supposed to signal pro-choice or pro-life? Pro-immigrant or anti-immigrant?
The resulting data was so ambiguous that, at least in the case of abortion, the computer would often create clusters that appeared to be based on the respondent’s view of what the question was asking rather than their underlying politics. This led me to always exclude “Abortion” as a variable for the clustering algorithm to fit into the models it generated, and for a handful of clusters I also excluded “Immigration”. I present these numbers for the sake of transparency, but caution against making any sort of meaning out of them.
Gay rights is, interestingly, our most neutral issue in terms of a cluster advantage score (+1.5). This means support for gay rights is pretty evenly distributed throughout a range of potential clusters. I want to hypothesize that one reason for this relatively even distribution is that one of the catalyzing factors for supporting “Gay rights” is having a person who identifies as LGBTQ in one’s family or close friend group. Crucially, LGBTQ people enter into a family system at an age before their sexual orientation can be expressed, or may enter into a friend group while still “in the closet”. Therefore, individuals who otherwise would be prone to dismiss LGBTQ rights may now encounter a trojan horse of open-mindedness in the form of a loved one. Since this experience is largely random and could happen to anyone at a likelihood that is almost entirely independent of their other political preferences, support for Gay rights tends to be evenly distributed and not disproportionately concentrated within any pattern of political preferences.
To finish, I want to consider the issue of religious liberty: not because of the mild Cluster Advantage score (+11.1) but because of the large margin of error (±27.0). In other words, nearly half of the clustering models that I used for the aggregation of these scores had religious liberty scored either in the negative twenties or the positive thirties!
This high amount of variation seems to reflect another phenomenon in our clustering: that is, after the typical “right”/“center and left” divide that characterized the first two clusters that were generated, it seemed to me that the first subcluster division was often characterized by a subcluster where religious liberty had a positive Cluster Advantage score, and another subcluster where religious liberty had a negative Cluster Advantage score. Unsurprisingly, when I dug deeper into the data, these subclusters were dominated by religious respondents (e.g., “Protestant”) and irreligious respondents (e.g., “Agnostic”) respectively.
In a time when American politics is polarized, it might be interesting to consider if there is an opportunity here for building a consensus across the poles, in order to move the needle on particular issues. Where might religious conservatives (on the right) and social gospelers (on the center and left) work together to achieve policies of love, joy, and peace? And where, despite a slight cluster disadvantage, might irreligious libertarians (on the right) and secularists (on the center and left) combine forces to correct some of the imbalances of power in our society that have sprung from the wake of corrupted religious institutions?
Neither the data nor the computer has the answers to these sorts of questions. All this technology provides is another layer to an already complicated map. Insofar as this “Cluster Advantage” score is rooted in political realities, this particular layer of the map can tell policy entrepreneurs whether, for their particular cause, the road ahead is uphill or downhill.
This essay was an effort to suggest that there is a democratically significant phenomenon, “issue crowding”, where certain issues get disproportional representation because a significant number of the voters who consider that issue to be very important (or unimportant) do so despite what would be expected given their other political preferences and therefore contrary to the limited opportunities for political representation that are actually available to them.
I believe I have made a strong enough case that there is an opportunity for Machine Learning techniques to be applied to this question about patterns of representation, and that doing so has value for political scientists and policy entrepreneurs alike. I do want to be more modest about the theoretical or practical implications that might be drawn from this essay, though. Before drawing any, I would want data that is more specifically designed for this question rather than improvised from another dataset, the ability to run more iterations of the clustering model (which takes either time or money to afford the computations required), and some peer review from people in the political science field (although, since my formal training is in public policy and not political science, I am not sure the term “peer review” even applies). That said, my hope is that this essay is enough to spark interest in this phenomenon and generate exactly these conversations about what it would take to get the data, funding, and expert review needed to better understand the value of being able to theorize “issue crowding” and quantify any “cluster advantage”.
This analysis grew out of a project that was completed as part of a Data Science Fellowship with the Flatiron School’s Washington DC Campus. Special thanks to the numerous instructors, coaches, curriculum designers and cohort-mates who guided me through the learning and development process.