Filtered data for a belief can rationally push you away from that belief
The backfire effect is a recently-discovered bias where arguments contrary to a person’s belief lead them to believe even more strongly in that belief; this is taken as obviously “irrational”. The “rational” update can be statistically modeled as a shift in the estimated mean of a normal distribution where each randomly distributed datapoint is an argument: new datapoints below the mean shift the inferred mean downward, and datapoints above it shift it upward. When this model is changed to include the “censoring” of datapoints, then the valid inference changes and a datapoint below the mean can lead to a shift of the mean upwards. This suggests that providing a person with anything less than the best data contrary to, or decisive refutations of, one of their beliefs may result in them becoming even more certain of that belief. If it is enjoyable or profitable to argue with a person while one does less than one’s best, it is bad to hold false beliefs, and this badness is not shared between both parties, then arguing online may constitute a negative externality: an activity whose benefits are gained by one party but whose full costs are not paid by the same party. In many moral systems, negative externalities are considered selfish and immoral; hence, lazy or half-hearted arguing may be immoral because it internalizes any benefits while possibly leaving the other person epistemically worse off.
“Don’t answer the foolish arguments of fools, or you will become as foolish as they are.”
Book of Proverbs 26:4
“Be sure to answer the foolish arguments of fools, or they will become wise in their own estimation.”
Book of Proverbs 26:5
A famous psychology paper: “When Corrections Fail: The Persistence of Political Misperceptions”, Nyhan & Reifler 2010 (floating around since 2006)
We conducted four experiments in which subjects read mock news articles that included either a misleading claim from a politician, or a misleading claim and a correction. Results indicate that corrections frequently fail to reduce misperceptions among the targeted ideological group. We also document several instances of a “backfire effect” in which corrections actually increase misperceptions among the group in question.
Questions about Nyhan & Reifler 2010: marginal-looking effect; convenience sample; small sample size (eg. n = 130 split over at least 4 groups in study 1; how many people, exactly, supposedly ‘backfired’? could this be explained by simple test-retest-consistency/reliability issues on multiple questionnaires?); has anyone replicated this? People invoke it everywhere, but just one study is not much given psychology’s horrible methodological quality and the instability of effects…
Failed replication: “The limitations of the backfire effect”, Haglin 2017
Nyhan and Reifler (2010, 2015) document a “backfire effect”, wherein attempts to correct factual misperceptions increase the prevalence of false beliefs. These results are widely cited both in and outside of political science. In this research note, I report the results of a replication of Nyhan and Reifler’s (2015) flu vaccine study that was embedded in a larger study about flu vaccines. The backfire effect was not replicated in my experiment. The main replication result suggests the need for additional studies to verify the backfire effect and identify conditions under which it occurs.
…For example, Weeks and Garrett (2014) do not find evidence for the backfire effect in a study about correcting rumors in the 2008 presidential campaign. Similarly, Ecker et al.’s (2014) study of racial attitudes finds those attitudes do not change the effectiveness of discounting information. Looking at similar attitudes, Garrett et al. (2013) find no evidence of these backfire effects in a study about a proposed Islamic cultural center in New York City. By contrast, Nyhan and Reifler (2010, 2015) find evidence for a backfire effect in a vaccines context as well as in the case of being correctly informed about the presence of weapons of mass destruction in Iraq.
Big failed replication: “The Elusive Backfire Effect: Mass Attitudes’ Steadfast Factual Adherence”, Wood & Porter 2016
Can citizens heed factual information, even when such information challenges their partisan and ideological attachments? The “backfire effect,” described by Nyhan and Reifler (2010), says no: rather than simply ignoring factual information, presenting respondents with facts can compound their ignorance. In their study, conservatives presented with factual information about the absence of Weapons of Mass Destruction in Iraq became more convinced that such weapons had been found. The present paper presents results from five experiments in which we enrolled more than 10,100 subjects and tested 52 issues of potential backfire. Across all experiments, we found no corrections capable of triggering backfire, despite testing precisely the kinds of polarized issues where backfire should be expected. Evidence of factual backfire is far more tenuous than prior research suggests. By and large, citizens heed factual information, even when such information challenges their ideological commitments.
Although some scholars view such bias as irrational behavior, it is perfectly rational if the goal is not to get at the “truth” of a given issue in order to be a better voter, but to enjoy the psychic benefits of being a political “fan.”…See, e.g., Taber and Lodge, “Motivated Skepticism”; Shenkman, Just How Stupid Are We?, ch. 3.
In some cases, unwelcome information can backfire and strengthen their previous beliefs (eg. Redlawsk 2002, Nyhan and Reifler 2010). - Redlawsk, David. “Implications of Motivated Reasoning for Voter Information Processing.” International Society of Political Psychology, 2001
However, Berinsky (N.d.) conducted several studies of myths about President Obama’s health care plan and found that corrections were generally effective in reducing misperceptions among Republicans (the partisans most likely to hold those beliefs), particularly when those corrections were attributed to a Republican source.
- Berinsky, Adam. N.d. “Rumors, Truth, and Reality: A Study of Political Misinformation.” Unpublished manuscript.
Only a few studies have directly tested the effects of providing citizens with correct information about political issues or topics about which they may be misinformed (see Nyhan and Reifler 2012 for a review). Most of these focus on the effects of false information on policy attitudes. Kuklinski et al. (2000; study 1), Gilens (2001), Berinsky (2007), Sides and Citrin (2007), Howell and West (2009), Mettler and Guardino (2011) and Sides (N.d.) all provided experimental participants with correct factual information about an issue and then asked about their policy preferences on the issue. The results have been mixed: Gilens, Howell and West, Mettler and Guardino, and Sides found that correct factual information changed participants’ policy preferences, but Kuklinski et al., Sides and Citrin, and Berinsky did not.
- Kuklinski, James H., Paul J. Quirk, Jennifer Jerit, David Schweider, and Robert F. Rich. 2000. “Misinformation and the Currency of Democratic Citizenship.” The Journal of Politics, 62(3):790-816
- Gilens, Martin. 2001. “Political Ignorance and Collective Policy Preferences.” American Political Science Review, 95(2): 379-396
- Sides, John and Jack Citrin. 2007. “How Large the Huddled Masses? The Causes and Consequences of Public Misperceptions about Immigrant Populations.” Paper presented at the 2007 annual meeting of the Midwest Political Science Association, Chicago, IL
- Howell, William G. and Martin R. West. 2009. “Educating the Public.” Education Next 9(3): 41-47
- Mettler, Suzanne and Matt Guardino, “From Nudge to Reveal,” in Suzanne Mettler, 2011, The Submerged State: How Invisible Government Policies Undermine American Democracy. University of Chicago Press
- Sides, John. N.d. “Stories, Science, and Public Opinion about the Estate Tax.” Unpublished manuscript
Wogalter, Marwitz, and Leonard (1992) presented another argument against selecting fillers on the basis of their resemblance to the suspect: The “backfire effect” refers to the idea that, somewhat ironically, the suspect might stand out if he or she was the basis for selecting the fillers in the lineup, because the suspect represents the central tendency or origin of the lineup. Clark and Tunnicliff (2001) reported evidence for the backfire effect. However, eyewitnesses’ descriptions of the target are often sparse and sometimes do not even match the characteristics of the suspect (Lindsay, Martin, & Webber, 1994; Meissner, Sporer, & Schooler, in press; Sporer, 1996, in press).
- Wogalter, M.S., Marwitz, D.B., & Leonard, D.C. (1992). “Suggestiveness in photospread lineups: Similarity induces distinctiveness”. Applied Cognitive Psychology, 6, 443-453.
see also “Bayesian Belief Polarization”, Jern et al 2009
Empirical studies have documented cases of belief polarization, where two people with opposing prior beliefs both strengthen their beliefs after observing the same evidence. Belief polarization is frequently offered as evidence of human irrationality, but we demonstrate that this phenomenon is consistent with a fully Bayesian approach to belief revision. Simulation results indicate that belief polarization is not only possible but relatively common within the set of Bayesian models that we consider.
Counter-argument: people overweight even when knowing selection is random; Shulman “Don’t Revere The Bearer of Good News”:
One of the classic demonstrations of the Fundamental Attribution Error is the ‘quiz study’ of Ross, Amabile, and Steinmetz (1977). In the study, subjects were randomly assigned to either ask or answer questions in quiz show style, and were observed by other subjects who were asked to rate them for competence/knowledge. Even knowing that the assignments were random did not prevent the raters from rating the questioners higher than the answerers. Of course, when we rate individuals highly the affect heuristic comes into play, and if we’re not careful that can lead to a super-happy death spiral of reverence. Students can revere teachers or science popularizers (even devotion to Richard Dawkins can get a bit extreme at his busy web forum) simply because the former only interact with the latter in domains where the students know less. This is certainly a problem with blogging, where the blogger chooses to post in domains of expertise.
Arguing is a negative externality, per the backfire effect: https://www.motherjones.com/kevin-drum/2008/09/backfire-effect http://www.nd.edu/~ghaeffel/Lilienfeld2009%20Perspectives%20on%20Psychological%20Science.pdf http://www.desmogblog.com/want-sway-climate-change-skeptics-ask-about-their-personal-strengths-and-show-pictures https://www.nytimes.com/2012/09/18/opinion/balanced-news-reports-may-only-inflame.html
Remember simple model: arguer selects best argument he has; experimental confirmation? https://www.overcomingbias.com/2012/10/we-add-near-average-far.html https://www.lesswrong.com/posts/YgNLfytckSyKTnDXN/fallacies-as-weak-bayesian-evidence https://en.wikipedia.org/wiki/Censoring_%28statistics%29
“damning with faint praise”
It may seem strange and prima facie irrational that receiving information in support of a belief may actually increase your belief in its opposite, but…
Let us play a little game I call the Debate Game. You are trying to guess a real number P between 0 and 1; your opponent periodically tells you a real which is drawn from a normal distribution around P.
Let us play a different game called the Random Debate Game, where the real is drawn randomly from said normal distribution; you hold no opinion on what P is, and in the first round are told 0.75. What do you conclude?
Now let us play a third game, called the Worst Debate Game; now the real is not drawn randomly from said normal, but is at least one standard deviation away from P; you are told 0.55. What do you conclude about P?
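The two games can be compared numerically. The following is only a minimal sketch under assumptions that are mine, not part of the games as stated: a flat prior on P over a grid on [0, 1], a standard deviation of 0.1 (`SIGMA`), and one particular reading of the Worst Debate Game in which the reported number always lies at least one standard deviation below P. All function names are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def random_likelihood(x, p, sigma):
    # Random Debate Game: x is an ordinary draw from Normal(p, sigma).
    return normal_pdf(x, p, sigma)

def worst_likelihood(x, p, sigma):
    # Worst Debate Game (assumed reading): only values at least one
    # standard deviation below p can ever be reported.
    if x > p - sigma:
        return 0.0
    # The truncation constant is the same for every p, so it cancels
    # when the posterior is normalized and is omitted here.
    return normal_pdf(x, p, sigma)

def posterior_mean(x, sigma, likelihood, steps=2001):
    """Posterior mean of P on a grid over [0, 1] with a flat prior."""
    grid = [i / (steps - 1) for i in range(steps)]
    weights = [likelihood(x, p, sigma) for p in grid]
    total = sum(weights)
    return sum(p * w for p, w in zip(grid, weights)) / total

SIGMA = 0.1  # assumed; the games above do not fix a value
random_mean = posterior_mean(0.75, SIGMA, random_likelihood)
worst_mean = posterior_mean(0.55, SIGMA, worst_likelihood)
print(f"Random Debate Game, told 0.75: E[P] = {random_mean:.3f}")
print(f"Worst Debate Game,  told 0.55: E[P] = {worst_mean:.3f}")
```

Under these assumptions the random game leaves the estimate of P close to the number reported, while in the censored game being told 0.55 rules out every P below 0.65, so the estimate lands above the number reported rather than at it.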
nshepperd> if you’re talking to a proud supporter of position ¬X, and they only give you one bad argument against X, that’s evidence for X
gwern> nshepperd: note the crucial assumption there: you’re assuming something about the distribution of the ~X arguments
gwern> after all, what if they’re biased to giving you only bad arguments? are you really going to conclude that them providing a bad argument is evidence for X?
nshepperd> yeah, the assumption is that proud supporters will know all the good arguments
gwern> nshepperd: anyway, so that’s the core argument: the naive belief that the backfire effect is ‘perverse’ is assuming the arguments are selected randomly so any argument of non-zero value ought to move them that direction, whereas the reality is you expect them to select their best argument, in which case you get an upper bound which can easily move them in the opposite direction
nshepperd> I think this was mentioned in one of the sequence posts on filtered evidence
gwern> nshepperd: the math part is treating beliefs as reals 0-1, assuming a normal distribution of sampling arguments around the underlying belief, and showing that these different distributions yield the actual effects we claim (random selection = any argument worth >0.5 increases confidence and vice-versa, biased selection toward max argument can decrease confidence even if >0.5)
nialo> gwern: does that still preserve expected evidence = 0? that is, sum(p(argument) * confidenceChange(argument)) still zero?
nialo> (I ask cause it seems like it might and I can’t prove it either way quickly)
nialo> (er, might not*)
gwern> nialo: I think it does, it’s just shifting a huge amount of probability mass toward the upper ranges
gwern> nialo: my intuition is that, let’s say your estimate of their belief P is currently 0.5; if they produce a 0.5, you take this as an upper bound and conclude it’s <0.5, but if they produce a 0.9, then you will be shocked and revise massively upward
nshepperd> gwern: also if you expect them to list off a large number of arguments, then they only come up with two, then that is evidence they have less arguments on total
gwern> nshepperd: yeah. there’s a clear connection to the hope function too, although I’m not sure what it is
gwern> it’s really a kinda nifty model of political arguing
clone_of_saturn> someone should see what happens when someone states a weak argument first, then a much stronger one
gwern> if only I were E.T. Jaynes, I could probably write this all out in an hour
gwern> clone_of_saturn: an interesting question. going weak then strong sort of implies that they either aren’t good at judging arguments since then they would start with the strong, or it’s not actually strong in some way you aren’t judging right and that’s why they left it for later
nshepperd> or maybe they’re listing their arguments in alphabetical order
nialo> or in order by length
gwern> well, there’s always alternative models available…
Namegduf> People like to start with arguments they feel emotionally attached to
Namegduf> Snappy ones, “heavy” ones, etc
Namegduf> These are not always ones which are perceived as stronger
Namegduf> That’s my observation, anyway.
nialo> also gwern, this seems to intersect with Yvain’s thing about mainstream ideas have worse average arguments: https://web.archive.org/web/20131008192032/http://squid314.livejournal.com/333353.html (I’m not sure how precisely, but I think there’s a useful distribution for expected argument quality in there somewhere)
gwern> the best I can think of is that beliefs which are really wrong will have a relatively narrow spread past the underlying belief - if you’re a Klansman, it’s hard to make an argument even worse than the underlying falsity of Klanism
nshepperd> well, to do this properly, you need a distribution over the set of arguments the clever arguer is likely to possess
gwern> analogous to Gould’s complexity argument
nshepperd> given which hypothesis is actually true
gwern> you can’t be less complex than viruses - there’s a censorship of the data
gwern> so Klan arguments in practice will come from the right part of the Klan bell curve and be better than expected
gwern> if Klanism is at 0.2, I probably won’t see the 0-0.2 arguments, but the 0.2-0.4 arguments, if you follow me
nialo> that roughly tracks what I’d expect also, yes
gwern> but my brain is shutting down from all the statistics I’ve been learning today, so maybe that is not a very good point to extract from Yvain’s post into my backfire model
nshepperd> Well, suppose there’s a fixed supply of arguments for each side, whose number and strength depends only on whether Klanism is true
nshepperd> If Klanism is true there should be more and better arguments in favor of it
nshepperd> A particular person possesses some subset of this supply of arguments.
nshepperd> However, Klansmen, being non-mainstream, will have spent more time “mining” this argument supply for good arguments
nshepperd> This affects your distribution over arguments you expect the Klansman you’re talking with to have.
nshepperd> Which in turn affects the arguments they’ll actually tell you. So when you update on what they’ve said you have to sort of propagate that information back through each layer of the model to the factual truth of Klanism
nshepperd> (Klansman says only one good argument, meaning he probably doesn’t have many good arguments because he would select all his best, meaning there probably aren’t many good arguments because Klansmen spend a lot of time looking for arguments, meaning Klanism is probably false because otherwise there’d be more good arguments for it)
gwern> so how many good Klan arguments will finally be evidence for it?
nshepperd> gwern: approximately, more arguments than you’d expect there to be if Klanism were wrong
gwern> nshepperd: which would be?
nialo> insufficient data
gwern> nialo: my challenge wasn’t for a real number but pseudocode for estimating it
nshepperd> depends on your distributions of P(existing arguments | K)
gwern> nshepperd: that sounds like a boring model then
nshepperd> you could make a toy model with gaussians with different means, maybe
nialo> I think what nshepperd is saying is approximately equivalent to Klan arguments coming from the right half of the bell curve, as above
nshepperd> say if Klanism is true there exist a large number of arguments distributed in quality according to Norm(u_true, sigma), otherwise the arguments are distributed according to Norm(u_false, sigma) where u_true > u_false
nshepperd> then the Klansmen draw a bunch of arguments from whichever distribution is correct, and give you the top N they find
gwern> how does that differ from my model?
gwern> aside from collapsing multiple steps
nialo> I think you end up with a slightly different distribution at the end?
nshepperd> aside from being mathematically precise
gwern> (believing in conspiracy theories means always being hopeful - that at least there’s someone to blame)
nshepperd> toy example I just calculated: if P(argument quality | H) ~ Normal(mu = 1.05, sigma = 0.2) and P(argument quality | ¬H) ~ Normal(mu = 0.95, sigma = 0.2) and the clever arguer possesses ten arguments drawn from whichever distribution is correct, and tells you his four best, which are [1.1, 1.09, 1.08, 1.07]
nshepperd> then the likelihood ratio P(E|H)/P(E|¬H) for that evidence is 0.40, and it’s evidence against H
nshepperd> even though the arguments are on the high side of both distributions
nshepperd> but if you make the clever arguer work less hard to find arguments, and say he only has 6 arguments, of which he tells you the best four, then that same set becomes evidence for H (likelihood ratio 1.29)
clone_of_saturn> p(a|b) is pretty intuitive if you read the | as “given”
nshepperd> because then you no longer expect that he would have found much better arguments
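nshepperd’s toy example can be checked with a short calculation. The chat does not say how the likelihood ratios were computed, so this is a sketch under my own assumption: the evidence is treated as the top k order statistics of n independent draws, whose joint density is n!/(n−k)! × F(weakest shown)^(n−k) × the product of the densities of the shown arguments. The function names are hypothetical; the means, sigma, argument counts and argument values are the ones given in the chat.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    """Cumulative distribution of Normal(mu, sigma) at x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def top_k_likelihood(shown, n_held, mu, sigma):
    """Joint density of seeing `shown` as the best k of n_held i.i.d. draws."""
    k = len(shown)
    weakest_shown = min(shown)
    # Every withheld argument must be weaker than the weakest one shown.
    hidden = normal_cdf(weakest_shown, mu, sigma) ** (n_held - k)
    density = math.prod(normal_pdf(x, mu, sigma) for x in shown)
    orderings = math.factorial(n_held) / math.factorial(n_held - k)
    return orderings * hidden * density

def likelihood_ratio(shown, n_held, mu_h=1.05, mu_not_h=0.95, sigma=0.2):
    """P(E|H) / P(E|not-H) for the arguments the clever arguer reveals."""
    return (top_k_likelihood(shown, n_held, mu_h, sigma)
            / top_k_likelihood(shown, n_held, mu_not_h, sigma))

best_four = [1.1, 1.09, 1.08, 1.07]
lr_ten_held = likelihood_ratio(best_four, n_held=10)
lr_six_held = likelihood_ratio(best_four, n_held=6)
print(f"arguer holds 10 arguments: LR = {lr_ten_held:.2f}")
print(f"arguer holds 6 arguments:  LR = {lr_six_held:.2f}")
```

Under that assumption the calculation gives likelihood ratios of about 0.40 and 1.29, matching the figures quoted in the chat. The whole difference comes from the `hidden` term: the six withheld arguments all being weaker than 1.07 is much more surprising if H is true, and with only two withheld arguments that penalty no longer outweighs the strength of the four that were shown.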
Frederic Bastiat: “The worst thing that can happen to a good cause is not to be skillfully attacked, but to be ineptly defended.”
Relevant to bad updating on hurricanes?
“How Near-Miss Events Amplify or Attenuate Risky Decision Making”, Tinsley et al 2012
In the aftermath of many natural and man-made disasters, people often wonder why those affected were under-prepared, especially when the disaster was the result of known or regularly occurring hazards (eg. hurricanes). We study one contributing factor: prior near-miss experiences. Near misses are events that have some nontrivial expectation of ending in disaster but, by chance, do not. We demonstrate that when near misses are interpreted as disasters that did not occur, people illegitimately underestimate the danger of subsequent hazardous situations and make riskier decisions (eg. choosing not to engage in mitigation activities for the potential hazard). On the other hand, if near misses can be recognized and interpreted as disasters that almost happened, this will counter the basic “near-miss” effect and encourage more mitigation. We illustrate the robustness of this pattern across populations with varying levels of real expertise with hazards and different hazard contexts (household evacuation for a hurricane, Caribbean cruises during hurricane season, and deep-water oil drilling). We conclude with ideas to help people manage and communicate about risk.
In the lead-up to the storm, Governor Haley Barbour of Mississippi warned of “hurricane fatigue”-the possibility that his constituents would not evacuate because they had successfully weathered earlier storms; similarly, one former Federal Emergency Management Agency official said people in the agency unfortunately approached the Katrina response as it had other responses, though the aftermath of Katrina was clearly “unusual” (Glasser and Grunwald 2005). Such complacency is not exclusive to hurricanes. Citizens who survive natural disasters in one season often fail to take actions that would mitigate their risk in future seasons (eg. moving off a Midwestern flood plain or clearing brush to prevent wildfires in the West; see Lindell and Perry 2000)….Organizations experience near misses as well. For example, in the deep-sea oil drilling industry, dozens of Gulf of Mexico wells in the past two decades suffered minor blowouts during cementing; however, in each case chance factors (eg. favorable wind direction, no one welding near the leak at the time, etc.) helped prevent an explosion (Tinsley et al. 2011).
- Glasser SB, Grunwald M (2005) The steady buildup to a city’s chaos: Confusion reigned at every level of government. Washington Post (September 11) A01
- Lindell MK, Perry RW (2000) Household adjustment to earthquake hazard: A review of research. Environ. Behav. 32(4):590-630
- Tinsley CH, Dillon RL, Madsen PM (2011) How to avoid catastrophe. Harvard Bus. Rev. 89(4):90-97
We show that a particular type of personal experience, near misses, have an undue influence on how people evaluate risk and can lead to questionable choices when people face an impending hazard with which they have had prior near-miss experience. We show that this near-miss effect is robust because it seems to implicitly influence the thoughts people use as inputs to their decision making. This near-miss effect can be countered, but doing so needs to use the same kind of implicit mechanism.
Dillon and Tinsley (2008) found that near misses in completing a space project encouraged people to choose a riskier strategy when faced with a future hazard threat to the mission. Although highly contextualized and specific, their research showed that near misses are events that alter evaluations of risk, and thus a near-miss bias should generalize to many kinds of hazards and be relevant to a large array of natural and man-made hazard environments.
- Dillon RL, Tinsley CH (2008) How near-misses influence decision making under risk: A missed opportunity for learning. Management Sci. 54(8):1425-1440
For example, in their discussion of aviation near misses, March et al. (1991, p. 10) essentially argue that near collisions can produce two different types of salient associations. They describe:
Every time a pilot avoids a collision, the event provides evidence both for the threat [of a collision] and for its irrelevance. It is not clear whether the organization came [close] to a disaster or that the disaster was avoided.
- March JG, Sproull LS, Tamuz M (1991) Learning from samples of one or fewer. Organ. Sci. 2(1):1-13
See also Kahneman and Varey (1990) for arguments on the critical distinction between an event that did not occur and an event that did not but almost occurred.
- Kahneman D, Varey CA (1990) Propensities and counterfactuals: The loser that almost won. J. Personality Soc. Psych. 59(6):1101-1110
We predict that near misses change the negativity associated with a bad event rather than changing probability assessments. This is consistent with Windschitl and Chambers’ (2004) finding that people are more likely to change their feelings about a choice than their explicit beliefs about the probabilities. Furthermore, in the domain of near misses, Dillon and Tinsley (2008) showed that people changed their perceptions of risk without changing their probabilities.
- Windschitl PD, Chambers JR (2004) The dud-alternative effect in likelihood judgment. J. Experiment. Psych.: Learning, Memory, Cognition 30(1):198-215
Study 1 looked for evidence of the near-miss effect using a field survey of households in coastal counties of Louisiana and Texas who experienced Hurricane Lili. We examined how previous storm experience as well as prior near-miss experiences (in the form of unnecessary evacuations) influenced whether or not the individuals surveyed evacuated for Hurricane Lili. Studies 2-6 used the laboratory to discover how the near-miss phenomenon operates. Study 2 examined how encoding near misses as resilient or vulnerable led to different evacuation rates for a hypothetical hurricane and demonstrated that the addition of vulnerability information to the near-miss stimulus can counteract the complacency effect. Study 3 examined the components of people’s SEU [subjective expected utility] assessments. It probed people’s assessments of probabilities (P), outcome attractiveness (O), and their ultimate judgments of risk versus safety (R) to test our hypothesized mediation. Study 4 generalizes our basic finding by changing the context from a house to a cruise ship; in doing so we address a concern that participants may be updating their calculations of the risk after a resilient near miss. Additionally, in Study 4, we examine the role counterfactuals have in the risky decision. Study 5 offered evidence that near misses do in fact change the hazard category, and hence the knowledge associated with a hazard, by examining what participants thought about a hazardous situation. This study removed the need to make a decision, thereby (a) providing evidence for the first (implicit) step in our sequence of how near misses affect cognitive processes and (b) discounting a concern that people first chose what to do and then, when forced to answer questions, generate assessments of probabilities, outcomes, and risk to justify their choice (ie. reverse causality).
Study 6 corroborated the findings of Studies 2-5 with actual behavior by having participants’ decisions regarding a risky situation have financial consequences for their compensation.
On power laws: given the rareness of events out on the tail, but also their overwhelming magnitude, how do we perform inference on them? Does a Katrina increase our belief in super-storms? How much? How weak does a storm have to be before it ceases to increase our estimate of the power parameter and instead lowers it?
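One way to make the last question concrete, purely as a sketch: assume storm magnitudes follow a Pareto distribution with a known minimum `x_min`, and estimate the tail exponent alpha by maximum likelihood, alpha = n / Σ ln(x_i / x_min). A smaller alpha means a heavier tail, ie. more belief in super-storms. Under that assumption, a new storm of magnitude x makes the estimated tail heavier exactly when ln(x / x_min) > 1/alpha, so the break-even storm is x_min × e^(1/alpha). The Pareto assumption, the function names, and the magnitudes below are all made up for illustration.

```python
import math

def pareto_alpha_mle(magnitudes, x_min):
    """Maximum-likelihood tail exponent for Pareto data with known x_min."""
    return len(magnitudes) / sum(math.log(x / x_min) for x in magnitudes)

def break_even_magnitude(alpha_hat, x_min):
    """Magnitude of a new observation that leaves the estimate unchanged."""
    return x_min * math.exp(1.0 / alpha_hat)

# Hypothetical storm magnitudes in arbitrary units (illustrative only).
storms = [1.2, 1.5, 2.0, 3.5, 9.0]
x_min = 1.0

alpha_hat = pareto_alpha_mle(storms, x_min)
threshold = break_even_magnitude(alpha_hat, x_min)

# A storm above the threshold lowers alpha (heavier tail); one below raises it.
alpha_after_big = pareto_alpha_mle(storms + [2 * threshold], x_min)
alpha_after_small = pareto_alpha_mle(storms + [threshold / 2], x_min)
alpha_after_neutral = pareto_alpha_mle(storms + [threshold], x_min)

print(f"current alpha: {alpha_hat:.3f}, break-even storm: {threshold:.3f}")
print(f"after a bigger storm:  {alpha_after_big:.3f}")
print(f"after a smaller storm: {alpha_after_small:.3f}")
```

In this toy setup, any storm weaker than the break-even magnitude counts as evidence for a thinner tail even though it is still a storm, which is the same kind of reversal as in the censored-argument model.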
I guess my point here is that part of the reason I stayed in Mormonism so long was that the people arguing against Mormonism were using such ridiculously bad arguments. I tried to find the most rigorous reasoning and the strongest research that opposed LDS theology, but the best they could come up with was stuff like horses in the Book of Mormon. It’s so easy for a Latter-Day Saint to simply write the horse references off as either a slight mistranslation or a gap in current scientific knowledge that that kind of “evidence” wasn’t worth the time of day to me. And for every horse problem there was something like Hugh Nibley’s “Two Shots in the Dark” or Eugene England’s work on Lehi’s alleged travels across Saudi Arabia, apologetic works that made Mormon historical and theological claims look vaguely plausible. There were bright, thoughtful people on both sides of the Mormon apologetics divide, but the average IQ was definitely a couple of dozen points higher in the Mormon camp.
James Richardson, “Even More Aphorisms and Ten-Second Essays from Vectors 3.0”
19. The peril of arguing with you is forgetting to argue with myself. Don’t make me convince you: I don’t want to believe that much.