World Catnip Surveys

International population online surveys of cat owners about catnip and other cat stimulant use.
statistics, psychology, R, survey, Bayes, Google, cats
2015-11-15–2018-12-02; in progress; certainty: likely; importance: 4


In compiling a meta-analysis of reports of catnip response rates in cats, yielding a meta-analytic average of ~2⁄3, the available data suggest heterogeneity from cross-country differences in rates (possibly for genetic reasons) but are insufficient to definitively demonstrate the existence of, or estimate, those differences (particularly a possible extremely high catnip response rate in Japan). I use Google Surveys in August–September 2017 to conduct a brief 1-question online survey of a proportional population sample of 9 countries about cat ownership & catnip use, specifically: Canada, the USA, UK, Japan, Germany, Brazil, Spain, Australia, & Mexico. In total, I surveyed n = 31,471 people, of whom n = 9,087 are cat owners, of whom n = 4,402 report having used catnip on their cat, and of whom n = 2,996 report a catnip response.

The sur­vey yields cat­nip response rates of Canada (82%), USA (79%), UK (74%), Japan (71%), Ger­many (57%), Brazil (56%), Spain (54%), Aus­tralia (53%), and Mex­ico (52%). The differ­ences are sub­stan­tial and of high pos­te­rior prob­a­bil­i­ty, sup­port­ing the exis­tence of large cross-coun­try differ­ences. In addi­tional analy­sis, the other con­di­tional prob­a­bil­i­ties of cat own­er­ship and try­ing cat­nip with a cat appear to cor­re­late with cat­nip response rates; this inter­cor­re­la­tion sug­gests a “cat fac­tor” of some sort influ­enc­ing respons­es, although what causal rela­tion­ship there might be between pro­por­tion of cat own­ers and pro­por­tion of cat­nip-re­spon­der cats is unclear.

An addi­tional sur­vey of a con­ve­nience sam­ple of pri­mar­ily US Inter­net users about cat­nip is report­ed, although the improb­a­ble cat­nip response rates com­pared to the pop­u­la­tion sur­vey sug­gest the respon­dents are either highly unrep­re­sen­ta­tive or the ques­tions caused demand bias.

2017 Google Surveys: Probability Population Sample

Google Consumer Surveys/Google Surveys (GS) is an online survey service offered by Google to the general public; it is powered by media publishers, whose article paywalls are replaced by a short, typically single-question, survey (of a variety of types) which the reader must answer to see the desired content. Combining metadata with Google’s advertising profiles, Google attempts to weight selected readers for the “surveywall” such that they form an unbiased random sample of a specified population, such as the general population. Each GS response costs $0.10–$3 depending on the length of the survey & difficulty of recruiting responses (eg a country-wide all-ages all-gender probability sample survey with one question would cost $0.10 per response, so $10 would yield 100 answers).

As of August 2017, GS allows tar­get­ing:

  1. by age brack­et:

    • 18-24
    • 25-34
    • 35-44
    • 45-54
    • 55-64
    • 65+
  2. gen­der: women/men

  3. coun­try (and by sub­-coun­try, spe­cific regions depend­ing on coun­try):

    • United States (Eng­lish)
    • Canada (Eng­lish)
    • United King­dom (Eng­lish)
    • Ger­many (Ger­man)
    • Mex­ico (Span­ish)
    • Japan (Japan­ese)
    • Aus­tralia (Eng­lish)
    • Brazil (Por­tugue­se)
    • France (French)
    • Spain (Span­ish)
    • for Android smart­phone read­ers only, Italy (Ital­ian) and the Nether­lands (Dutch but not Eng­lish)
  4. by pre­de­fined groups, “audi­ence pan­els” (cost­ing more, >=$0.30); eg for the United States, avail­able audi­ence pan­els include

    • “Hispanics/Latinos”
    • “Online dat­ing users (Sites and apps)”
    • “Small/Medium Busi­ness Own­ers and Man­agers”
    • “Mobile social media users (Face­book, Twit­ter, Google+)”
    • “Stream­ing video sub­scrip­tion users (Net­flix, Ama­zon, Hulu Plus, Google Play)”
    • “Stu­dents”
  5. by user-de­fined “screen­ing ques­tion” which pro­vides a con­di­tional ques­tion, and one is charged only for the users who answer pos­i­tively & pro­ceed to the main sur­vey ques­tion

For cat­nip sur­vey­ing, GS has advan­tages and dis­ad­van­tages. The pri­mary advan­tages are that a well-pow­ered sin­gle-ques­tion sur­vey about cat­nip response could poten­tially be cheap, while deliv­er­ing an unbi­ased esti­mate from the gen­eral pop­u­la­tion unin­ter­ested in cats or cat­nip (par­tic­u­larly in Japan, given the Japan­ese anom­aly in the ini­tial meta-analy­sis), and inci­den­tally mea­sur­ing pro­por­tions that might also be of inter­est (such as inter­na­tional differ­ences in cat own­er­ship or famil­iar­ity with cat­nip). The dis­ad­van­tages are that the cost advan­tage may be illu­sory as most respon­dents sim­ply will not have a cat or will not have tried to use cat­nip (never mind any of the more obscure cat stim­u­lants like sil­vervine), and the sur­vey must be kept as sim­ple as pos­si­ble to keep respon­dents hon­est & costs down.

Given the screen­ing ques­tion fea­ture, a sin­gle-ques­tion cat­nip sur­vey could be defined either as a two-stage sur­vey with a screen­ing ques­tion ask­ing about cat own­er­ship and then results of cat­nip use if any, or as a sin­gle-ques­tion ask­ing about cat­nip with the set of responses no-cat/cat-but-never-catnip/cat-and-response/cat-and-no-response. If half of Amer­i­cans have a cat, and half of cat-own­ers have tried cat­nip, then one would need 4 responses for every use­ful respon­se, so it would cost $0.40/response. The screen­ing ques­tion might be more effi­cient.
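The arithmetic behind that estimate can be sketched in R. The helper name `cost_per_useful` is hypothetical, and the 50%/50% rates are the illustrative guesses from the paragraph above rather than measured values:

```r
# Expected cost of one *useful* response (a cat owner who has tried catnip)
# in a single-question survey where every response is billed.
# Hypothetical helper; the rates are illustrative assumptions, not survey data.
cost_per_useful <- function(price, p_owner, p_trier_given_owner) {
    p_useful <- p_owner * p_trier_given_owner  # fraction of informative respondents
    price / p_useful
}

# Half of respondents own a cat, half of those owners have tried catnip:
cost_per_useful(price = 0.10, p_owner = 0.5, p_trier_given_owner = 0.5)
# 0.4, ie. 4 billed responses per useful one
```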

Pilot surveys

I exper­i­mented with two screen­ing-ques­tion GS sur­veys, “Do you have a cat?” and “Have you ever given cat­nip to a cat?”; after its auto­matic pilot exper­i­ment, GS priced responses to both at $3.00 each, which is unac­cept­ably expen­sive. I then exper­i­mented with two sin­gle-ques­tion GS sur­veys, where the set of responses cov­ers the 2x2=4 pos­si­bil­i­ties of cat own­er­ship & cat­nip use, to see if they would be suffi­ciently cheap as to be more cost-effec­tive after omit­ting the unin­for­ma­tive cells (of not own­ing a cat, and of own­ing a cat but never try­ing cat­nip).

The first GS survey was the straightforward single-question version to try to estimate what fraction of respondents would not have a cat & not have tried catnip, to see what the cost might be; the second reversed the responses to ask about catnip immunity rather than response, to try to reduce any demand bias:

  1. 2017-07-28–2017-07-30, 100 responses x $0.10 = $10, USA gen­eral pop­u­la­tion (raw responses)

    1. Have you ever given cat­nip to your cat?

    [ran­dom­ized order]

    • No: I do not have a cat [n = 65 / 62%]
    • No: I have a cat but have not tried cat­nip [n = 12 / 16%]
    • Yes: but they did not respond to cat­nip [n = 6 / 5%]
    • Yes: they responded to cat­nip [n = 18 / 17%]

    24 use­ful respons­es, imply­ing a 25% immu­nity rate.

  2. 2017-07-29–2017-08-01, 200 responses x $0.10 = $20, USA gen­eral pop­u­la­tion (raw responses):

    1. Have you ever given cat­nip to your cat?

    [ran­dom­ized order]

    • No: I do not have a cat [n = 136 / 72%]
    • No: I have a cat but have not tried cat­nip [n = 19 / 9.9%]
    • Yes: and they were cat­nip-im­mune [n = 11 / 4.7%]
    • Yes: but they were not immune [n = 34 / 13.3%]

    45 use­ful respons­es, imply­ing a 24% immu­nity rate.

  3. com­bined, n = 300:

    • No cat: n = 201 / 67%

    • Cat but no tries: n = 31 / 10%

    • Cat and tried: n = 69 / 23% ; result:

      • none/immune: n = 17 / 24%
      • respon­der: n = 52 / 75%

Pooling, about 1 in 7 or 15% of the responses were useful, so the true cost of one response was ~$0.67; so the screening question survey would cost 4.5x more than the single-question version. (Google Surveys sometimes errs on the side of collecting too many responses, but does not appear to charge you for the excess, so the cost-effectiveness will be a little bit better than appears.) As for demand bias, the proportions are nearly identical between the normal & negated responses, providing some evidence that respondents are not mindlessly responding.

Combining the pilot survey samples and comparing with the earlier meta-analytic estimate of 63% response rate / 37% immunity observed in lab animals/research settings, the difference would be just statistically-significant when ignoring the meta-analytic uncertainty (binom.test(c(17, 52), p=0.37): p = 0.034, CI 0.15–0.365), so it may be the case that cat owners are more likely to say their cat responds to catnip. This could be seen as a good or bad thing: there might be demand bias or selective memory where respondents think more of catnip-responders, or it could be that cat owners are inherently better judges of their cats’ responses because they are familiar with the cats in question, the cats are comfortable with the owners providing the catnip, and owners likely try catnip on multiple occasions to observe if any response is ever elicited (whereas in a laboratory setting, all of these variables are reversed in a way that would tend to hide catnip responses: the investigator may not be familiar with or good at handling the cat, the cat may be scared of them or the lab setting, and most studies report only 1 test occasion despite responders not always responding & requiring multiple tests to be sure).
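The test quoted above can be re-run from the pooled pilot counts. This is a minimal sketch: the null value of 0.37 is the meta-analytic immunity rate treated as if it were known exactly, which (as noted) ignores its uncertainty.

```r
# Pooled pilot counts: 17 immune vs 52 responders among 69 catnip triers
immune     <- 17
responders <- 52

# Exact binomial test of the observed immunity rate against the
# meta-analytic 37%, treated here as a fixed constant
bt <- binom.test(c(immune, responders), p = 0.37)
bt$estimate   # observed immunity rate, 17/69
bt$p.value    # two-sided exact p-value
bt$conf.int   # confidence interval for the immunity rate (95% by default)
```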

Full International Surveys

This differ­ence, and the appar­ent suc­cess of the pilot sur­vey, sug­gests the need for a larger sur­vey sam­ple to nail down the esti­mate more accu­rate­ly, and exploit the abil­ity to sam­ple other coun­tries (espe­cially Japan). To sam­ple all 9 non-US coun­tries (omit­ting Italy & the Nether­lands because of the unknown biases in sam­pling only Android smart­phone user­s), I will need trans­la­tions of the ques­tions into: French, Ger­man, Japan­ese, Por­tugue­se, & Span­ish.

International survey results

Results by coun­try (note that per­cent­ages do not equal raw counts due to pop­u­la­tion weight­ing; orig­i­nal Excel spread­sheet exports):

  1. Canada: 2017-08-07–2017-08-14, 3000 responses x $0.10 = $300, Canada general population (raw responses)

    1. Have you ever given cat­nip to your cat? [n = 3001]

    [ran­dom­ized order]

    • No: I do not have a cat [n = 1938 / 63%]
    • No: I have a cat but have not tried cat­nip [n = 260 / 9%]
    • Yes: but they did not respond to cat­nip [n = 139 / 5.3%]
    • Yes: they responded to cat­nip [n = 664 / 23%]

    139⁄803 = 17% immunity rate. Canadians appear more likely to have cats, try catnip, and receive a catnip response.

  2. Canada: 2017-08-19–2017-08-22, 150 responses (raw responses); adap­tive sam­pling fol­lowup:

    • no: 99 (62%)
    • non-tri­er: 12 (9.5%)
    • non-re­spon­se: 8 (5.9%)
    • respon­se: 31 (23%)

    8⁄39 = 21% immunity rate.

  3. Aus­tralia: 2017-08-07–2017-08-10, 3000 responses x $0.10 = $300, Aus­tralia gen­eral pop­u­la­tion (raw responses)

    1. Have you ever given cat­nip to your cat? [n = 3000]

    [ran­dom­ized order]

    • No: I do not have a cat [n = 2134 / 71%]
    • No: I have a cat but have not tried cat­nip [n = 446 / 14.3%]
    • Yes: but they did not respond to cat­nip [n = 196 / 6.8%]
    • Yes: they responded to cat­nip [n = 224 / 7.7%]

    196⁄420 = 47% immunity rate. This is surprisingly high: almost twice the US/UK estimates, even though one would expect that Australian rates would be similar as it was colonized by the English & presumably English cats. It raises the same question as the Japanese anomaly: are there large cross-national differences in catnip response rates, and if so, might they be genetic in origin, perhaps due to founder effects or genetic drift?

  4. Australia: 2017-08-12–2017-08-15, 2000 responses x $0.10 = $200, Australia general population (raw responses); followup survey to confirm the 47% anomaly in the first Australian survey:

    1. Have you ever given cat­nip to your cat? [n = 2001]

    [ran­dom­ized order]

    • No: I do not have a cat [n = 1438 / 71%]
    • No: I have a cat but have not tried cat­nip [n = 293 / 15%]
    • Yes: but they did not respond to cat­nip [n = 120 / 7.4%]
    • Yes: they responded to cat­nip [n = 150 / 6.5%]

    120⁄270 = 44% immunity rate. Pooled: 316⁄690 = 46%.

  5. Aus­tralia: 2017-08-22–2017-08-25, 500 responses x $0.10 = $50, Aus­tralia gen­eral pop­u­la­tion (raw responses); adap­tive sam­pling fol­lowup:

    • non-own­er: 361 (74%)
    • non-tri­er: 75 (16%)
    • respon­der: 29 (5%)
    • non-re­spon­der: 25 (4.9%)

    46% immu­nity rate, no sur­prise there.

  6. Aus­tralia: 2017-08-29–2017-09-01, 150 responses x $0.10 = $15, Aus­tralia gen­eral pop­u­la­tion (raw responses); adap­tive sam­pling fol­lowup:

    • Non-own­er: n = 108 (75.1%)
    • Non-tri­er: n = 28 (15.7%)
    • Immune: n = 11 (7.4%)
    • Respon­der: n = 4 (1.8%)
  7. UK: 2017-08-07–2017-08-10, 3000 responses x $0.10 = $300, United King­dom gen­eral pop­u­la­tion (raw responses)

    1. Have you ever given cat­nip to your cat? [n = 3021]

    [ran­dom­ized order]

    • No: I do not have a cat [n = 2131 / 70%]
    • No: I have a cat but have not tried cat­nip [n = 265 / 9%]
    • Yes: but they did not respond to cat­nip [n = 162 / 5.5%]
    • Yes: they responded to cat­nip [n = 463 / 15.6%]

    162⁄625 = 26% immunity rate.

  8. USA: 2017-08-09–2017-08-11, 2700 responses x $0.10 = $270, USA gen­eral pop­u­la­tion (raw responses)

    1. Have you ever given cat­nip to your cat?

    [ran­dom­ized order]

    • No: I do not have a cat [n = 1826 / 65%]
    • No: I have a cat but have not tried cat­nip [n = 269 / 9.6%]
    • Yes: but they did not respond to cat­nip [n = 151 / 5.1%]
    • Yes: they responded to cat­nip [n = 563 / 20.5%]

    22% immunity rate. Pooling with the previous 2 USA pilot surveys for a combined n = 3110:

    • No cat: n = 2027 / 68%

    • Cat but no tries: n = 300 / 10%

    • Cat and tried: n = 783 / 26% ; result:

      • none/immune: n = 168 / 21%
      • respon­der: n = 615 / 79%
  9. Mex­i­co: 2017-08-11–2017-08-13, 3000 responses x $0.10 = $300, Mex­i­can gen­eral pop­u­la­tion (raw responses); Span­ish ver­sion pro­vided by David Figuera:

    1. ¿Al­guna vez le ha dado nébeda (catnip/menta gatu­na) a su gato? [n = 3011]
    • No: No tengo gato. [n = 2245 / 74.2%]
    • No: Tengo gato pero no lo he proba­do. [n = 594 / 19.8%]
    • Sí: Pero no le hizo efec­to. [n = 82 / 3.3%]
    • Sí: Le hizo efec­to. [n = 90 / 2.7%]

    82⁄172 = 48% immunity rate. This is, like the first Australian sample, interestingly high but compromised by a relatively small sample size: Mexicans apparently tend to own cats less and to be less likely to try catnip, so barely 5% of responses are useful.

  10. Mex­i­co: 2017-08-19–2017-08-22, 150 responses (raw responses); adap­tive sam­pling fol­lowup:

    • Non-own­er: 112 (75%)
    • Non-tri­er: 30 (20%)
    • Respon­der: 4 (2.5%)
    • Immune: 4 (2.2%)

    4⁄8 = 50% immunity rate.

  11. Spain: 2017-08-11–2017-08-14, 3000 responses x $0.10 = $300, Span­ish gen­eral pop­u­la­tion (raw responses)

    1. ¿Al­guna vez le ha dado nébeda (catnip/menta gatu­na) a su gato? [n = 3000]
    • No: No tengo gato. [n = 2203 / 73.4%]
    • No: Tengo gato pero no lo he proba­do. [n = 607 / 21.2%]
    • Sí: Pero no le hizo efec­to. [n = 87 / 2.4%]
    • Sí: Le hizo efec­to. [n = 103 / 3%]

    87⁄190 = 46% immunity rate. Similar to Mexico: low rates of cat ownership & catnip trying, high immunity rate.

  12. Spain: 2017-08-19–2017-08-22 (raw responses); adap­tive sam­pling fol­lowup:

    • Non-own­er: 97 (68%)
    • Non-tri­er: 43 (24%)
    • Immune: 5 (4.7%)
    • Respon­der: 5 (3.7%)

    5⁄10 = 50% immunity rate.

  13. Ger­many: 2017-08-13–2017-08-16, 3000 responses x $0.10 = $300, Ger­man gen­eral pop­u­la­tion (raw responses); Ger­man ver­sion pro­vided by r0k­it, checked by Feep­ingCrea­ture & gehme­hgeh1:

    1. Haben Sie Ihrer Katze jemals Katzenminze/Catnip gegeben? [n = 3009]
    • Ich habe keine Katze. [n = 2093 / 71.3%]
    • Nein, bisher habe ich ihr keine Katzenminze/Catnip gegeben. [n = 536 / 16.4%]
    • Ja, aber sie reagiert nicht darauf. [n = 164 / 4.8%]
    • Ja, ich habe ihr Katzenminze/Catnip gegeben und sie reagiert darauf. [n = 216 / 7.4%]

    164⁄380 = 43% immunity rate. Intermediate between Spain & the UK.

  14. Japan: 2017-08-22–2017-08-25, 200 responses x $0.10 = $20, Japan­ese gen­eral pop­u­la­tion (raw responses); Japan­ese ver­sion pro­vided by Juju Kuri­hara

    1. 自分のネコにキャットニップあげたことある? [n = 203]

    [ran­domly reverse answer order; GS for­bade full ran­dom­iza­tion]

    • いいえ。ネコは飼ってない。 [n = 162 / 84%]
    • いいえ。ネコはいるけど、キャットニップをあげたことない。 [n = 21 / 9%]
    • はい。でも反応がなかった。 [n = 6 / 1.9%]
    • はい。反応した。 [n = 14 / 4.7%]

    6⁄20 = 30% immunity rate. Consistent with a low immunity rate but too small to rule out 10%/90%, requiring additional sampling.

  15. Japan: 2017-08-25–2017-08-28, 3800 x $0.10 = $380, Japan­ese gen­eral pop­u­la­tion (raw responses); mod­i­fied Japan­ese ver­sion with a cat­nip syn­onym for clar­i­ty, larger sam­ple size:

    1. 自分のネコにキャットニップ(イヌハッカ)あげたことある? [n = 3828]

    [ran­domly reverse answer order]

    • いいえ。ネコは飼ってない。 [n = 3158 / 83%]
    • いいえ。ネコはいるけど、キャットニップ(イヌハッカ)をあげたことない。 [n = 383 / 10.4%]
    • はい。でも反応がなかった。 [n = 82 / 2.4%]
    • はい。反応した。 [n = 205 / 4.5%]

    82⁄287 = 29% immunity rate; pooling: 88⁄307 = 29%. This is an immunity rate similar to the UK and not otherwise remarkable, so the Japanese anomaly has been falsified.

  16. Japan: 2017-08-29–2017-08-31, 150 x $0.10 = $15, Japan­ese gen­eral pop­u­la­tion (raw responses); adap­tive fol­lowup:

    • Non-own­er: n = 127 (89.7%)
    • Non-tri­er: n = 11 (3.1%)
    • Immune: n = 4 (4.4%)
    • Respon­der: n = 9 (2.8%)
  17. Brazil: 2017-08-25–2017-08-27, 3000 x $0.10 = $300, Brazil­ian gen­eral pop­u­la­tion (raw responses): Por­tuguese trans­la­tion pro­vided by Glad­stone:

    1. Você alguma vez deu erva-dos-gatos para seu gato? [n = 3045]

    [ran­dom­ized order]

    • Não: Não tenho gato. [n = 1951 / 64%]
    • Não: Tenho gato, mas nunca ten­tei dar. [n = 781 / 26%]
    • Sim: Mas não fez efeito. [n = 139 / 4.5%]
    • Sim: E fez efeito. [n = 174 / 5.7%]
  18. France & Por­tu­gal: omit­ted because I could­n’t find some­one to check my French trans­la­tion & ran out of mon­ey.
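The per-survey immunity rates are simple ratios of the raw counts, immune / (immune + responders), ignoring Google's population weighting. A minimal sketch using the counts listed above for the first large survey in each country (for the USA the main survey only, for Japan the larger second survey); the data frame name `first_wave` is just an illustrative container:

```r
# Raw counts of catnip triers, split into immune vs responding cats
first_wave <- data.frame(
    Country    = c("Canada", "Australia", "UK", "USA", "Mexico",
                   "Spain", "Germany", "Japan", "Brazil"),
    Immune     = c(139, 196, 162, 151, 82,  87, 164,  82, 139),
    Responders = c(664, 224, 463, 563, 90, 103, 216, 205, 174)
)
# Useful responses = owners who have tried catnip
first_wave$Triers       <- first_wave$Immune + first_wave$Responders
first_wave$ImmunityRate <- round(first_wave$Immune / first_wave$Triers, 2)

# Sort from lowest to highest unweighted immunity rate
first_wave[order(first_wave$ImmunityRate), ]
```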

Adaptive sampling

After com­plet­ing my first pass over the avail­able coun­tries and while wait­ing on Japanese/French/Portuguese trans­la­tions, I asked myself where should I spend my 4 $15 GS coupons? This con­sti­tutes a clas­sic adap­tive sam­pling prob­lem: choos­ing what dat­a­points to col­lect, based on pre­vi­ously col­lected data, in order to min­i­mize a loss or max­i­mize a reward, such as min­i­miz­ing entropy or vari­ance.

The Bayesian estimation of the binomial proportion’s parameter P follows the beta distribution, based on successes/totals; one way to quantify the size of the posterior distribution over P is to estimate its entropy, which allows comparison of different posteriors and evaluation of what actions would reduce entropy the most:

bEntropy <- function(a,b) { lbeta(a, b) - (a-1) * digamma(a) - (b-1)*digamma(b) + (a+b-2)*digamma(a+b) }

Esti­mat­ing a bino­mial is influ­enced by the total sam­ple size (big­ger n means smaller pos­te­rior uncer­tain­ty) but also by the rate of suc­cesses (the closer to P = 0.5, the more infor­ma­tive about the pro­por­tion a given n is, while the closer to P = 0/1, the less we learn from each sam­ple). So in the cat­nip sur­veys, Spain/Mexico sam­ples yielded few sam­ples (espe­cially com­pared to USA/Canada) and are poorly esti­mated so we might want to col­lect more data there; but on the other hand, this small n is par­tially off­set by the rel­a­tively high pro­por­tion P of immune responses which is eas­ier to esti­mate; and on the grip­ping hand, each Spain/Mexico sam­ple is 4x more expen­sive. It’s hard to say how it nets out.
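The sample-size effect can be seen numerically. The sketch below assumes a uniform Beta(1, 1) prior, so that s successes in n trials give a Beta(s + 1, n − s + 1) posterior; that parameterization is an illustrative assumption and not necessarily how the counts are passed in the surrounding code. `bEntropy` is repeated so the snippet runs on its own, and `posteriorEntropy` is a hypothetical helper:

```r
# Differential entropy of a Beta(a, b) distribution
bEntropy <- function(a, b) {
    lbeta(a, b) - (a-1)*digamma(a) - (b-1)*digamma(b) + (a+b-2)*digamma(a+b)
}

# Posterior entropy after s successes in n trials, assuming a uniform prior
posteriorEntropy <- function(s, n) { bEntropy(s + 1, n - s + 1) }

# Same observed rate (50%), increasing n: the entropy falls as the
# posterior over P concentrates
posteriorEntropy(5, 10)
posteriorEntropy(50, 100)
posteriorEntropy(500, 1000)
```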

We can ask by cal­cu­lat­ing the entropy reduc­tion and rel­a­tive cost of tak­ing a hypo­thet­i­cal addi­tional sam­ple of n:

survey$Entropy <- unlist(Map(bEntropy, survey$Immune, survey$Triers))
n <- 50
survey$Entropy.hypothetical <- unlist(Map(function(a,b) { rate <- (a+1)/(b+1); bEntropy(a+(n*rate), n+b) },
    survey$Immune, survey$Triers))
survey$Entropy.reduction <- survey$Entropy - survey$Entropy.hypothetical
survey$Entropy.cost <- survey$Entropy.reduction / survey$Cost
survey[order(survey$Entropy.cost),]
#     Country Triers Immune Responders Cost      Entropy Entropy.hypothetical Entropy.reduction  Entropy.cost
# 3 Australia    759    352        407 0.74 -2.853405426         -2.885262214     0.03185678744 0.04304971276
# 8     Japan    320     92        228 1.29 -2.470193598         -2.542250267     0.07205666904 0.05585788297
# 5    Mexico    180     86         94 1.75 -2.135260348         -2.257230589     0.12197024173 0.06969728099
# 6     Spain    200     92        108 1.58 -2.188585162         -2.299631826     0.11104666446 0.07028269903
# 9    Brazil    313    139        174 0.96 -2.412869913         -2.486723486     0.07385357304 0.07693080525
# 4    Canada    842    147        695 0.37 -3.064821325         -3.093474193     0.02865286749 0.07744018239
# 7   Germany    380    164        216 0.79 -2.510847462         -2.572473801     0.06162633865 0.07800802360
# 2        UK    625    162        463 0.48 -2.822265042         -2.860566405     0.03830136288 0.07979450600
# 1       USA    783    168        615 0.38 -2.975320699         -3.006112449     0.03079175050 0.08103092236

I repeated this cal­cu­la­tion as new data arrived to decide where to sam­ple next:

  • 19 August: the next sam­ple should come from Aus­tralia or Mex­i­co. Due to a mis­take in max­i­miz­ing rather than min­i­miz­ing, I began sam­pling from Canada/USA; delet­ing the USA sur­vey did not seem to refund my coupon so I let Canada con­tinue run­ning and ran Mexico/Spain instead.
  • 22 August: Aus­tralia
  • 29 August: Aus­tralia, Japan
  • 1 Sep­tem­ber: stopped sam­pling, but the entropy con­tin­ues to rec­om­mend Australia/Japan/Mexico.

Results

Over­all results as a table:

Pooled results of 11 inter­na­tional cat­nip sur­veys esti­mat­ing national cat response rates.
Country    Start       End         Total  Owners  Non-owners   Non-triers  Triers  Immune  Responders  Immunity rate  $/response
Canada     2017-08-07  2017-08-22   3151    1114  2037 (65%)   272 (9%)       842     147  695         18%            $0.37
USA        2017-07-28  2017-08-11   3110    1083  2027 (68%)   300 (10%)      783     168  615         21%            $0.38
UK         2017-08-07  2017-08-10   3021     890  2131 (70%)   265 (9%)       625     162  463         26%            $0.48
Japan      2017-08-22  2017-08-31   4182     735  3447 (82%)   415 (10%)      320      92  228         29%            $1.29
Germany    2017-08-13  2017-08-16   3009     916  2093 (71%)   536 (16%)      380     164  216         43%            $0.79
Brazil     2017-08-25  2017-08-28   3045    1094  1951 (64%)   781 (26%)      313     139  174         44%            $0.96
Spain      2017-08-11  2017-08-22   3150     850  2300 (73%)   650 (21%)      200      92  108         46%            $1.58
Australia  2017-08-07  2017-09-01   5642    1601  4041 (72%)   842 (15%)      759     356  403         47%            $0.74
Mexico     2017-08-11  2017-08-22   3161     804  2357 (75%)   624 (20%)      180      86  94          48%            $1.75
Total                              31471    9087  22384        4685          4402    1406  2996 (68%)  32%            $0.71

Meta-an­a­lyz­ing & plot­ting results:

survey <- read.csv(stdin(), header=TRUE, colClasses=c("factor", rep("integer", 6), "numeric", "numeric", "numeric"))
Country,Total,Owners,Non-owners,Non-triers,Triers,Immune,Responders,Immunity rate,Cost
Canada,3151,1114,2037,272,842,147,695,0.18,0.37
USA,3110,1083,2027,300,783,168,615,0.21,0.38
UK,3021,890,2131,265,625,162,463,0.26,0.48
Japan,4182,735,3447,415,320,92,228,0.29,1.29
Germany,3009,916,2093,536,380,164,216,0.43,0.79
Brazil,3045,1094,1951,781,313,139,174,0.44,0.96
Australia,5642,1601,4041,842,759,356,403,0.47,0.74
Spain,3150,850,2300,650,200,92,108,0.46,1.58
Mexico,3161,804,2357,624,180,86,94,0.48,1.75


library(metafor)
rer <- rma(xi=Responders, ni=Triers, measure="PR", slab=Country, data=survey); rer
# Random-Effects Model (k = 9; tau^2 estimator: REML)
#
# tau^2 (estimated amount of total heterogeneity): 0.0144 (SE = 0.0075)
# tau (square root of estimated tau^2 value):      0.1202
# I^2 (total heterogeneity / total variability):   97.17%
# H^2 (total variability / sampling variability):  35.37
#
# Test for Heterogeneity:
# Q(df = 8) = 314.4524, p-val < .0001
#
# Model Results:
#
# estimate      se     zval    pval   ci.lb   ci.ub
#   0.6447  0.0409  15.7536  <.0001  0.5645  0.7249
reo <- rma(xi=Owners, ni=Total,  measure="PR", slab=Country, data=survey); reo
# Random-Effects Model (k = 9; tau^2 estimator: REML)
#
# tau^2 (estimated amount of total heterogeneity): 0.0033 (SE = 0.0017)
# tau (square root of estimated tau^2 value):      0.0578
# I^2 (total heterogeneity / total variability):   98.31%
# H^2 (total variability / sampling variability):  59.18
#
# Test for Heterogeneity:
# Q(df = 8) = 559.7712, p-val < .0001
#
# Model Results:
#
# estimate      se     zval    pval   ci.lb   ci.ub
#   0.2936  0.0195  15.0921  <.0001  0.2554  0.3317
ret <- rma(xi=Triers, ni=Owners, measure="PR", slab=Country, data=survey); ret
# Random-Effects Model (k = 9; tau^2 estimator: REML)
#
# tau^2 (estimated amount of total heterogeneity): 0.0440 (SE = 0.0221)
# tau (square root of estimated tau^2 value):      0.2097
# I^2 (total heterogeneity / total variability):   99.53%
# H^2 (total variability / sampling variability):  212.85
#
# Test for Heterogeneity:
# Q(df = 8) = 1798.3983, p-val < .0001
#
# Model Results:
#
# estimate      se    zval    pval   ci.lb   ci.ub
#   0.4723  0.0701  6.7399  <.0001  0.3350  0.6097

forest(rer, order="obs")
forest(reo, order="obs")
forest(ret, order="obs")
Full Google Survey results of ‘does your cat respond to catnip’ across 9 countries worldwide show large differences in catnip immunity rates, from 17% to 48% of local cats reported to be immune.
Google Sur­vey results for whether respon­dents own a cat, across 9 coun­tries (range: 17-35%, meta-an­a­lytic mean 29%)
Google Sur­vey results for whether respon­dents own­ing a cat have ever tried to use cat­nip, across 9 coun­tries (range: 22-76%, meta-an­a­lytic mean 47%)
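The ranges quoted in these captions can be checked against the raw conditional proportions. A minimal sketch, re-entering the pooled counts from the CSV above into a data frame named `funnel` (an illustrative name) so that it runs on its own; these are unweighted per-country ratios, not the meta-analytic estimates:

```r
funnel <- data.frame(
    Country = c("Canada", "USA", "UK", "Japan", "Germany",
                "Brazil", "Australia", "Spain", "Mexico"),
    Total   = c(3151, 3110, 3021, 4182, 3009, 3045, 5642, 3150, 3161),
    Owners  = c(1114, 1083,  890,  735,  916, 1094, 1601,  850,  804),
    Triers  = c( 842,  783,  625,  320,  380,  313,  759,  200,  180),
    Immune  = c( 147,  168,  162,   92,  164,  139,  356,   92,   86)
)
# The three conditional probabilities along the survey "funnel"
funnel$P.owner  <- funnel$Owners / funnel$Total    # P(owns a cat)
funnel$P.trier  <- funnel$Triers / funnel$Owners   # P(tried catnip | owner)
funnel$P.immune <- funnel$Immune / funnel$Triers   # P(immune | tried catnip)

# Minimum & maximum of each proportion across the 9 countries
round(sapply(funnel[, c("P.owner", "P.trier", "P.immune")], range), 2)
```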

In the context of preliminary hypotheses from the meta-analysis, there are 4 main conclusions offered by this large-n survey data (which increases the available sample size by >10x):

  1. between-coun­try differ­ences exist

  2. Japan does indeed have a high catnip response rate, but it is not extraordinarily high: Canadians report a higher catnip response rate.

  3. the over­all aver­age cat­nip response rate of 64% is almost iden­ti­cal to the prior meta-an­a­lytic result, sug­gest­ing that the sur­vey is mea­sur­ing the same thing as the research papers

    • fur­ther imply­ing there is no tem­po­ral decline in cat­nip response rate, because then there would not be near-i­den­tity between 2017 results and the meta-an­a­lytic result

Intercorrelation

The extra data beyond the cat­nip response may itself be inter­est­ing. I noticed that beyond the expected differ­ences in cat­nip response rate, the con­di­tional prob­a­bil­i­ties of try­ing cat­nip given a cat, and hav­ing a cat given being sur­veyed, appeared to also differ con­sid­er­ably by coun­try. That’s unex­pected because you might think that what­ever the num­ber of cat own­ers, they will still use cat­nip at the same rate, and why would there be any cor­re­la­tion between pro­por­tion of cat own­ers and pro­por­tion of respond­ing cats?

But the cor­re­la­tions and cross-coun­try differ­ences do seem to be there if I extract the esti­mated odds for each of the 3 tran­si­tions and cor­re­late them:

library(brms)
bf_owner <- bf(Owners | trials(Total) ~ (1|Country))
bf_trier <- bf(Triers | trials(Owners) ~ (1|Country))
bf_response <- bf(Responders | trials(Triers) ~ (1|Country))
b2 <- brm(mvbf(bf_owner, bf_trier, bf_response, rescor=TRUE), data=survey, family=binomial()); summary(b2)
#  Family: MV(binomial, binomial, binomial)
#   Links: mu = logit
#          mu = logit
#          mu = logit
# Formula: Owners | trials(Total) ~ (1 | Country)
#          Triers | trials(Owners) ~ (1 | Country)
#          Responders | trials(Triers) ~ (1 | Country)
#    Data: survey (Number of observations: 9)
# Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#          total post-warmup samples = 4000
#
# Group-Level Effects:
# ~Country (Number of levels: 9)
#                          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# sd(Owners_Intercept)         0.36      0.12     0.21     0.66       1279 1.00
# sd(Triers_Intercept)         1.10      0.35     0.65     1.99       1095 1.01
# sd(Responders_Intercept)     0.69      0.23     0.39     1.25        879 1.00
#
# Population-Level Effects:
#                      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# Owners_Intercept        -0.90      0.13    -1.15    -0.65       1200 1.00
# Triers_Intercept        -0.12      0.39    -0.91     0.66       1397 1.00
# Responders_Intercept     0.64      0.24     0.14     1.12       1684 1.00
#
# Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
# is a crude measure of effective sample size, and Rhat is the potential
# scale reduction factor on split chains (at convergence, Rhat = 1).
ranef(b2)
# $Country
# , , Owners_Intercept
#
#                 Estimate    Est.Error         2.5%ile       97.5%ile
# Australia -0.03044781685 0.1282204238 -0.284757218749  0.22717181030
# Brazil     0.31244611158 0.1292416296  0.054818130450  0.56888661855
# Canada     0.28798422855 0.1287040846  0.029973397220  0.54094918802
# Germany    0.06678857684 0.1316210852 -0.192722752913  0.32869035807
# Japan     -0.64167774729 0.1296152408 -0.908276721935 -0.38242695654
# Mexico    -0.17791601189 0.1310846382 -0.444852555408  0.07788323597
# Spain     -0.09876368533 0.1298648872 -0.357686230169  0.16053092922
# UK         0.02140104376 0.1310367170 -0.242747351985  0.28399207800
# USA        0.26481992173 0.1299328820  0.001917056751  0.52942282197
#
# , , Triers_Intercept
#
#                 Estimate    Est.Error       2.5%ile         97.5%ile
# Australia  0.01813788471 0.3885912273 -0.7845149100  0.8040397117941
# Brazil    -0.78819571684 0.3897722924 -1.5862075480  0.0000443689147
# Canada     1.24883013037 0.3910430281  0.4505231495  2.0407819682118
# Germany   -0.22112710148 0.3918324328 -1.0328674986  0.5774945667231
# Japan     -0.13662809470 0.3923860668 -0.9279937053  0.6697218032907
# Mexico    -1.11467805618 0.3925868958 -1.9009480273 -0.3266023072477
# Spain     -1.05038700392 0.3941149256 -1.8671042045 -0.2640132172055
# UK         0.97769227307 0.3925479698  0.1787478323  1.7702763329297
# USA        1.07751793313 0.3914604603  0.2712072935  1.8478001989406
#
# , , Responders_Intercept
#
#                Estimate    Est.Error        2.5%ile       97.5%ile
# Australia -0.5044582315 0.2493370825 -1.00260681092 0.001552354279
# Brazil    -0.3994089762 0.2626963885 -0.93710540902 0.129480329954
# Canada     0.8979007384 0.2520710317  0.41513927176 1.406509591116
# Germany   -0.3499764734 0.2544537933 -0.83667455510 0.158186916451
# Japan      0.2625833744 0.2637053887 -0.25947489277 0.794264287381
# Mexico    -0.5169044777 0.2728149763 -1.05988425248 0.016712965363
# Spain     -0.4517022250 0.2690081731 -0.99083076217 0.059135851124
# UK         0.4037822551 0.2535945318 -0.09353189908 0.907456763495
# USA        0.6483597746 0.2498113059  0.15065734268 1.150894764620
predict(b2)
# , , Owners
#
#         Estimate   Est.Error  2.5%ile 97.5%ile
#  [1,] 1111.27925 37.58129665 1040.975     1185
#  [2,] 1080.40375 37.51611121 1007.975     1153
#  [3,]  889.20850 34.94478233  822.000      957
#  [4,]  740.43200 35.03746255  674.000      810
#  [5,]  915.39250 36.13518205  844.000      988
#  [6,] 1091.52625 36.50583488 1019.000     1162
#  [7,] 1602.28800 47.89174209 1511.000     1698
#  [8,]  851.73325 34.99259514  783.000      920
#  [9,]  806.16650 34.95533216  736.000      874
#
# , , Triers
#
#        Estimate   Est.Error 2.5%ile 97.5%ile
#  [1,] 840.94325 19.34487233     804  879.025
#  [2,] 781.85675 20.74558566     740  822.000
#  [3,] 624.06650 19.51606510     585  661.000
#  [4,] 319.90550 19.20555344     282  357.000
#  [5,] 380.15950 21.12151958     340  422.000
#  [6,] 314.26450 21.08638657     274  355.025
#  [7,] 758.91825 28.71753369     702  814.000
#  [8,] 201.20300 17.40087609     168  235.000
#  [9,] 180.57550 16.36839274     149  213.000
#
# , , Responders
#
#        Estimate    Est.Error 2.5%ile 97.5%ile
#  [1,] 692.56225 15.156518563     661      721
#  [2,] 613.27625 16.416586491     581      644
#  [3,] 461.62275 15.665416808     431      491
#  [4,] 227.27925 11.224427218     205      249
#  [5,] 216.88525 13.221357098     191      243
#  [6,] 174.94050 12.507720436     150      199
#  [7,] 404.40200 19.807409709     365      444
#  [8,] 109.10050  9.863428489      90      128
#  [9,]  95.48250  9.350965399      77      113
owners <- ranef(b2)$Country[,,1][,1]
triers <- ranef(b2)$Country[,,2][,1]
responders <- ranef(b2)$Country[,,3][,1]
odds <- data.frame(Country=rownames(ranef(b2)$Country[,,3]), Owners=owners, Triers=triers, Responders=responders)
odds
#                   Owners         Triers    Responders
# Australia -0.03044781685  0.01813788471 -0.5044582315
# Brazil     0.31244611158 -0.78819571684 -0.3994089762
# Canada     0.28798422855  1.24883013037  0.8979007384
# Germany    0.06678857684 -0.22112710148 -0.3499764734
# Japan     -0.64167774729 -0.13662809470  0.2625833744
# Mexico    -0.17791601189 -1.11467805618 -0.5169044777
# Spain     -0.09876368533 -1.05038700392 -0.4517022250
# UK         0.02140104376  0.97769227307  0.4037822551
# USA        0.26481992173  1.07751793313  0.6483597746
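cor(odds[, c("Owners", "Triers", "Responders")]) # drop the non-numeric Country column, which cor() cannot handle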
#                  Owners       Triers   Responders
# Owners     1.0000000000
# Triers     0.3638358320 1.0000000000
# Responders 0.2072588658 0.8879400319 1.0000000000

library("GGally")
ggpairs(odds, columns=1:3, lower=list(mapping=aes(color=Country)))
Scat­ter­plot of log odds of cat own­er­ship, cat­nip exper­i­men­ta­tion, and cat­nip response by coun­try

Convenience Sampling: 2016 Google Docs survey

In an attempt to get richer information about catnip response, including age/sex/gender/country, and to investigate catnip alternatives like valerian & silvervine, I set up a Google Docs survey with incentives (Amazon gift cards) and advertised it online in various places I frequent or which are cat-related.

Questions

Stan­dard demo­graph­ics:

  • gen­der
  • age
  • coun­try

Google Docs does­n’t make it easy to han­dle responses about poten­tially mul­ti­ple cats an owner might have exper­i­mented with, so I hard­wired it to a max of 5 cats with repeated blocks of ques­tions. For each of 5 cats:

  • sex (M/F)

  • spayed/neutered

  • breed: var­i­ous

  • fur col­or; used color list from “The Rela­tion­ship Between Coat Color and Aggres­sive Behav­iors in the Domes­tic Cat”, Stelow et al 2016:

    • black
    • black­-and-white
    • cal­ico
    • color points
    • gray
    • gray-and-white
    • Tabby (black, brown, and gray)
    • Tabby (or­ange, cream, and buff)
    • tor­toise­shell
    • white
    • Tor­bie
    • other

  • personality 1-5: (“Catnip and the Cat Response”: “..The personality and emotional factors were found to be the most important: withdrawn cats react poorly while friendly, outgoing cats react best.”)

  • age at first administration

  • stimulants, binary: catnip response, valerian response, honeysuckle response, silvervine response, cat thyme response

Human cat­nip use (com­mon his­tor­i­cal­ly):

  • has the per­son ever con­sumed cat­nip in the form of: tea / leaves or an herb / roots / smoked / poul­tice
  • if so, what was the pur­pose? relax­ation / stim­u­la­tion / eupho­ria or intox­i­ca­tion / hal­lu­ci­na­tion or visual dis­tor­tion / dreams / insom­nia / stom­ach aches / mos­quito repel­lent / headaches / colds or flu or fever / hives / arthri­tis / increas­ing uri­na­tion / treat­ment of worms / hem­or­rhoids / other
  • effi­ca­cy: 1-5

Launch

The sur­vey was launched 2016-09-29, planned to run all year, incen­tivized with 3 $50 Ama­zon gift cards, and adver­tised in the fol­low­ing places:

Post-launch edits:

  • removed free response from gen­der ques­tion in demo­graphic sec­tion due to abuse
  • added age con­straints to the human and cat age ques­tions due to abuse
  • edited the introduction to rephrase it in terms of “cat non-responses” & began advertising it as a “catnip non-response” survey: the first 49 respondents claimed that 45 of 47 cats (cat #1 responses) responded to catnip, which is grossly discrepant from the 62% estimate - it’s imprecise, but there’s no way the true catnip rate is 96%! Further, the catnip non-response rate goes up in the additional cat entries. All this indicates either a bias in respondent recollections (they only remember the cats which do respond, or they provide a responding cat’s data first and then don’t include all the rest they know of) or a response bias to experimenter demand (inferring that I want to hear only about cats which do respond). The response rates for the other substances like silvervine are more reasonable thus far (although with far smaller sample sizes), and the small sample size is what I expected because those substances are much rarer, so this may not be a general acquiescence bias (because then you would expect many more people to claim their cats respond to all of catnip/silvervine/valerian/thyme). If it’s a recollection bias, I don’t know of any way to correct for it afterwards or to change the survey to eliminate it, so surveys on this topic might be futile.
  • final sur­vey: Google Docs sur­vey pre­view; mir­ror of final sur­vey
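
As a rough check of how discrepant 45 of 47 is from a 62% base rate, an exact binomial test can be sketched in base R. This treats each first-cat report as an independent draw, which is an assumption (it ignores any clustering by respondent), and the variable name is purely illustrative.

```r
# Exact binomial test: 45 responders among 47 first-listed cats,
# compared against an assumed true response rate of 0.62.
first_cats <- binom.test(x = 45, n = 47, p = 0.62)
first_cats$estimate   # observed proportion, 45/47
first_cats$conf.int   # 95% Clopper-Pearson interval for the first-cat response rate
first_cats$p.value    # how surprising 45/47 would be if the true rate were 0.62
```

If the interval sits well above 0.62, sampling error alone is an unlikely explanation, which is what motivates suspecting recollection or demand bias.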

The sur­vey col­lected 241 responses from 2016-09-29 to 2017-01-02, when I closed it and awarded the gift cards. After fil­ter­ing for qual­i­ty, there were 223 respons­es.

Results

Cleaning

Google Docs, as mentioned, doesn’t seem to support any elegant way of repeating a series of questions, so I hardcoded questions for up to 5 cats, producing a ‘wide’ survey format in which each row is a single respondent with 5 sets of age/breed/fur/neuter/sex/personality+catnip/valerian/silvervine/thyme/honeysuckle ratings. For almost all tasks, it’s better to have the survey in a ‘long’ format, where each row is instead a single cat’s set of covariates & ratings, with the owner information (and a unique ID) repeated across the rows for all the cats they provided information on. (This then permits straightforward analyses like regressions of the form Catnip ~ (1|ID) + Sex + Breed etc.) This is complicated enough that I couldn’t figure out how to use the usual reshaping libraries to convert wide to long, so I did it by brute force. As well, an additional variable is added to investigate the demand bias, noting whether a cat entry is the “first” cat provided by a user; if there is a demand bias as I hypothesize based on the extremely high reported catnip response rates for first cats, then first vs the rest (second/third/fourth/fifth) should predict catnip responses.

After the data is reshaped, the free response fields need to be cleaned up and sim­i­lar responses com­bined (eg “UK” com­bined with “United King­dom”).

catnip <- read.csv("https://www.gwern.net/docs/catnip/2017-01-02-catnipsurvey-conveniencesample.csv")

catnip$ID <- 1:nrow(catnip); catnip$Timestamp <- NULL
catnipLong <- data.frame(ID=integer(), Owner.age=integer(), Owner.sex=factor(), Owner.education=factor(),
    Owner.country=factor(), Catnip.types=factor(), Nth.cat=integer(), Cat.age=numeric(), Cat.breed=factor(),
    Cat.fur.color=factor(), Cat.neuter=logical(), Cat.personality=integer(), Cat.sex=factor(),
    Cat.response.Catnip=logical(), Cat.response.Valerian=logical(), Cat.response.Silvervine=logical(),
    Cat.response.Thyme=logical(), Cat.response.Honeysuckle=logical())
for (i in 1:nrow(catnip)) {
  catnipLong <- with(catnip[i,],
      rbind(catnipLong,
      data.frame(ID=ID, Owner.age=Owner.age, Owner.sex=Owner.sex, Owner.education=Owner.education, Owner.country=Owner.country, Catnip.types=Catnip.types,
          Nth.cat=1, Cat.age=Cat1.age, Cat.breed=Cat1.breed, Cat.fur.color=Cat1.fur.color, Cat.neuter=Cat1.neuter, Cat.personality=Cat1.personality,
          Cat.sex=Cat1.sex, Cat.response.Catnip=Cat1.response.Catnip, Cat.response.Valerian=Cat1.response.Valerian, Cat.response.Silvervine=Cat1.response.Silvervine,
          Cat.response.Thyme=Cat1.response.Thyme, Cat.response.Honeysuckle=Cat1.response.Honeysuckle),
      data.frame(ID=ID, Owner.age=Owner.age, Owner.sex=Owner.sex, Owner.education=Owner.education, Owner.country=Owner.country, Catnip.types=Catnip.types,
              Nth.cat=2, Cat.age=Cat2.age, Cat.breed=Cat2.breed, Cat.fur.color=Cat2.fur.color, Cat.neuter=Cat2.neuter, Cat.personality=Cat2.personality,
              Cat.sex=Cat2.sex, Cat.response.Catnip=Cat2.response.Catnip, Cat.response.Valerian=Cat2.response.Valerian,
              Cat.response.Silvervine=Cat2.response.Silvervine, Cat.response.Thyme=Cat2.response.Thyme, Cat.response.Honeysuckle=Cat2.response.Honeysuckle),
      data.frame(ID=ID, Owner.age=Owner.age, Owner.sex=Owner.sex, Owner.education=Owner.education, Owner.country=Owner.country, Catnip.types=Catnip.types,
          Nth.cat=3, Cat.age=Cat3.age, Cat.breed=Cat3.breed, Cat.fur.color=Cat3.fur.color, Cat.neuter=Cat3.neuter, Cat.personality=Cat3.personality,
          Cat.sex=Cat3.sex, Cat.response.Catnip=Cat3.response.Catnip, Cat.response.Valerian=Cat3.response.Valerian, Cat.response.Silvervine=Cat3.response.Silvervine,
          Cat.response.Thyme=Cat3.response.Thyme, Cat.response.Honeysuckle=Cat3.response.Honeysuckle),
      data.frame(ID=ID, Owner.age=Owner.age, Owner.sex=Owner.sex, Owner.education=Owner.education, Owner.country=Owner.country, Catnip.types=Catnip.types,
          Nth.cat=4, Cat.age=Cat4.age, Cat.breed=Cat4.breed, Cat.fur.color=Cat4.fur.color, Cat.neuter=Cat4.neuter, Cat.personality=Cat4.personality, Cat.sex=Cat4.sex,
          Cat.response.Catnip=Cat4.response.Catnip, Cat.response.Valerian=Cat4.response.Valerian, Cat.response.Silvervine=Cat4.response.Silvervine,
          Cat.response.Thyme=Cat4.response.Thyme, Cat.response.Honeysuckle=Cat4.response.Honeysuckle),
      data.frame(ID=ID, Owner.age=Owner.age, Owner.sex=Owner.sex, Owner.education=Owner.education, Owner.country=Owner.country, Catnip.types=Catnip.types,
          Nth.cat=5, Cat.age=Cat5.age, Cat.breed=Cat5.breed, Cat.fur.color=Cat5.fur.color, Cat.neuter=Cat5.neuter, Cat.personality=Cat5.personality,
          Cat.sex=Cat5.sex, Cat.response.Catnip=Cat5.response.Catnip, Cat.response.Valerian=Cat5.response.Valerian, Cat.response.Silvervine=Cat5.response.Silvervine,
          Cat.response.Thyme=Cat5.response.Thyme, Cat.response.Honeysuckle=Cat5.response.Honeysuckle)))
   }

## filter out the empty data-frame rows by filtering on `Cat.sex` - potential false positives, but anyone who doesn't even know the gender of their cat can't be a good judge of their responses anyway...
catnipLong <- catnipLong[!is.na(catnipLong$Cat.sex),]
## Test response bias:
catnipLong$First <- catnipLong$Nth.cat==1

catnipLong[!is.na(catnipLong$Cat.fur.color) &
    catnipLong$Cat.fur.color=="Orange and white on left here http://b.robnugen.com/cats/kawasaki/cats_2015-03-26_09.22.26.jpg",]$Cat.fur.color <-
    "Tabby (orange, cream, and buff)"
catnipLong[!is.na(catnipLong$Cat.fur.color) & catnipLong$Cat.fur.color=="Orange and white splotched",]$Cat.fur.color <- "Tabby (orange, cream, and buff)"
catnipLong[!is.na(catnipLong$Cat.fur.color) & catnipLong$Cat.fur.color=="tortoiseshell",]$Cat.fur.color <- "Torbie (tortoiseshell colors with tabby pattern)"
catnipLong[!is.na(catnipLong$Cat.fur.color) & catnipLong$Cat.fur.color=="white",]$Cat.fur.color <- "gray-and-white"
catnipLong[!is.na(catnipLong$Cat.fur.color) & catnipLong$Cat.fur.color=="Orange tabby with paint/large white areas",]$Cat.fur.color <- "Tabby (orange, cream, and buff)"
catnipLong[!is.na(catnipLong$Cat.fur.color) & catnipLong$Cat.fur.color=="Tabby/Tuxedo pattern mix (black and white)",]$Cat.fur.color <- "black-and-white"
catnipLong[!is.na(catnipLong$Cat.fur.color) & catnipLong$Cat.fur.color=="tuxedo black & white longhair",]$Cat.fur.color <- "black-and-white"
levels(catnipLong$Cat.fur.color) <- c(levels(catnipLong$Cat.fur.color), "Other")
usableFurColors <- row.names(sort(table(catnipLong$Cat.fur.color)))[22:29]
catnipLong[!is.na(catnipLong$Cat.fur.color) & !(catnipLong$Cat.fur.color %in% usableFurColors),]$Cat.fur.color <- "Other"

## use only breeds with n>3, lump the rest together:
usableBreeds <- row.names(sort(table(catnipLong$Cat.breed))[26:30])
levels(catnipLong$Cat.breed) <- c(levels(catnipLong$Cat.breed), "Other")
catnipLong[!is.na(catnipLong$Cat.breed) & !(catnipLong$Cat.breed %in% usableBreeds),]$Cat.breed <- "Other"

## Clean up country free responses:
replaceFactor <- function(df, wrong,right) { df[!is.na(df$Owner.country) & df$Owner.country==wrong,]$Owner.country <- right
    return(df) }
catnipLong <- replaceFactor(catnipLong, "Czech republic", "Czech Republic")
catnipLong <- replaceFactor(catnipLong, "Czech Republic ", "Czech Republic")
catnipLong <- replaceFactor(catnipLong, "india", "India")
catnipLong <- replaceFactor(catnipLong, "latvia", "Latvia")
catnipLong <- replaceFactor(catnipLong, "N/A", NA)
catnipLong <- replaceFactor(catnipLong, "Sweden ", "Sweden")
catnipLong <- replaceFactor(catnipLong, "UK", "United Kingdom")

write.csv(catnipLong, file="catnip-long-clean.csv", row.names=FALSE)
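
For comparison with the brute-force loop above, the same wide-to-long conversion can probably be expressed with base R's reshape(). The following is only a minimal sketch on an invented two-respondent, two-cat toy table: the column names mimic the survey's, but the values are made up and only two of the per-cat variables are included.

```r
# Toy 'wide' data: one row per respondent, repeated per-cat columns (values invented).
wide <- data.frame(ID = 1:2, Owner.age = c(30, 25),
                   Cat1.sex = c("male", "female"), Cat2.sex = c("female", NA),
                   Cat1.response.Catnip = c(TRUE, TRUE),
                   Cat2.response.Catnip = c(FALSE, NA))

# Each element of 'varying' lists the wide columns that collapse into one long column.
long <- reshape(wide, direction = "long", idvar = "ID",
                varying = list(c("Cat1.sex", "Cat2.sex"),
                               c("Cat1.response.Catnip", "Cat2.response.Catnip")),
                v.names = c("Cat.sex", "Cat.response.Catnip"),
                timevar = "Nth.cat", times = 1:2)

# Drop the empty cat slots and flag first-listed cats, as in the loop version.
long <- long[!is.na(long$Cat.sex), ]
long$First <- long$Nth.cat == 1
long
```

Owner-level columns such as ID and Owner.age are repeated on every cat row automatically, which is the property the long format is wanted for.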

Descriptive

391 cat entries survive cleaning. In general, the data suffers from high levels of missingness on the owner/cat covariates, and unfortunately, there are very few responses dealing with the non-catnip stimulants (I had hoped more cat owners would’ve tried them, but apparently not):

  • Cat­nip: n = 352
  • Hon­ey­suck­le: 43
  • Sil­vervine: 16
  • Thyme: 31
  • Vale­ri­an: 47
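
Counts like these are simply the number of non-missing answers per response column. A small sketch of that tally on an invented toy data frame (the column names follow the survey, the values do not):

```r
# Toy stand-in for the response columns; NA means the owner never tried that stimulant.
toy <- data.frame(Cat.response.Catnip   = c(TRUE, FALSE, NA, TRUE),
                  Cat.response.Valerian = c(NA, NA, TRUE, NA))

n_answered <- colSums(!is.na(toy))            # usable answers per stimulant
rate       <- sapply(toy, mean, na.rm = TRUE) # response rate among those answers
rbind(n_answered, rate)
```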
catnip <- read.csv("https://www.gwern.net/docs/catnip/2017-01-02-catnipsurvey-conveniencesample-long-clean.csv")
library(skimr)
skim(catnip)
# Skim summary statistics
#  n obs: 391
#  n variables: 19
# Note: no visible binding for global variable 'self'
#
# Variable type: factor
#         variable missing complete   n n_unique                          top_counts ordered
#        Cat.breed      31      360 391        6  dom: 263, dom: 44, Oth: 34, NA: 31   FALSE
#    Cat.fur.color      29      362 391        9 Oth: 100, bla: 60, bla: 53, Tab: 44   FALSE
#     Catnip.types       5      386 391       17 dry: 201, dry: 49, dry: 44, fre: 20   FALSE
#          Cat.sex       0      391 391        2           mal: 201, fem: 190, NA: 0   FALSE
#    Owner.country       1      390 391       24  USA: 247, Uni: 40, Can: 39, Fra: 7   FALSE
#  Owner.education      17      374 391        6 bac: 156, hig: 79, ass: 46, mas: 44   FALSE
#        Owner.sex      19      372 391        3  mal: 254, fem: 113, NA: 19, oth: 5   FALSE
#
# Variable type: integer
#         variable missing complete   n   mean    sd p0  p25 p50   p75 p100     hist
#  Cat.personality      29      362 391   3.58  1.21  1  3     4   5      5 ▂▅▁▅▁▇▁▇
#               ID       0      391 391 111.92 64.56  1 56.5 113 168.5  223 ▇▆▇▇▇▆▇▇
#          Nth.cat       0      391 391   1.59  0.8   1  1     1   2      5 ▇▅▁▁▁▁▁▁
#        Owner.age      36      355 391  28.8   8.93 14 23    28  32     85 ▃▇▃▁▁▁▁▁
#
# Variable type: logical
#                  variable missing complete   n mean                     count
#                Cat.neuter      25      366 391 0.94 TRU: 345, NA: 25, FAL: 21
#       Cat.response.Catnip      39      352 391 0.85 TRU: 300, FAL: 52, NA: 39
#  Cat.response.Honeysuckle     348       43 391 0.4  NA: 348, FAL: 26, TRU: 17
#   Cat.response.Silvervine     375       16 391 0.25  NA: 375, FAL: 12, TRU: 4
#        Cat.response.Thyme     360       31 391 0.39 NA: 360, FAL: 19, TRU: 12
#     Cat.response.Valerian     344       47 391 0.66 NA: 344, TRU: 31, FAL: 16
#                     First       0      391 391 0.57 TRU: 222, FAL: 169, NA: 0
#
# Variable type: numeric
#  variable missing complete   n mean   sd p0 p25 p50 p75 p100     hist
#   Cat.age      59      332 391 2.15 2.23  0   1   1   3   18 ▇▂▁▁▁▁▁▁
## Visualize missingness:
library(visdat)
catnipMissing <- catnip
catnipMissing$Cat.sex <- catnipMissing$ID <- catnipMissing$Nth.cat <- catnipMissing$First <- NULL
vis_miss(catnipMissing)
Visu­al­iz­ing degree of miss­ing­ness in the con­ve­nience sam­ple sur­vey: pri­mar­ily con­cen­trated in the non-cat­nip response ques­tions.

Pre­sent­ing the cat­nip raw data in more detail (split by the covari­ates from the meta-analy­sis):

s <- aggregate(Cat.response.Catnip ~ Owner.country + Cat.age + Cat.breed + Cat.sex,
    function(x){c(sum(x), length(x), round(digits=2,mean(x)))}, data=catnip)
s[order(s$Owner.country, s$Cat.age, s$Cat.breed),]
Con­tin­gency table of raw data from cat­nip con­ve­nience sur­vey, split by coun­try, age, breed, and sex.
Owner country Cat age Cat breed Cat sex Responders N Proportion
Aus­tralia 1.00 domes­tic long-haired (mixed) male 1 1 1.00
Aus­tralia 1.00 domes­tic short­-haired (mixed) female 1 1 1.00
Aus­tralia 2.00 domes­tic short­-haired (mixed) female 0 1 0.00
Aus­tralia 6.00 domes­tic short­-haired (mixed) male 1 1 1.00
Belarus 3.00 domes­tic long-haired (mixed) male 1 1 1.00
Canada 0.50 domes­tic short­-haired (mixed) female 0 1 0.00
Canada 0.50 domes­tic short­-haired (mixed) male 0 1 0.00
Canada 0.50 Other male 1 1 1.00
Canada 1.00 domes­tic short­-haired (mixed) female 1 1 1.00
Canada 1.00 domes­tic short­-haired (mixed) male 3 3 1.00
Canada 1.00 Maine coon male 1 1 1.00
Canada 2.00 domes­tic short­-haired (mixed) female 1 1 1.00
Canada 2.00 domes­tic short­-haired (mixed) male 2 3 0.67
Canada 3.00 domes­tic short­-haired (mixed) female 2 3 0.67
Canada 3.00 domes­tic short­-haired (mixed) male 2 3 0.67
Canada 4.00 domes­tic short­-haired (mixed) female 2 2 1.00
Canada 5.00 Other male 1 1 1.00
Canada 6.00 domes­tic short­-haired (mixed) female 1 1 1.00
Canada 6.00 domes­tic short­-haired (mixed) male 1 1 1.00
Canada 7.00 domes­tic short­-haired (mixed) female 1 1 1.00
Canada 7.00 domes­tic short­-haired (mixed) male 1 2 0.50
Czech Repub­lic 1.00 domes­tic short­-haired (mixed) female 1 1 1.00
Czech Repub­lic 2.00 domes­tic short­-haired (mixed) female 1 1 1.00
Fin­land 3.00 domes­tic short­-haired (mixed) female 1 1 1.00
Fin­land 5.00 domes­tic short­-haired (mixed) female 1 1 1.00
France 0.00 domes­tic short­-haired (mixed) female 2 2 1.00
France 3.00 Other male 1 1 1.00
Ger­many 0.60 domes­tic short­-haired (mixed) female 1 1 1.00
Ger­many 1.00 domes­tic long-haired (mixed) male 0 1 0.00
Ger­many 1.00 Other male 0 1 0.00
Ger­many 4.00 Other female 0 1 0.00
Ger­many 6.00 domes­tic short­-haired (mixed) female 1 1 1.00
India 0.66 domes­tic long-haired (mixed) female 1 1 1.00
India 1.00 domes­tic short­-haired (mixed) female 0 1 0.00
India 1.50 domes­tic short­-haired (mixed) female 0 1 0.00
India 2.00 domes­tic short­-haired (mixed) male 1 1 1.00
Ire­land 0.00 Other female 1 1 1.00
Italy 1.00 domes­tic long-haired (mixed) male 0 2 0.00
Italy 1.00 domes­tic short­-haired (mixed) female 2 2 1.00
Japan 2.00 domes­tic short­-haired (mixed) female 1 1 1.00
Japan 2.00 Other male 1 1 1.00
Latvia 1.00 domes­tic short­-haired (mixed) female 2 2 1.00
Latvia 1.00 domes­tic short­-haired (mixed) male 1 1 1.00
Latvia 5.00 Other male 1 1 1.00
Nether­lands 0.00 domes­tic long-haired (mixed) male 1 1 1.00
Nether­lands 0.00 domes­tic short­-haired (mixed) male 1 1 1.00
New Zealand 0.50 domes­tic short­-haired (mixed) male 1 1 1.00
New Zealand 2.00 domes­tic short­-haired (mixed) female 1 1 1.00
Panama 18.00 domes­tic short­-haired (mixed) female 1 1 1.00
Rus­sia 2.00 domes­tic short­-haired (mixed) female 0 1 0.00
Slove­nia 1.00 domes­tic short­-haired (mixed) female 0 1 0.00
Slove­nia 2.00 domes­tic short­-haired (mixed) male 1 1 1.00
Slove­nia 4.00 domes­tic short­-haired (mixed) female 1 1 1.00
Spain 0.00 Maine coon female 0 1 0.00
Spain 0.00 Other male 1 1 1.00
Swe­den 0.00 domes­tic short­-haired (mixed) female 1 1 1.00
Swe­den 0.00 domes­tic short­-haired (mixed) male 1 1 1.00
Swe­den 1.00 Maine coon male 1 1 1.00
Swe­den 2.00 domes­tic short­-haired (mixed) male 1 1 1.00
United King­dom 0.50 domes­tic long-haired (mixed) female 1 1 1.00
United King­dom 0.50 domes­tic short­-haired (mixed) female 1 1 1.00
United King­dom 0.50 domes­tic short­-haired (mixed) male 0 1 0.00
United King­dom 1.00 domes­tic long-haired (mixed) female 0 1 0.00
United King­dom 1.00 domes­tic short­-haired (mixed) female 5 5 1.00
United King­dom 1.00 domes­tic short­-haired (mixed) male 5 5 1.00
United King­dom 2.00 domes­tic long-haired (mixed) male 0 1 0.00
United King­dom 2.00 domes­tic short­-haired (mixed) female 1 2 0.50
United King­dom 2.00 domes­tic short­-haired (mixed) male 3 4 0.75
United King­dom 3.00 domes­tic short­-haired (mixed) male 3 3 1.00
United King­dom 3.00 Other male 1 1 1.00
United King­dom 4.00 domes­tic short­-haired (mixed) male 1 2 0.50
United King­dom 4.00 Other male 2 2 1.00
United King­dom 5.00 domes­tic short­-haired (mixed) female 1 1 1.00
United King­dom 5.00 domes­tic short­-haired (mixed) male 1 1 1.00
United King­dom 6.00 domes­tic short­-haired (mixed) female 1 1 1.00
USA 0.00 domes­tic long-haired (mixed) male 1 1 1.00
USA 0.00 domes­tic short­-haired (mixed) female 3 3 1.00
USA 0.00 domes­tic short­-haired (mixed) male 2 2 1.00
USA 0.10 domes­tic long-haired (mixed) male 1 1 1.00
USA 0.10 domes­tic short­-haired (mixed) female 1 1 1.00
USA 0.10 domes­tic short­-haired (mixed) male 1 1 1.00
USA 0.40 domes­tic short­-haired (mixed) female 1 1 1.00
USA 0.40 Maine coon male 1 1 1.00
USA 0.50 domes­tic short­-haired (mixed) female 3 5 0.60
USA 0.50 domes­tic short­-haired (mixed) male 7 7 1.00
USA 0.50 Other male 1 1 1.00
USA 0.75 domes­tic short­-haired (mixed) female 1 1 1.00
USA 0.75 domes­tic short­-haired (mixed) male 1 1 1.00
USA 0.80 domes­tic long-haired (mixed) male 1 1 1.00
USA 0.80 domes­tic short­-haired (mixed) male 1 1 1.00
USA 1.00 domes­tic long-haired (mixed) female 6 8 0.75
USA 1.00 domes­tic long-haired (mixed) male 4 4 1.00
USA 1.00 domes­tic short­-haired (mixed) female 29 33 0.88
USA 1.00 domes­tic short­-haired (mixed) male 27 33 0.82
USA 1.00 Maine coon male 3 3 1.00
USA 1.00 Other female 4 4 1.00
USA 1.00 Other male 2 3 0.67
USA 1.00 Russ­ian Blue female 0 1 0.00
USA 1.00 Russ­ian Blue male 2 2 1.00
USA 2.00 domes­tic long-haired (mixed) female 4 4 1.00
USA 2.00 domes­tic long-haired (mixed) male 1 2 0.50
USA 2.00 domes­tic short­-haired (mixed) female 14 14 1.00
USA 2.00 domes­tic short­-haired (mixed) male 12 15 0.80
USA 2.00 Maine coon male 2 2 1.00
USA 2.00 Other female 3 4 0.75
USA 2.00 Other male 3 3 1.00
USA 3.00 domes­tic long-haired (mixed) female 1 2 0.50
USA 3.00 domes­tic long-haired (mixed) male 3 3 1.00
USA 3.00 domes­tic short­-haired (mixed) female 9 12 0.75
USA 3.00 domes­tic short­-haired (mixed) male 10 10 1.00
USA 3.00 Other female 1 1 1.00
USA 3.00 Rag­doll male 1 1 1.00
USA 4.00 domes­tic short­-haired (mixed) female 4 4 1.00
USA 4.00 domes­tic short­-haired (mixed) male 1 1 1.00
USA 4.00 Rag­doll female 1 1 1.00
USA 4.00 Russ­ian Blue female 1 1 1.00
USA 5.00 domes­tic short­-haired (mixed) female 2 3 0.67
USA 5.00 domes­tic short­-haired (mixed) male 1 1 1.00
USA 5.00 Other female 0 1 0.00
USA 6.00 domes­tic long-haired (mixed) male 1 1 1.00
USA 6.00 domes­tic short­-haired (mixed) female 2 2 1.00
USA 6.00 domes­tic short­-haired (mixed) male 1 1 1.00
USA 7.00 domes­tic short­-haired (mixed) female 1 1 1.00
USA 8.00 domes­tic long-haired (mixed) male 1 1 1.00
USA 9.00 domes­tic short­-haired (mixed) male 1 1 1.00
USA 12.00 domes­tic short­-haired (mixed) female 1 1 1.00
USA 17.00 Maine coon male 1 1 1.00
Viet­nam 4.00 Other male 0 1 0.00

Analysis

The pri­mary ques­tions here are:

  1. what is the rate of cat­nip response by coun­try?
  2. is there a demand bias? If so, what is #1 adjusted for that?
  3. do any of the covari­ates pre­dict cat­nip respon­se? In descend­ing order of plau­si­bil­i­ty: age, breed, sex, neuter sta­tus, fur col­or, and owner sex/education/age.
  4. how does the cat­nip response cor­re­late with responses to the alter­na­tives like sil­vervine? Can responses to one be used to pre­dict the oth­ers, giv­ing insight into the bio­log­i­cal mech­a­nisms or guide own­ers in selec­tion?
  5. what do humans think of cat­nip as a herbal rem­e­dy?
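
Question 2 can be previewed crudely before any modeling by comparing response rates for first-listed cats against the rest. This is a sketch on invented toy data, not the survey's results; the column names match the cleaned dataset, and Fisher's exact test is just one convenient choice for a 2x2 table.

```r
# Toy data: 4 first-listed cats and 4 later-listed cats (values invented).
toy <- data.frame(First = rep(c(TRUE, FALSE), each = 4),
                  Cat.response.Catnip = c(TRUE, TRUE, TRUE, FALSE,
                                          TRUE, FALSE, FALSE, TRUE))

# Response rate within each group (rows come out ordered First = FALSE, then TRUE).
rates <- aggregate(Cat.response.Catnip ~ First, data = toy, FUN = mean)
rates

# 2x2 table of listing position vs response, and an exact test of association.
tab <- table(First = toy$First, Responded = toy$Cat.response.Catnip)
fisher.test(tab)$p.value
```

A markedly higher rate among first-listed cats would be consistent with the hypothesized demand or recollection bias, though it cannot distinguish between the two.
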
Catnip Response Rates

Since the other drugs are too rare to be worth ana­lyz­ing, I focus on the cat­nip respons­es.

I set up a Bayesian mul­ti­level logis­tic regres­sion model in brms/Stan to inves­ti­gate. Notes on model details:

  • each row is a sin­gle cat’s respon­se, true/false, so it uses the bino­mial fam­ily (Bernoulli to be more speci­fic, which is faster in Stan appar­ent­ly) & is a logis­tic regres­sion

  • non-cat­nip responses are excluded due to miss­ing­ness (although one could try to use brm_multiple’s sup­port for data impu­ta­tion, I ran into prob­lems installing MICE)

  • coun­try & ID are treated as ran­dom effects to nest rat­ings in

  • most of the covari­ates are categorical/logical, but the two age vari­ables are con­tin­u­ous; in many datasets, age is a non­lin­ear vari­able, and treated as a qua­dratic or higher poly­no­mi­al, but brms sup­ports splines so the human/cat age vari­ables are given splines in case of any non­lin­ear­i­ty. Non­lin­ear­ity is also plau­si­ble here because cat­nip response only emerges at a par­tic­u­lar age and the spec­u­la­tion about being related to feline sex­ual func­tions would also sug­gest pos­si­ble trends like a qua­drat­ic.

  • strong infor­ma­tive pri­ors are used to keep esti­mates in ranges we know to be true and make the model stabler/faster:

    • a horseshoe prior is put on most of the covariates: the horseshoe is like the lasso in inducing sparsity, as is appropriate in this case because the categorical variables have many levels, and I a priori expect that most of the covariates are irrelevant (except for country and first-rating). Conveniently, brms’s horseshoe allows specifying the expected ratio of nonzero to zero coefficients (par_ratio), which I set at 0.2, ie. about 1 nonzero coefficient in 6 (which is being generous).
    • a normal prior of $\mathcal{N}(0, 0.3)$ is put on the country-level random effects (in the code, on the group-level standard deviations, class="sd"). The parameterization of the logistic model uses log-odds/logits, so the probabilities must be transformed. 0.3 here roughly reflects the observed distribution of country-level probabilities in the current meta-analysis, from 40-80%, which is roughly 1 logit on either side of the overall mean, and divided by 3, gives ~0.3. This will provide reasonable per-country results, especially in countries for which few respondents are available (most of them, given the skew to the USA).
    • another normal prior of $\mathcal{N}(0.66, 0.1)$ is put on the overall intercept (the base rate of catnip response). The meta-analysis gives a global probability of catnip response of ~0.66, the logit of which is also 0.66. As the meta-analysis is fairly large, it is highly unlikely it’s off by too much, so it gets a narrower SD.
    • final­ly, a stronger prior is put on the spline degrees: if there is non­lin­ear­i­ty, it should be of quite low degree, not much more than qua­dratic - more is implau­si­ble and can­not be esti­mated from this dataset any­way
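
The logit arithmetic behind these prior choices can be checked in base R, where qlogis() is the logit and plogis() its inverse. Reading “1 logit” as the distance on either side of the overall mean is my interpretation of the text, not something the source states explicitly.

```r
# Logit of the ~0.66 meta-analytic base rate (coincidentally also about 0.66).
base_rate_logit <- qlogis(0.66)

# The 40-80% spread of country-level rates, expressed on the logit scale.
range_logit <- qlogis(c(0.40, 0.80))

# Probabilities 1 logit below and above a 0.66-logit mean.
band <- plogis(0.66 + c(-1, 1))

round(c(base_rate_logit, range_logit, band), 2)
```

Under that reading, an SD of ~0.3 places the 40-80% range at about 3 standard deviations from the mean.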
library(brms)
c <- brm(Cat.response.Catnip ~ (1|ID) + (1|Owner.country) + # random effects
         s(Cat.age) + s(Owner.age) + # splines for possible nonlinearities
         First + Owner.education + Owner.sex + Cat.neuter + Cat.breed + Cat.fur.color + Cat.sex, # covariates
     # informative priors:
     prior=c(set_prior("horseshoe(1, par_ratio=0.2)"), prior(student_t(3,0,1), class="sds"), prior(normal(0,0.3), class="sd"),
             prior(normal(0.66,0.1), class="Intercept")),
     family=bernoulli(), iter=20000, control=list(adapt_delta=0.90), data=catnip); summary(c)
# ...   Data: catnip (Number of observations: 288)
# Samples: 4 chains, each with iter = 20000; warmup = 10000; thin = 1;
#          total post-warmup samples = 40000
#
# Smooth Terms:
#                   Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# sds(sCat.age_1)       0.88      0.82     0.03     2.98      40000 1.00
# sds(sOwner.age_1)     0.76      0.63     0.03     2.33      29984 1.00
#
# Group-Level Effects:
# ~ID (Number of levels: 184)
#               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# sd(Intercept)     0.22      0.17     0.01     0.62      22873 1.00
#
# ~Owner.country (Number of levels: 22)
#               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# sd(Intercept)     0.62      0.16     0.35     0.96      40000 1.00
#
# Population-Level Effects:
#                                                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# Intercept                                                  0.66      0.25     0.03     1.02      26509 1.00
# FirstTRUE                                                  0.09      0.23    -0.04     0.84      19389 1.00
# Owner.educationbachelors                                   0.03      0.11    -0.07     0.38      30613 1.00
# Owner.educationhighschool                                 -0.02      0.10    -0.26     0.09      33568 1.00
# Owner.educationmasters                                    -0.01      0.09    -0.18     0.12      40000 1.00
# Owner.educationPhD                                         0.00      0.10    -0.14     0.18      40000 1.00
# Owner.educationprofessionaldegreeMDDJDDetc                -0.02      0.12    -0.29     0.11      36396 1.00
# Owner.sexmale                                              0.00      0.07    -0.11     0.16      40000 1.00
# Owner.sexother                                             0.01      0.19    -0.16     0.23      40000 1.00
# Cat.neuterTRUE                                            -0.02      0.14    -0.28     0.12      23840 1.00
# Cat.breeddomesticshortMhairedmixed                         0.01      0.09    -0.09     0.25      40000 1.00
# Cat.breedMainecoon                                         0.01      0.13    -0.15     0.21      40000 1.00
# Cat.breedOther                                            -0.00      0.09    -0.15     0.14      12387 1.00
# Cat.breedRagdoll                                           0.01      0.19    -0.18     0.22      35306 1.00
# Cat.breedRussianBlue                                      -0.01      0.16    -0.26     0.14      18718 1.00
# Cat.fur.colorblackMandMwhite                               0.07      0.25    -0.07     0.92      20627 1.00
# Cat.fur.colorcalico                                       -0.04      0.21    -0.67     0.09      24589 1.00
# Cat.fur.colorcolorpointsdarkextremitylighterbody           0.00      0.12    -0.17     0.17      40000 1.00
# Cat.fur.colorgray                                         -0.01      0.11    -0.23     0.12      36522 1.00
# Cat.fur.colorgrayMandMwhite                               -0.03      0.14    -0.39     0.09      29663 1.00
# Cat.fur.colorOther                                         0.04      0.15    -0.07     0.50      24049 1.00
# Cat.fur.colorTabbyorangecreamandbuff                      -0.00      0.08    -0.15     0.15      40000 1.00
# Cat.fur.colorTorbietortoiseshellcolorswithtabbypattern    -0.02      0.12    -0.30     0.10      34896 1.00
# Cat.sexmale                                                0.01      0.08    -0.08     0.20      40000 1.00
# sCat.age_1                                                 0.01      0.07    -0.06     0.21      34748 1.00
# sOwner.age_1                                              -0.01      0.07    -0.19     0.08      14852 1.00

## Grand mean/intercept converted to probability:
library(boot)
inv.logit(0.66)
# [1] 0.6592603885

## Country-level random-effects:
re <- ranef(c)$Owner.country
round(digits=2, inv.logit(re))
# , , Intercept
#
#                Estimate Est.Error 2.5%ile 97.5%ile
# Australia          0.58      0.64    0.31     0.83
# Belarus            0.53      0.65    0.25     0.80
# Canada             0.59      0.60    0.41     0.77
# Czech Republic     0.55      0.64    0.29     0.81
# Finland            0.56      0.65    0.29     0.82
# France             0.58      0.64    0.32     0.83
# Germany            0.37      0.65    0.14     0.64
# India              0.45      0.63    0.21     0.71
# Ireland            0.53      0.65    0.26     0.80
# Italy              0.45      0.63    0.21     0.70
# Japan              0.57      0.65    0.30     0.83
# Latvia             0.53      0.65    0.25     0.80
# Netherlands        0.56      0.65    0.29     0.82
# New Zealand        0.55      0.65    0.28     0.81
# Panama             0.53      0.65    0.25     0.80
# Russia             0.44      0.65    0.18     0.72
# Slovenia           0.50      0.64    0.24     0.76
# Spain              0.47      0.64    0.21     0.74
# Sweden             0.60      0.64    0.34     0.84
# United Kingdom     0.62      0.59    0.44     0.78
# USA                0.77      0.56    0.68     0.85
# Vietnam            0.44      0.65    0.17     0.72

## Posterior for a generic set of covariates, by country:
round(digits=2, inv.logit(fitted(c, data.frame(Cat.response.Catnip=NA, ID=3, Owner.country=row.names(re), First=FALSE,
    Owner.education="masters", Owner.age=20, Cat.neuter=TRUE, Cat.breed="Other", Cat.sex="male", Owner.sex="male", Cat.fur.color="Other", Cat.age=3))))
#                Estimate Est.Error 2.5%ile 97.5%ile
# Australia          0.68      0.53    0.61     0.72
# Belarus            0.67      0.53    0.60     0.72
# Canada             0.68      0.52    0.63     0.71
# Czech Republic     0.67      0.53    0.60     0.72
# Finland            0.67      0.53    0.60     0.72
# France             0.68      0.53    0.61     0.72
# Germany            0.64      0.54    0.56     0.70
# India              0.66      0.54    0.58     0.71
# Ireland            0.67      0.53    0.60     0.72
# Italy              0.66      0.54    0.58     0.71
# Japan              0.68      0.53    0.61     0.72
# Latvia             0.67      0.53    0.59     0.72
# Netherlands        0.68      0.53    0.61     0.72
# New Zealand        0.67      0.53    0.60     0.72
# Panama             0.67      0.54    0.59     0.72
# Russia             0.65      0.54    0.57     0.71
# Slovenia           0.67      0.53    0.59     0.71
# Spain              0.66      0.54    0.58     0.71
# Sweden             0.68      0.53    0.62     0.72
# United Kingdom     0.69      0.52    0.64     0.72
# USA                0.71      0.51    0.68     0.72
# Vietnam            0.65      0.54    0.57     0.71

The model-fitting shows that most of the covariates are unable to predict the catnip response and are best estimated with near-zero coefficients, and that the splines are linear & also irrelevant; the only variables which appear to matter are the owner/country random effects, and being the first cat. This agrees with the meta-analysis finding age/sex/breed unhelpful but country important, and with my belief that some sort of bias is driving the anomalously high raw catnip rates: the First covariate is one of the strongest, and after adjusting for it, the catnip rate looks like it should. (The country-level estimates also look reasonably consistent with the GS survey, given their large uncertainties, eg. USA is 71% here and 79% there.)

Since we can rule out most of the variables, it would be a lot easier to work with a subset of the variables (which also avoids missingness, raising the sample size from n = 288 to n = 351, giving a double boost of fewer irrelevant variables + more data):

c2 <- brm(Cat.response.Catnip ~ (1|ID) + (1|Owner.country) + First,
    prior=c(prior(normal(0,0.3), class="sd"), prior(normal(0.66,0.1), class="Intercept")),
    family=bernoulli(), iter=20000, data=catnip); summary(c2)
# ...Data: catnip (Number of observations: 351)
# Samples: 4 chains, each with iter = 20000; warmup = 10000; thin = 1;
#          total post-warmup samples = 40000
#
# Group-Level Effects:
# ~ID (Number of levels: 218)
#               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# sd(Intercept)     0.24      0.18     0.01     0.65      14720 1.00
#
# ~Owner.country (Number of levels: 23)
#               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# sd(Intercept)     0.64      0.15     0.37     0.97      25337 1.00
#
# Population-Level Effects:
#           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# Intercept     0.18      0.22    -0.25     0.60      40000 1.00
# FirstTRUE     0.93      0.31     0.33     1.53      40000 1.00

## Country-level posterior estimates:
re2 <- ranef(c2)$Owner.country
predictions <- round(digits=2, inv.logit(fitted(c2, data.frame(Cat.response.Catnip=NA, ID=1, Owner.country=row.names(re2), First=FALSE))))
cbind(Country=as.factor(row.names(re2)), as.data.frame(predictions))
#           Country Estimate Est.Error 2.5%ile 97.5%ile
# 1       Australia     0.63      0.54    0.56     0.70
# 2         Austria     0.64      0.54    0.56     0.70
# 3         Belarus     0.64      0.54    0.56     0.70
# 4          Canada     0.66      0.53    0.61     0.70
# 5  Czech Republic     0.65      0.54    0.57     0.71
# 6         Finland     0.64      0.54    0.57     0.70
# 7          France     0.65      0.54    0.58     0.71
# 8         Germany     0.61      0.54    0.54     0.68
# 9           India     0.62      0.54    0.55     0.69
# 10        Ireland     0.64      0.54    0.56     0.70
# 11          Italy     0.63      0.54    0.55     0.69
# 12          Japan     0.65      0.54    0.57     0.71
# 13         Latvia     0.66      0.53    0.59     0.71
# 14    Netherlands     0.65      0.54    0.57     0.71
# 15    New Zealand     0.64      0.54    0.57     0.70
# 16         Panama     0.64      0.54    0.56     0.70
# 17         Russia     0.62      0.54    0.54     0.69
# 18       Slovenia     0.63      0.54    0.56     0.70
# 19          Spain     0.63      0.54    0.55     0.69
# 20         Sweden     0.66      0.53    0.59     0.71
# 21 United Kingdom     0.67      0.52    0.62     0.71
# 22            USA     0.69      0.51    0.66     0.71
# 23        Vietnam     0.62      0.54    0.54     0.69

The First variable emerges as a powerful bias in the results: its coefficient is 0.93 on the logit scale, with a 95% interval of 0.33–1.53 that comfortably excludes zero. The rest remains largely as before: heavy shrinkage around the common mean, as there’s insufficient data to estimate most countries with any reasonable accuracy (even the USA).
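
To get a feel for the size of the First effect, the logit-scale coefficients can be converted into probabilities. This is a minimal sketch in base R, assuming only the posterior means reported in the summary of c2 (0.18 for the intercept, 0.93 for FirstTRUE) and ignoring both the random effects and the posterior uncertainty; the variable names are mine, not part of the original analysis:

```r
## Posterior means copied from the summary(c2) output (logit scale):
intercept <- 0.18
firstTRUE <- 0.93

## plogis() is base R's inverse-logit, equivalent to boot::inv.logit():
pNotFirst <- plogis(intercept)             # First=FALSE, random effects set to zero
pFirst    <- plogis(intercept + firstTRUE) # First=TRUE,  random effects set to zero
oddsRatio <- exp(firstTRUE)                # multiplicative change in odds from First=TRUE

round(digits=2, c(notFirst=pNotFirst, first=pFirst, oddsRatio=oddsRatio))
```

On these point estimates, First=TRUE moves the predicted response rate from roughly 54% to roughly 75%, ie. about 2.5× the odds.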

Intercorrelations of catnip & catnip alternatives

Since the cat-level ratings will be affected by missingness as well, any attempt to correlate responses or extract a latent factor will be even more imprecise than the missingness percentages suggest. Still, we can look at the Pearson correlations and the tetrachoric correlations as well (since catnip response is treatable as a liability threshold model, it’s reasonable to imagine the other drugs likewise):

## Examine intercorrelations of the drug responses: simple Pearson's r, then more sophisticated tetrachoric correlation:
responses <- subset(catnip, select=c(Cat.response.Catnip, Cat.response.Valerian, Cat.response.Silvervine,
                                     Cat.response.Thyme, Cat.response.Honeysuckle))
colnames(responses) <- c("Catnip", "Valerian", "Silvervine", "Thyme", "Honeysuckle")
round(digits=2, cor(use="pairwise.complete.obs", responses))
#             Catnip Valerian Silvervine Thyme Honeysuckle
# Catnip        1.00
# Valerian      0.07     1.00
# Silvervine    0.13     0.71       1.00
# Thyme         0.25     0.72       1.00  1.00
# Honeysuckle   0.21     0.75       1.00  0.92        1.00
library(psych)
tc <- tetrachoric(responses); tc
# tetrachoric correlation
#             Catnp Valrn Slvrv Thyme Hnysc
# Catnip       1.00
# Valerian     0.13  1.00
# Silvervine  -0.18  0.68  1.00
# Thyme        0.36  0.76  0.84  1.00
# Honeysuckle  0.39  0.81  0.82  0.98  1.00
#
#  with tau of
#      Catnip    Valerian  Silvervine       Thyme Honeysuckle
#       -1.05       -0.41        0.67        0.29        0.27
## cutpoints for responder percentages reported:
round(digits=2, 1-pnorm(tc$tau))
#     Catnip    Valerian  Silvervine       Thyme Honeysuckle
#       0.85        0.66        0.25        0.39        0.40
fa(nfactors=1, responses)
#  ...Warning: A Heywood case was detected.
# Standardized loadings (pattern matrix) based upon correlation matrix
#              MR1    h2     u2 com
# Catnip      0.18 0.033  0.967   1
# Valerian    0.73 0.536  0.464   1
# Silvervine  1.00 1.003 -0.003   1
# Thyme       0.98 0.952  0.048   1
# Honeysuckle 0.98 0.969  0.031   1
#
#                 MR1
# SS loadings    3.49
# Proportion Var 0.70
# ...
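
The liability-threshold reading behind the tetrachoric correlations can be illustrated with a small simulation: each cat has a latent normal 'liability' per drug, and responds when it exceeds a threshold tau. This is only a sketch; the latent correlation of 0.7 is an arbitrary illustrative value, the two thresholds are borrowed from the catnip & silvervine tau printed above, and all variable names are hypothetical:

```r
set.seed(2017)
n    <- 100000
rho  <- 0.7    # assumed latent correlation (illustrative only, not an estimate)
tauA <- -1.05  # threshold borrowed from the catnip tau above
tauB <-  0.67  # threshold borrowed from the silvervine tau above

## Two correlated standard-normal liabilities:
liabilityA <- rnorm(n)
liabilityB <- rho*liabilityA + sqrt(1 - rho^2)*rnorm(n)

## A cat 'responds' when its liability exceeds the threshold:
respondsA <- liabilityA > tauA
respondsB <- liabilityB > tauB

## Simulated responder rates should match 1 - pnorm(tau):
round(digits=2, rbind(simulated=c(A=mean(respondsA), B=mean(respondsB)),
                      expected =c(A=1 - pnorm(tauA), B=1 - pnorm(tauB))))

## Pearson's r on the dichotomized responses is attenuated relative to the latent rho:
phi <- cor(respondsA, respondsB)
round(digits=2, c(latent=rho, pearson=phi))
```

Dichotomizing shrinks the observed correlation, more so when the thresholds are far from the median. That is one reason to look at tetrachoric correlations whenever a threshold model is plausible.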

The correlations are interesting: there might be a cluster of valerian/thyme/honeysuckle/silvervine responders, and then catnip is relatively unrelated. This might be connected to the demand bias: true catnip response data might show larger correlations. Factor analysis is not a good idea here (note the Heywood case warning), but if one extracts a single factor, it looks similar to the cluster. But these intercorrelations contrast with those in the Bol et al 2017 data:

bol2017 <- read.csv("https://www.gwern.net/docs/catnip/2017-bol-cats.csv")
bol2017[,5:8] <- (bol2017[,5:8] > 0) # treat '5'/'10' (weak/strong response) as binary
bol2017Responses <- subset(bol2017, select=c("Catnip", "Valerian", "Silver.vine", "Tatarian.honeysuckle"))
colnames(bol2017Responses) <- c("Catnip", "Valerian", "Silvervine", "Honeysuckle") # rename for consistency
library(skimr)
skim(bol2017Responses)
#  n obs: 100
#  n variables: 4
#
# Variable type: logical
#     variable missing complete   n mean                   count
#       Catnip       1       99 100 0.68 TRU: 67, FAL: 32, NA: 1
#  Honeysuckle       3       97 100 0.53 TRU: 51, FAL: 46, NA: 3
#   Silvervine       0      100 100 0.79 TRU: 79, FAL: 21, NA: 0
#     Valerian       4       96 100 0.47 FAL: 51, TRU: 45, NA: 4

## calculate correlations & repeat factor analysis:
library(psych)
round(digits=2, cor(use="pairwise.complete.obs", bol2017Responses))
#             Catnip Valerian Silvervine Honeysuckle
# Catnip        1.00
# Valerian      0.38     1.00
# Silvervine    0.14     0.26       1.00
# Honeysuckle   0.27     0.22       0.13        1.00
tetrachoric(bol2017Responses)
#             Catnp Valrn Slvrv Hnysc
# Catnip      1.00
# Valerian    0.60  1.00
# Silvervine  0.24  0.47  1.00
# Honeysuckle 0.43  0.35  0.23  1.00
#
#  with tau of
#      Catnip    Valerian  Silvervine Honeysuckle
#      -0.459       0.078      -0.806      -0.065
fa(bol2017Responses, nfactors=1)
# ...Standardized loadings (pattern matrix) based upon correlation matrix
#              MR1   h2   u2 com
# Catnip      0.58 0.34 0.66   1
# Valerian    0.65 0.43 0.57   1
# Silvervine  0.32 0.10 0.90   1
# Honeysuckle 0.40 0.16 0.84   1
#
#                 MR1
# SS loadings    1.03
# Proportion Var 0.26
# ...

But since the sample sizes are small, perhaps pooling the data for factor analysis (possible since we have individual-level data) will be helpful:

## Merge Bol et al 2017 + convenience sample data:
responsesAll <- merge(responses, bol2017Responses, all=TRUE)
## delete Thyme as causing too much missingness:
responsesAll$Thyme <- NULL
skim(responsesAll)
# Skim summary statistics
#  n obs: 545
#  n variables: 4
#
# Variable type: logical
#     variable missing complete   n mean                      count
#       Catnip      40      505 545 0.83  TRU: 421, FAL: 84, NA: 40
#  Honeysuckle     350      195 545 0.46 NA: 350, FAL: 106, TRU: 89
#   Silvervine     375      170 545 0.61 NA: 375, TRU: 103, FAL: 67
#     Valerian     347      198 545 0.48 NA: 347, FAL: 102, TRU: 96
round(digits=2, cor(use="pairwise.complete.obs", responsesAll))
#             Catnip Valerian Silvervine Honeysuckle
# Catnip        1.00
# Valerian      0.21     1.00
# Silvervine   -0.09     0.56       1.00
# Honeysuckle   0.16     0.54       0.53        1.00
tetrachoric(responsesAll)
# tetrachoric correlation
#             Catnp Valrn Slvrv Hnysc
# Catnip       1.00
# Valerian     0.38  1.00
# Silvervine  -0.16  0.82  1.00
# Honeysuckle  0.28  0.74  0.77  1.00
#
#  with tau of
#      Catnip    Valerian  Silvervine Honeysuckle
#      -0.969       0.038      -0.269       0.109
fa(responsesAll, nfactors=1)
# Standardized loadings (pattern matrix) based upon correlation matrix
#              MR1    h2   u2 com
# Catnip      0.13 0.018 0.98   1
# Valerian    0.78 0.610 0.39   1
# Silvervine  0.70 0.494 0.51   1
# Honeysuckle 0.73 0.528 0.47   1
#
#                 MR1
# SS loadings    1.65
# Proportion Var 0.41

The combined data hints at a catnip vs non-catnip factor, but in general there are enough correlations that it may be possible to usefully include responses in deciding what to try: a non-response to catnip suggests trying silvervine next, as silvervine response is common and may be inversely correlated with catnip response, while a catnip response would suggest valerian as an alternative if an owner wanted to mix things up or use during tolerance.
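
That suggestion can be made slightly more concrete by turning a Pearson (phi) correlation between two binary responses, plus their marginal rates, back into conditional probabilities. This is a rough sketch, assuming the pooled marginals (catnip 0.83, silvervine 0.61) and the pooled r = −0.09 all describe the same cats, which is not exactly true since the correlation is pairwise-complete; the function name is hypothetical:

```r
## For binary A & B: P(A & B) = pA*pB + r*sqrt(pA*(1-pA) * pB*(1-pB))
conditionalResponse <- function(pA, pB, r) {
    pBoth <- pA*pB + r*sqrt(pA*(1-pA) * pB*(1-pB))
    c(B.given.A    = pBoth / pA,
      B.given.notA = (pB - pBoth) / (1 - pA))
}

## A = catnip, B = silvervine; values copied from the pooled skim()/cor() output:
silvervine <- conditionalResponse(pA=0.83, pB=0.61, r=-0.09)
round(digits=2, silvervine)
```

Under these assumptions, a catnip non-responder would have roughly a 71% chance of responding to silvervine versus roughly 59% for a catnip responder; given the small samples and heavy missingness, this is a hint rather than an estimate.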

Human consumption

For kicks, I added questions about human use of catnip as a herbal remedy. Catnip tea has been used to calm infants for centuries, among other things, and an old marijuana-related rumor claims you can get high off it (turns out that smoking catnip doesn’t work well).

catnipOld <- read.csv("https://www.gwern.net/docs/catnip/2017-01-02-catnipsurvey-conveniencesample.csv")
catnipOld[!is.na(catnipOld$Owner.catnip.consumption) & catnipOld$Owner.catnip.consumption=="Wait, is this survey for me or my cat? I'm even more confused",]$Owner.catnip.consumption <- NA

library(qdapTools)
## split each string-list by comma-space delimiter, and unfold the list into a dataframe of dummy variables, one per consumption method:
catnipUse <- mtabulate(strsplit(as.character(catnipOld$Owner.catnip.consumption), ', '))
catnipUse$Rating <- catnipOld$Owner.catnip.consumption.efficacy
catnipUse <- catnipUse[!catnipUse$"NA",]
catnipUse <- catnipUse[!catnipUse$"FALSE",]
catnipUse$Smoked <- catnipUse$"Smelled burning catnip leaves" + catnipUse$"smoked catnip leaves"
catnipUse$Tea <- catnipUse$"valerian tea" + catnipUse$"catnip tea"
catnipUse$"NA" <- catnipUse$"FALSE" <- catnipUse$"Smelled burning catnip leaves" <- catnipUse$"smoked catnip leaves" <- catnipUse$"valerian tea" <- catnipUse$"catnip tea" <- NULL
colnames(catnipUse) <- c("Eaten.leaves.dry", "Eaten.leaves.fresh", "Skin.essential.oil", "Skin.leaves", "Eaten.leaves.food", "Rating", "Smoked", "Tea")
skim(catnipUse)
# Skim summary statistics
#  n obs: 221
#  n variables: 8
#
# Variable type: integer
#            variable missing complete   n   mean    sd p0 p25 p50 p75 p100     hist
#    Eaten.leaves.dry       0      221 221 0.0045 0.067  0   0   0   0    1 ▇▁▁▁▁▁▁▁
#   Eaten.leaves.food       0      221 221 0.0045 0.067  0   0   0   0    1 ▇▁▁▁▁▁▁▁
#  Eaten.leaves.fresh       0      221 221 0.0045 0.067  0   0   0   0    1 ▇▁▁▁▁▁▁▁
#              Rating     178       43 221 2.33   1.19   1   1   2   3    5 ▇▃▁▇▁▂▁▁
#  Skin.essential.oil       0      221 221 0.0045 0.067  0   0   0   0    1 ▇▁▁▁▁▁▁▁
#         Skin.leaves       0      221 221 0.0045 0.067  0   0   0   0    1 ▇▁▁▁▁▁▁▁
#              Smoked       0      221 221 0.063  0.24   0   0   0   0    1 ▇▁▁▁▁▁▁▁
#                 Tea       0      221 221 0.1    0.31   0   0   0   0    1 ▇▁▁▁▁▁▁▁
colSums(catnipUse)
# Eaten.leaves.dry Eaten.leaves.fresh Skin.essential.oil        Skin.leaves  Eaten.leaves.food             Rating             Smoked
#                1                  1                  1                  1                  1                 NA                 14
#              Tea
#               23
sort(decreasing=TRUE, colSums(mtabulate(strsplit(as.character(catnipOld$Owner.catnip.consumption.reason), ', '))))
#                                                                     curiosity
#                                                                            15
#                                                         euphoria/getting high
#                                                                             9
#                                                           relaxation/sedation
#                                                                             9
#                                                                   indigestion
#                                                                             2
#                                                                      insomnia
#                                                                             2
#                                                                 Experimenting
#                                                                             1
#                                                                     headaches
#                                                                             1
#                                                                    I like tea
#                                                                             1
#                                                             I like the taste
#                                                                             1
# i was under the age of 15 thinking it would get me NAbuzzedNA. It did nothing
#                                                                             1
#                                                                 Just because
#                                                                             1
#                                                                 Just to do it
#                                                                             1
#                                               Mixed with lactation supplement
#                                                                             1
#                                                                            NA
#                                                                             1
#                                                                stomach cramps
#                                                                             1
#                             Trying it out as a substitute/mixer for marijuana
#                                                                             1

The “mixed with lactation supplement” reason throws me for a loop (how exactly does that work?), but the rest of the reasons make sense, although I’m a little surprised that the rumor of catnip getting you high is enough reason for quite a large fraction of catnip users to give it a try. (Perhaps catnip is much more accessible than a lot of other things, so people figure they might as well.)

Does the kind of catnip consumption affect the user’s estimate of how well it works? Using the 43 ratings, we can fit an ordinal regression model to the 1-5 Likert scale rating of efficacy against the method/kind of catnip consumption reported:

h <- brm(Rating ~ Eaten.leaves.dry + Eaten.leaves.fresh + Skin.essential.oil + Skin.leaves + Eaten.leaves.food + Smoked + Tea,
    prior=c(set_prior("horseshoe(1, par_ratio=0.1)")), family=cumulative(), iter=20000, control=list(adapt_delta=0.9), data=catnipUse)
summary(h)
#    Data: catnipUse (Number of observations: 43)
# Samples: 4 chains, each with iter = 20000; warmup = 10000; thin = 1;
#          total post-warmup samples = 40000
#
# Population-Level Effects:
#                    Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
# Intercept[1]          -0.79      0.44    -1.76    -0.02      21342 1.00
# Intercept[2]          -0.02      0.40    -0.87     0.74      26825 1.00
# Intercept[3]           1.89      0.50     0.97     2.95      40000 1.00
# Intercept[4]           3.42      0.85     2.00     5.33      40000 1.00
# Eaten.leaves.dry       0.03      0.34    -0.45     0.87      35826 1.00
# Eaten.leaves.fresh     0.10      0.48    -0.32     1.60      23173 1.00
# Skin.essential.oil    -0.03      0.33    -0.79     0.44      33813 1.00
# Skin.leaves           -0.11      0.58    -1.67     0.33      21175 1.00
# Eaten.leaves.food      0.02      0.32    -0.49     0.78      38330 1.00
# Smoked                -0.39      0.66    -2.13     0.07      10463 1.00
# Tea                    0.08      0.27    -0.16     0.96      23907 1.00

The most striking coefficient is smoking (−0.39, ie. lower efficacy ratings), presumably because catnip doesn’t get you high. And it looks plausible that fresh leaves or tea are the best ways to get the relaxant effects, although every one of the 95% intervals includes zero.
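
To see what a coefficient of that size means on the 1-5 scale, the cumulative-logit thresholds can be converted into rating probabilities. This is a minimal sketch using only the posterior means from the summary of h, and assuming the usual cumulative-logit parameterization in which P(Rating ≤ k) = inverse-logit(threshold_k − eta) and the reported thresholds refer to all predictors at zero; the helper name is hypothetical:

```r
## Posterior means copied from the summary(h) output:
thresholds <- c(-0.79, -0.02, 1.89, 3.42)
bSmoked    <- -0.39

## P(Rating <= k) = plogis(threshold_k - eta), so the 5 category probabilities
## are successive differences of the cumulative curve:
ratingProbs <- function(eta) { diff(c(0, plogis(thresholds - eta), 1)) }

baseline <- ratingProbs(0)        # all consumption indicators at zero
smoked   <- ratingProbs(bSmoked)  # Smoked=1, everything else zero

round(digits=2, rbind(baseline, smoked))
## Expected rating on the 1-5 scale under each:
expected <- c(baseline=sum(1:5 * baseline), smoked=sum(1:5 * smoked))
round(digits=2, expected)
```

On these point estimates the expected rating falls by only about a quarter of a point for smoking, so even the most striking coefficient corresponds to a modest shift.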


  1. One interesting issue with running the German version of the survey in Google Surveys was dealing with a bug in Google Translate (GT). I noticed while examining the suggested translations in GT that it was mistakenly translating both “I have a cat” and “I do not have a cat” to the same German sentence, Ich habe eine Katze, which translated back as the single sentence “I have a cat” (failing the roundtrip criterion of identity, that En==En(De(En))); eine/keine are the positive & negative forms of the German indefinite article (“a cat” vs “no cat”), so it was a major error. I thought nothing of it until GS refused to run the German survey, warning me that it needed editing so all respondents could answer the question, and suggesting that the response option “I have a cat” should be changed or a “NA” response added. Apparently GS automatically translates the non-English surveys back into English & checks that the responses are mutually exhaustive, so the first response was erroneously translated into English as “I have a cat” (rather than “I do not have a cat”) and the set of responses was indeed malformed (as all the responses then presume having a cat). After submitting a correction to GT & explaining this to the GS help desk, I was surprised & pleased they restarted the survey within a few minutes.↩︎