Alerts Over Time

Does Google Alerts return fewer results each year? A statistical investigation
statistics, shell, R, Google
2013-07-01 to 2013-11-26; finished; certainty: likely; importance: 4

Has Google Alerts been send­ing fewer re­sults the past few years? Yes. Re­spond­ing to ru­mors of its demise, I in­ves­ti­gate the num­ber of re­sults in my per­sonal Google Alerts no­ti­fi­ca­tions 2007-2013, and find no over­all trend of de­cline un­til I look at a tran­si­tion in mid-2011 where the re­sults fall dra­mat­i­cal­ly. I spec­u­late about the cause and im­pli­ca­tions for Alert­s’s fu­ture.

While researching my essay on how long Google products survive before being killed (inspired by Reader), I came across speculation that Google Alerts, a service which runs search queries on your behalf & emails you about any new matching webpages (extremely useful for keeping abreast of topics and one of the oldest Google services), had broken badly in 2012. When I saw this, I remembered thinking that my own alerts did not seem to be as useful as they once had been, but I hadn't been sure if this was Alerts's fault or if my particular keywords were just less active than in the past. Google's official comments on the topic have been minimal.[1]

Alerts dying would be a problem for me as I have used Alerts extensively since 2007-01-28 (2347 days) with 23 current Alerts (and many more in the past) - of my 501,662 total emails, 3,815 were Alert emails - and there did not seem to be any usable alternatives.[2] Troublingly, Alerts's RSS feeds were unavailable between July & September 2013.

As it hap­pened, the sur­vival model sug­gested that Alerts had a good chance of sur­viv­ing a long time, and I put it from mind un­til I re­mem­bered that since I had used Alerts for so many years and had so many emails, I could eas­ily check the claims em­pir­i­cal­ly—did Alerts abruptly stop re­turn­ing many hits? This is a straight­for­ward ques­tion to an­swer: ex­tract the subject/date/number of links from each Alerts email, strat­ify by unique alert, and regress over time. So I did.


As part of my backup procedures, I fetch daily my Gmail emails using getmail4 into a maildir. Alerts uses an unchanging subject line like Subject: Google Alert - "Frank Herbert" -mason, so it is easy to find all its emails and separate them out.

find ~/mail/ -type f -exec fgrep -l 'Google Alert -' {} \;
# /home/gwern/mail/new/1282125775.M532208P12203Q683Rb91205f53b0fec0d.craft
# /home/gwern/mail/new/1282125789.M55800P12266Q737Rd98db4aa1e58e9ed.craft
# ...
find ~/mail/ -type f -exec fgrep -l 'Google Alert -' {} \; > alerts.txt
mkdir 2013-09-25-gwern-googlealertsemails/
mv `cat alerts.txt` 2013-09-25-gwern-googlealertsemails/

I deleted emails from a few alerts which were pri­vate; the re­main­ing 72M of emails are avail­able at 2013-09-25-gwern-googlealertsemails.tar.xz. Then a loop & ad hoc shel­l-script­ing ex­tracts the sub­jec­t-line, the date, and how many in­stances of “http://” there are in each email:

cd 2013-09-25-gwern-googlealertsemails/

echo "Search,Date,Links" >> alerts.csv # set up the header
for EMAIL in *.craft *.elan; do

    SUBJECT="`egrep '^Subject: Google Alert - ' $EMAIL  | sed -e 's/Subject: Google Alert - //'`"
    DATE="`egrep '^Date: ' $EMAIL | cut -d ' ' -f 3-5 | sed -e 's/<b>..*/ /'`"
    COUNT="`fgrep --no-filename --count 'http://' $EMAIL`"

    echo $SUBJECT,$DATE,$COUNT >> alerts.csv
done

The script­ing is­n’t per­fect and I had to delete sev­eral spu­ri­ous lines be­fore I could read it into R and for­mat it into a clean CSV:

alerts <- read.csv("alerts.csv", quote=c(), colClasses=c("character","character","integer"))
alerts$Date <- as.Date(alerts$Date, format="%d %b %Y")
write.csv(alerts, file="2013-09-25-gwern-googlealerts.csv", row.names=FALSE)
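
To see what the extraction step does to a single message, here is a minimal sketch run on a made-up email. The function name `extract_alert_row` and the sample message are hypothetical and not part of the original script; like the loop above, it counts lines containing `http://` rather than individual links.

```shell
#!/bin/sh
# Hypothetical helper: print one CSV row "subject,date,linkcount" for one email file.
extract_alert_row() {
    email="$1"
    # Subject header with the fixed "Google Alert - " prefix stripped (first match only)
    subject=$(sed -n 's/^Subject: Google Alert - //p' "$email" | head -n 1)
    # "Date: Sun, 28 Jan 2007 12:34:56 -0800" -> "28 Jan 2007"
    sent=$(sed -n 's/^Date: //p' "$email" | head -n 1 | cut -d ' ' -f 2-4)
    # grep -c counts matching lines, not individual occurrences
    count=$(grep -c -F 'http://' "$email")
    printf '%s,%s,%s\n' "$subject" "$sent" "$count"
}

# Made-up sample message for illustration
sample=$(mktemp)
cat > "$sample" <<'EOF'
Date: Sun, 28 Jan 2007 12:34:56 -0800
Subject: Google Alert - "Frank Herbert" -mason

New result: http://example.com/dune
Another: http://example.com/herbert
(no link on this line)
EOF
extract_alert_row "$sample"   # prints: "Frank Herbert" -mason,28 Jan 2007,2
rm -f "$sample"
```

Taking only the first matching header per file is one way to avoid some of the spurious rows that a quoted or forwarded message would otherwise produce.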



alerts <- read.csv("2013-09-25-gwern-googlealerts.csv",
                   colClasses=c("factor","Date","integer"))
summary(alerts)
#                      Search          Date                Links
#  wikipedia              : 255   Min.   :2007-01-28   Min.   :  0.0
#  Neon Genesis Evangelion: 247   1st Qu.:2008-12-29   1st Qu.: 10.0
#  "Gene Wolfe"           : 246   Median :2011-02-07   Median : 22.0
#  "Nick Bostrom"         : 224   Mean   :2010-10-06   Mean   : 37.9
#  modafinil              : 186   3rd Qu.:2012-06-15   3rd Qu.: 44.0
#  "Frank Herbert" -mason : 184   Max.   :2013-09-25   Max.   :563.0
#  (Other)                :2585

# So many because I have deleted many topics I am no longer interested in,
# and refined the search criteria of others
length(unique(alerts$Search))
# [1] 68

plot(Links ~ Date, data=alerts)
Links in each email, graphed over time

The first thing I no­tice is that it looks like the num­ber of links per email is go­ing up over time, with a spike in mid-2010. The sec­ond is that there’s quite a bit of vari­a­tion from email to email - while most are around 0, some are as high as 300. The third is that there’s a weird early anom­aly where emails are recorded as hav­ing 0 links; look­ing at those emails, they are en­cod­ed, for no ap­par­ent rea­son, and then all sub­se­quent emails are in more sen­si­ble HTML/text for­mats. An il­l-fated ex­per­i­ment by Google? I have no idea. The high­est num­ber is 563, which is­n’t very big; so de­spite the skew, I did­n’t bother to log-trans­form Links.

Linear model

The spik­i­ness prompts me to ad­just for vari­a­tion in send­ing rate and what­not by buck­et­ing emails into months, and over­lay a lin­ear re­gres­sion:

library(lubridate) # provides floor_date() and year()
alerts$Date <- floor_date(alerts$Date, "month")
alerts <- aggregate(Links ~ Search + Date, alerts, "sum")

# a simple linear model agrees with *small* monthly increase, but notes that there is a tremendous
# amount of unexplained variation
lm <- lm(Links ~ Date, data=alerts); summary(lm)
# ...
# Residuals:
#    Min     1Q Median     3Q    Max
# -175.8 -110.0  -60.5   43.3  992.6
# Coefficients:
#              Estimate Std. Error t value Pr(>|t|)
# (Intercept) -4.13e+02   1.10e+02   -3.76  0.00018
# Date         3.73e-02   7.38e-03    5.06  4.9e-07
# Residual standard error: 175 on 1046 degrees of freedom
# Multiple R-squared:  0.0239,    Adjusted R-squared:  0.023
# F-statistic: 25.6 on 1 and 1046 DF,  p-value: 4.89e-07

plot(Links ~ Date, data=alerts)
To­tal links in each search, by month
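
The monthly bucketing can also be sketched outside R. The version below is only an illustration: `monthly_totals` is a hypothetical name, and it assumes a simplified CSV with ISO dates and no quoted fields or embedded commas, which the real export may not satisfy.

```shell
#!/bin/sh
# Hypothetical helper: sum Links per (Search, month) from simplified CSV on stdin.
monthly_totals() {
    awk -F, 'NR > 1 {                          # skip the header row
        key = $1 "," substr($2, 1, 7) "-01"    # floor YYYY-MM-DD to the first of its month
        total[key] += $3
    }
    END { for (k in total) print k "," total[k] }' | LC_ALL=C sort
}

monthly_totals <<'EOF'
Search,Date,Links
modafinil,2011-05-03,10
modafinil,2011-05-20,5
modafinil,2011-06-01,7
Xen,2011-05-09,2
EOF
# prints:
# Xen,2011-05-01,2
# modafinil,2011-05-01,15
# modafinil,2011-06-01,7
```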

No big difference with the original plot: still a generic increasing trend. Here is a basic problem with this regression: does this increase reflect an increase in the number of alerts I am subscribed to, tweaks to each alert to make each return more hits (shifting from old alerts to new alerts), or an increase in links per unique alert? It is only the last claim we are interested in, but any of these or other phenomena could produce an increase.
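
A toy example shows how the pooled total can rise while every individual alert falls. The data and the name `per_alert_summary` are invented for illustration; the input is assumed to be one row per alert per month.

```shell
#!/bin/sh
# Hypothetical helper: for each month print "month,total,alerts,mean_per_alert"
# from rows "Search,Month,Links" (one row per alert per month, no header).
per_alert_summary() {
    awk -F, '{
        total[$2] += $3      # all links that month
        alerts[$2] += 1      # number of alerts active that month
    }
    END {
        for (m in total)
            printf "%s,%d,%d,%.1f\n", m, total[m], alerts[m], total[m] / alerts[m]
    }' | LC_ALL=C sort
}

per_alert_summary <<'EOF'
A,2008-01-01,100
A,2012-01-01,60
B,2012-01-01,60
C,2012-01-01,60
EOF
# prints:
# 2008-01-01,100,1,100.0
# 2012-01-01,180,3,60.0
```

Here the total goes from 100 to 180 links because two alerts were added, while alert A itself dropped from 100 to 60. A regression on pooled data would see only the first effect.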

Per alert

We could try treat­ing each alert sep­a­rately and do­ing a lin­ear re­gres­sion on them, and com­par­ing with the lin­ear model on all data in­dis­crim­i­nate­ly:

library(ggplot2) # provides qplot(), stat_smooth(), geom_abline()
qplot(Date, Links, color=Search, data=alerts) +
    stat_smooth(method="lm", se=FALSE, fullrange=TRUE, size=0.2) +
    geom_abline(aes(intercept=lm$coefficients[1], slope=lm$coefficients[2], color=c()), size=1) +
    ylim(0,1130) +
    theme(legend.position = "none")
Split­ting data by alert, re­gress­ing in­di­vid­u­ally

The re­sult is chaot­ic. In­di­vid­ual alerts are point­ing every which way. Re­gress­ing on every alert to­gether con­founds is­sues, and re­gress­ing on in­di­vid­ual alerts pro­duces no agree­ment. We want some in­ter­me­di­ate ap­proach which re­spects that alerts have differ­ent be­hav­ior, but yields a mean­ing­ful over­all state­ment.

Multi-level model

What we want is to look at each unique alert, es­ti­mate its increase/decrease over time, and per­haps sum­ma­rize all the slopes into a sin­gle grand slope. There is a hi­er­ar­chi­cal struc­ture to the data: the over­all slope of Google in­flu­ences the slope of each alert, which in­flu­ences the dis­tri­b­u­tion of the data points around each slope.

We can do this with a multilevel model, using lme4.

We’ll start by fit­ting & com­par­ing 2 mod­els:

  1. only the in­ter­cept varies be­tween each alert, but all alerts in­crease or de­crease at the same rate
  2. the in­ter­cept varies be­tween each alert, and also alerts differ in their wax­ing or wan­ing
library(lme4) # provides lmer(), fixef(), ranef()
mlm1 <- lmer(Links ~ Date + (1|Search), alerts); mlm1
# Random effects:
#  Groups   Name        Variance Std.Dev.
#  Search   (Intercept) 28943    170
#  Residual             12395    111
# Number of obs: 1048, groups: Search, 68
# Fixed effects:
#              Estimate Std. Error t value
# (Intercept) 427.93512  139.76994    3.06
# Date         -0.01984    0.00928   -2.14
# Correlation of Fixed Effects:
#      (Intr)
# Date -0.988

mlm2 <- lmer(Links ~ Date + (1+Date|Search), alerts); mlm2
# Random effects:
#  Groups   Name        Variance Std.Dev. Corr
#  Search   (Intercept) 6.40e+06 2529.718
#           Date        2.78e-02    0.167 -0.998
#  Residual             8.36e+03   91.446
# Number of obs: 1048, groups: Search, 68
# Fixed effects:
#             Estimate Std. Error t value
# (Intercept) 295.5469   420.0588    0.70
# Date         -0.0090     0.0278   -0.32
# Correlation of Fixed Effects:
#      (Intr)
# Date -0.998

# compare the models: does model 2 buy us anything?
anova(mlm1, mlm2)
# mlm1: Links ~ Date + (1 | Search)
# mlm2: Links ~ Date + (1 + Date | Search)
#      Df   AIC   BIC logLik deviance Chisq Chi Df Pr(>Chisq)
# mlm1  4 13070 13089  -6531    13062
# mlm2  6 12771 12801  -6379    12759   303      2     <2e-16

Model 2 is bet­ter on both simplicity/fit cri­te­ria, so we’ll look at that closer:

coef(mlm2)
# $Search
#                                                                (Intercept)      Date
#                                                                     763.48 -0.046763
# adult iodine supplementation (IQ OR intelligence OR cognitive)      718.80 -0.043157
# AMD pacifica virtualization                                        -836.63  0.062123
# (anime OR manga) (half-Japanese OR hafu OR half-American)           956.52 -0.059438
# caloric restriction                                               -2023.10  0.153667
# "Death Note" (script OR live-action OR Parlapanides)                866.63 -0.051314
# "dual n-back"                                                      4212.59 -0.266879
# dual n-back                                                        1213.85 -0.073265
# electric sheep screensaver                                         -745.78  0.055937
# "Frank Herbert"                                                     -93.28  0.013636
# "Frank Herbert" -mason                                            10815.19 -0.676188
# freenet project                                                   -1154.14  0.087199
# "Gene Wolfe"                                                        496.01 -0.026575
# Gene Wolfe                                                         1178.36 -0.072681
# ...
# wikileaks                                                         -3080.74  0.227583
# WikiLeaks                                                          -388.34  0.031441
# wikipedia                                                         -1668.94  0.133976
# Xen                                                                 390.01 -0.017209

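Date is measured in days (R stores dates as day counts) and Links is a monthly total, so each slope is the change in monthly links per day: when Xen has a slope of -0.017, its monthly total falls by about 6 links over a year (0.017 × 365).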

# largest slope in either direction:
max(abs(coef(mlm2)$Search$Date))
# [1] 0.6762

The largest slope in absolute value, -0.68, comes from the "Frank Herbert" -mason search, probably reflecting how relatively new the search is or the effectiveness of the filter I added to the original "Frank Herbert" search. In general, the slopes are very similar, there seem to be as many positive slopes as there are negative, and the overall summary slope is a tiny negative slope in the second model (-0.01); but most of the searches' slopes exclude zero in the caterpillar plot:

library(lattice) # provides qqmath()
qqmath(ranef(mlm2, condVar=TRUE)) # condVar replaces the deprecated postVar argument

This says to me that there is no large change over time happening within each alert, as the original claims went, but there does seem to be something going on. When we plot the overall regression and the per-alert regressions, we see:

fixParam <- fixef(mlm2)
ranParam <- ranef(mlm2)$Search
params   <- cbind(ranParam[1]+fixParam[1], ranParam[2]+fixParam[2])
p <- qplot(Date, Links, color=Search, data=alerts)
p +
  geom_abline(aes(intercept=`(Intercept)`, slope=Date, color=rownames(params)), data=params, size=0.2) +
  geom_abline(aes(intercept=fixef(mlm2)[1], slope=fixef(mlm2)[2], color=c()), size=1) +
  ylim(0,1130) +
  theme(legend.position = "none")
Mul­ti­-level re­gres­sion, grand and in­di­vid­ual fits

This clearly makes more sense than re­gress­ing each alert sep­a­rate­ly, as we avoid crazily steep slopes when there are just a few emails to use and their re­gres­sions get shrunk to the over­all re­gres­sion. We also see no ev­i­dence for any large or sta­tis­ti­cal­ly-sig­nifi­cant change over time for alerts in gen­er­al: some alerts do in­crease over time but some alerts also de­crease over time, and there is only a small de­crease which we might blame on in­ter­nal Google prob­lems.

What about the fall?

Hav­ing done all this, I thought I was fin­ished un­til I re­mem­bered that the orig­i­nal blog­gers did­n’t com­plain about a steady de­te­ri­o­ra­tion over time, but an abrupt one start­ing some­where in 2012. What hap­pens when I do a bi­nary split and com­pare 2010/2011 to 2012/2013?

alertsRecent <- alerts[year(alerts$Date)>=2010,]
alertsRecent$Recent <- year(alertsRecent$Date) >= 2012
wilcox.test(Links ~ Recent, conf.int=TRUE, data=alertsRecent)
#     Wilcoxon rank sum test with continuity correction
# data:  Links by Recent
# W = 71113, p-value = 6.999e-10
# alternative hypothesis: true location shift is not equal to 0
# 95% confidence interval:
#  34 75
# sample estimates:
# difference in location
#                     53
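
For intuition about the W statistic reported above: as far as I understand R's convention for two samples, W is the Mann-Whitney count for the first group, that is, the number of (x, y) pairs in which x exceeds y, with ties counted as half. The sketch below computes that count on toy numbers; `rank_sum_u` is a hypothetical name and the brute-force pair loop is chosen for clarity, not speed.

```shell
#!/bin/sh
# Hypothetical helper: stdin has lines "group value" with group "x" or "y".
# Prints U = #(pairs with x > y) + 0.5 * #(tied pairs).
rank_sum_u() {
    awk '
        $1 == "x" { x[++nx] = $2 + 0 }   # "+ 0" forces numeric comparison
        $1 == "y" { y[++ny] = $2 + 0 }
        END {
            u = 0
            for (i = 1; i <= nx; i++)
                for (j = 1; j <= ny; j++) {
                    if (x[i] > y[j]) u += 1
                    else if (x[i] == y[j]) u += 0.5
                }
            print u
        }'
}

rank_sum_u <<'EOF'
x 5
x 7
x 9
y 1
y 7
y 3
EOF
# prints: 7.5
```

No distributional assumption enters the count, which is why the test suits skewed data such as monthly link totals.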

I avoided a normality-based test like the t-test and used the Wilcoxon rank-sum test instead because there's no reason to expect the number of links per month to follow a normal distribution. Regardless of the details, there's a big difference between the two time periods: 219 vs 140 links! A fall of 36% is certainly a serious decline, and it cannot be waved away as due to my Alerts settings (I always use "All results" and never "Only the best results") nor, as we'll see now, a confound like the possibilities that motivated multi-level model use:

alerts$Recent <- year(alerts$Date) >= 2012
mlm3 <- lmer(Links ~ Date + Recent + (1+Date|Search), alerts); mlm3
# Random effects:
#  Groups   Name        Variance Std.Dev. Corr
#  Search   (Intercept) 9.22e+03 9.60e+01
#           Date        9.52e-05 9.75e-03 -0.164
#  Residual             1.18e+04 1.09e+02
# Number of obs: 1048, groups: Search, 68
# Fixed effects:
#              Estimate Std. Error t value
# (Intercept) -440.1540   175.3630   -2.51
# Date           0.0413     0.0121    3.42
# RecentTRUE  -102.2273    13.3224   -7.67
# Correlation of Fixed Effects:
#            (Intr) Date
# Date       -0.993
# RecentTRUE  0.630 -0.647
anova(mlm1, mlm2, mlm3)
# Models:
# mlm1: Links ~ Date + (1 | Search)
# mlm2: Links ~ Date + (1 + Date | Search)
# mlm3: Links ~ Date + Recent + (1 + Date | Search)
#      Df   AIC   BIC logLik deviance Chisq Chi Df Pr(>Chisq)
# mlm1  4 13070 13089  -6531    13062
# mlm2  6 12771 12801  -6379    12759   303      2     <2e-16
# mlm3  7 13015 13050  -6500    13001     0      1          1

It was mid-2011

A new model treating pre-2012 as different turns up with a superior fit. Can we do better? A changepoint analysis fingers May/June 2011 as the culprit and gives a larger difference in means (254 vs 147):

library(changepoint) # provides cpt.meanvar()
plot(cpt.meanvar(alertsRecent$Links), ylab="Links")
Link count 2010-2013, de­pict­ing a regime tran­si­tion in May/June 2011
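
The idea behind a changepoint search can be sketched with a much simpler stand-in than `cpt.meanvar`: try every split of the series and keep the one that minimizes the squared error around the two segment means. This looks for a single change in mean only (not variance), `best_mean_split` is a hypothetical name, and the numbers are invented.

```shell
#!/bin/sh
# Hypothetical helper: one value per line on stdin; prints
# "k mean_before mean_after" for the split after the k-th value that
# minimizes the summed squared error around the two segment means.
best_mean_split() {
    awk '
        { n++; s[n] = s[n-1] + $1; q[n] = q[n-1] + $1 * $1 }   # prefix sums
        END {
            best = -1
            for (k = 1; k < n; k++) {
                sl = s[k];        ql = q[k]
                sr = s[n] - s[k]; qr = q[n] - q[k]
                # SSE of a segment = sum(x^2) - (sum x)^2 / count
                sse = (ql - sl * sl / k) + (qr - sr * sr / (n - k))
                if (best < 0 || sse < bestsse) { best = k; bestsse = sse }
            }
            if (best > 0)
                printf "%d %.2f %.2f\n", best, s[best] / best, (s[n] - s[best]) / (n - best)
        }'
}

printf '%s\n' 10 10 12 11 3 2 4 3 | best_mean_split
# prints: 4 10.75 3.00
```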

With this new changepoint, the test is more significant:

alertsRecent <- alerts[year(alerts$Date)>=2010,]
alertsRecent$Recent <- alertsRecent$Date > "2011-05-01"
wilcox.test(Links ~ Recent, conf.int=TRUE, data=alertsRecent)
#     Wilcoxon rank sum test with continuity correction
# data:  Links by Recent
# W = 63480, p-value = 4.61e-12
# alternative hypothesis: true location shift is not equal to 0
# 95% confidence interval:
#   62 112
# sample estimates:
# difference in location
#                     87

And the fit im­proves by a large amount:

alerts$Recent <- alerts$Date > "2011-05-01"
mlm4 <- lmer(Links ~ Date + Recent + (1+Date|Search), alerts); mlm4
# Random effects:
#  Groups   Name        Variance Std.Dev. Corr
#  Search   (Intercept) 8.64e+03 9.30e+01
#           Date        9.28e-05 9.63e-03 -0.172
#  Residual             1.11e+04 1.05e+02
# Number of obs: 1048, groups: Search, 68
# Fixed effects:
#              Estimate Std. Error t value
# (Intercept) -1.11e+03   1.87e+02   -5.91
# Date         8.86e-02   1.30e-02    6.83
# RecentTRUE  -1.65e+02   1.44e+01  -11.43
# Correlation of Fixed Effects:
#            (Intr) Date
# Date       -0.994
# RecentTRUE  0.709 -0.725

anova(mlm1, mlm2, mlm3, mlm4)
#      Df   AIC   BIC logLik deviance Chisq Chi Df Pr(>Chisq)
# mlm1  4 13070 13089  -6531    13062
# mlm2  6 12771 12801  -6379    12759 302.7      2     <2e-16
# mlm3  7 13015 13050  -6500    13001   0.0      1          1
# mlm4  7 12948 12983  -6467    12934  66.8      0     <2e-16


Is the fall robust against different samples of my data, using the bootstrap? The answer is yes, and the Wilcoxon test even turns out to have given us a pretty good confidence interval earlier:

library(boot)
recentEstimate <- function(dt, indices) {
  d <- dt[indices,] # allows boot to select subsample
  mlm4 <- lmer(Links ~ Date + Recent + (1+Date|Search), d)
  return(fixef(mlm4)[3]) # coefficient of RecentTRUE
}
bs <- boot(data=alerts, statistic=recentEstimate, R=10000, parallel="multicore", ncpus=4); bs
# ...
# Bootstrap Statistics :
#     original  bias    std. error
# t1*   -164.8   34.06       17.44
boot.ci(bs)
# ...
# Intervals :
# Level      Normal              Basic
# 95%   (-233.0, -164.7 )   (-228.2, -156.7 )
# Level     Percentile            BCa
# 95%   (-172.9, -101.4 )   (-211.8, -156.7 )
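
The mechanics of a percentile bootstrap can be shown on a far simpler statistic than the refitted multilevel coefficient used above. The sketch below resamples each group with replacement and reports the 2.5% and 97.5% quantiles of the difference in means (after minus before). The function name, group labels, and data are all hypothetical, and exact output depends on the awk implementation's random number generator.

```shell
#!/bin/sh
# Hypothetical helper: stdin has lines "group value", group "before" or "after".
# $1 = number of resamples (default 1000), $2 = RNG seed (default 1).
# Prints "lower upper": a 95% percentile interval for mean(after) - mean(before).
bootstrap_diff_ci() {
    reps="${1:-1000}"; seed="${2:-1}"
    awk -v reps="$reps" -v seed="$seed" '
        $1 == "before" { b[++nb] = $2 + 0 }
        $1 == "after"  { a[++na] = $2 + 0 }
        END {
            srand(seed)
            for (r = 1; r <= reps; r++) {
                # draw nb and na values with replacement from each group
                sb = 0; for (i = 1; i <= nb; i++) sb += b[int(rand() * nb) + 1]
                sa = 0; for (i = 1; i <= na; i++) sa += a[int(rand() * na) + 1]
                printf "%.6f\n", sa / na - sb / nb
            }
        }' | LC_ALL=C sort -n | awk '
        { d[NR] = $1 }
        END {
            lo = int(0.025 * NR) + 1; hi = int(0.975 * NR)
            if (hi < lo) hi = lo
            printf "%.1f %.1f\n", d[lo], d[hi]
        }'
}

bootstrap_diff_ci 1000 42 <<'EOF'
before 250
before 230
before 270
before 260
after 150
after 140
after 160
after 145
EOF
```

If the whole interval sits below zero, as it must with these toy numbers, the drop is not an artifact of which rows happened to be sampled.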

The percentile confidence interval of (-173,-101) is both statistically-significant in this context and also an effect size to be reckoned with. It seems this mid-2011 fall is real. I'm surprised to find such a precise, localized drop in my Alerts quantities. I did expect to find a decline, but I expected it to be a gradual incremental process as Google's search algorithms gradually excluded more and more links. I didn't expect to be able to say something like "in this month, results dropped by more than a third".


I don’t know of any changes an­nounced to Google Alerts in May/June 2011, and the emails can’t tell us di­rectly what hap­pened. But I can spec­u­late.

There is one culprit that comes to mind for what may have changed in early 2011 and would then have led to a fall in collated links (a fall which would accumulate to statistical-significance in June 2011): the pervasive change to webpage rankings called Google Panda. It affected many websites & searches, had teething problems, reportedly boosted social networking sites (which I generally see very few of in my own alerts), and was rolled out globally in April 2011 - just in time to trigger a change in May/June (with continuous changes through 2011).

(We’ll prob­a­bly never know the true rea­son: Google is no­to­ri­ously un­com­mu­nica­tive about many of its in­ter­nal tech­ni­cal de­ci­sions and changes.)


So where does this leave us?

Well, the overall linear regressions turned out to not answer the question, but they were still educational in demonstrating the considerable diversity between alerts and the trickiness of understanding what question exactly we were asking; the variability and differences between alerts remind us to not be fooled by randomness and to look for big effects & the big picture - if someone says their alerts seem a little down, they may have been fooled by selective memory, but when they say their alerts went from 20 links an email to 3, then we should avoid unthinking skepticism and look more carefully.

When we investigated the claim directly, we didn't quite find the claim: there was no changepoint anywhere in 2012 as claimed by bloggers - they seem to have been half a year off from when the change occurred in my own alerts. What's going on there? It's hard to say. Google sometimes rolls out changes to users over long periods of time, so perhaps I was hit early by some changes drastically reducing links. Or perhaps it simply took time for people to become certain that there were fewer links (in which case I have given them too little credit). Or perhaps separate SEO-related changes hit their searches after mine were.

Is Alerts "broken"? Well, it's taken a clear hit: the number of found links is down, and my own impression is that the returned links are not such gems that they make up for their gem-like rarity. And it's certainly not good that the problem is now 2 years old without any discernible improvement.

But on closer in­spec­tion, the hit seems to have been a one-time deal, and if my Panda spec­u­la­tion is cor­rect, it does not re­flect any ne­glect or con­tempt by Google but sim­ply more im­por­tant fac­tors - Search re­main­ing high­-qual­ity will al­ways be a higher pri­or­ity than Alerts, be­cause Search is the dog that wags the tail. My sur­vival model may yet have the last laugh and Alerts out­last its more fa­mous brethren.

I sup­pose it de­pends on whether you see the glass as half-full or half-emp­ty: if half-full, then this is good news be­cause it means that Alerts is­n’t in as bad shape as it looks and may not soon be fol­low­ing Reader into the great Re­cy­cle Bin in the sky; if half-emp­ty, then this is an­other ex­am­ple of how Google does not com­mu­ni­cate with its users, makes changes uni­lat­er­ally and in­vis­i­bly, will de­grade one ser­vice for an­other more profitable ser­vice, and how users are help­less in the face of its tech­ni­cal su­premacy (who else can do as good a job of spi­der­ing the In­ter­net for new con­tent match­ing key­word­s?).

See Also

  1. “What’s Wrong With Google Alerts? The small but use­ful ser­vice seems to be dy­ing. One re­searcher uses em­pir­i­cal re­search to an­swer the ques­tions that Google won’t.”, Buz­zFeed:

    Google has re­fused to shed light on the de­cline. To­day, a Google spokesper­son told Buz­zFeed, “we’re al­ways work­ing to im­prove our prod­ucts - we’ll con­tinue mak­ing up­dates to Google Alerts to make it more use­ful for peo­ple.” In other words, a po­lite non-an­swer.

  2. I have seen some alternative services (Yahoo! Search Alerts, talkwalker & Mention) suggested, but have not used them; the latter 2 do well in a comparison with Google Alerts.