Referral Spam and How to Block It


Why you’re getting visits from suspicious or even non-existent websites

By now you’ve probably noticed in Google Analytics (or another web analytics package, or even your log files) that you’re receiving an unusually high volume of foreign traffic from odd websites, which may or may not load when you try visiting them (they may even have redirected you to an NSFW website).

This is almost surely referral spam. These are bots crawling your site leaving a false referrer. The idea here is to get you or whoever goes through your analytics to visit these sites, most likely to hit you with affiliate offers or ad impressions. It can be annoying, but they’ve got bills to pay too!

While this traffic may be minuscule for large, popular websites, it can really throw analytics reports off for small sites with little traffic, where this referral spam will account for a sizable portion and skew the results (especially “averages,” which you should avoid becoming too dependent on).
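To see how large that portion is on your own site, you can tally referrer hosts straight from the server log. The following is a minimal sketch, assuming the log uses Apache's combined format; the function name `referrer_share` and the sample hosts (`spam-one.example`, `search.example`) are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed combined log format: ... "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')


def referrer_share(lines):
    """Return {referrer host: fraction of parsed hits}; "-" means no referrer."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed format
        referer = match.group("referer")
        host = urlsplit(referer).hostname if referer != "-" else None
        counts[host or "-"] += 1
    total = sum(counts.values())
    return {host: n / total for host, n in counts.items()} if total else {}


# Hypothetical sample lines; on a real site, read them from your access log.
sample = [
    '203.0.113.5 - - [10/Oct/2015:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "http://spam-one.example/" "Mozilla/5.0"',
    '203.0.113.6 - - [10/Oct/2015:13:56:01 +0000] "GET / HTTP/1.1" 200 512 "http://spam-one.example/offer" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2015:13:57:12 +0000] "GET /about HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '198.51.100.8 - - [10/Oct/2015:13:58:40 +0000] "GET / HTTP/1.1" 200 512 "http://search.example/?q=x" "Mozilla/5.0"',
]
shares = referrer_share(sample)
for host, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{host}: {share:.0%}")
```

In this made-up sample a single spam host accounts for half of all hits, which is the kind of skew a low-traffic site can see.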

Dealing with referral spam

There are a few things you can do. You can block these requests so they stop showing up in future reports, either in Google Analytics or in your .htaccess file (my preferred method). Unfortunately this only blocks future requests; past requests are already baked into your reports, but you can filter those out of GA with segments.

So let’s say for example we want to block these sites so they stop showing up in future reports:

Blocking referral spam in your .htaccess file

This is my preferred method since it keeps these requests out of every traffic report (GA or otherwise). What’s more, the method for blocking them in GA depends on which version you have installed (another reason to skip GA and nip it in the bud in .htaccess). In this example, we would add this to our .htaccess file.

# Block spammy traffic
# Each pattern must end with the referrer domain to block (dots escaped);
# the domains themselves are not shown here.

RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://([^.]+\.)* [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://([^.]+\.)* [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://([^.]+\.)* [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://([^.]+\.)* [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://([^.]+\.)* [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://([^.]+\.)* [NC]
RewriteRule .* - [F]
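Since the list of spam domains keeps changing, it can be handy to generate the rule block rather than edit it by hand. Here is a rough sketch of that idea, not part of the original method: the helper `build_htaccess_block` and the domains `spam-one.example` and `spam-two.example` are placeholders, and the output should be reviewed before it goes anywhere near a live .htaccess file.

```python
def build_htaccess_block(domains):
    """Return mod_rewrite lines that answer 403 to requests referred by `domains`."""
    if not domains:
        return ""
    lines = ["# Block spammy traffic", "RewriteEngine on"]
    for index, domain in enumerate(domains):
        # Every condition except the last is chained with OR.
        flags = "[NC]" if index == len(domains) - 1 else "[NC,OR]"
        # Escape dots so "." matches only a literal dot in the referrer.
        escaped = domain.replace(".", r"\.")
        # ([^.]+\.)* also covers subdomains such as www.
        pattern = r"^http://([^.]+\.)*" + escaped
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {pattern} {flags}")
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


# Placeholder domains; substitute the referrers you actually see in your reports.
block = build_htaccess_block(["spam-one.example", "spam-two.example"])
print(block)
```

Only the final RewriteCond drops the OR flag, so the RewriteRule fires when any one of the listed referrers matches.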

Filtering out referral spam in Google Analytics

While adding the above snippet to your .htaccess will prevent these URLs from showing up moving forward, it won’t remove them retroactively. They will still show up in reports that select dates prior to your .htaccess modification. Fortunately, Google’s Segmentation makes it possible to apply a retroactive filter (with the caveat that the segment has to be selected for the filter to take effect).

To create a Segment which filters out these referrers:

  • Click + Add Segment
  • Click New Segment
  • Conditions
  • Choose “Source” as your dimension, switch to “Exclude” and “Matches Regex,” and enter the domains you want to filter out, separated by pipes (|).

You can create this as its own and/or add it to an existing segment. This will allow you to view older reports without the effects of referral spam.
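The segment regex is just the spam domains joined with pipes, so it can be built and sanity-checked before you paste it into GA. The sketch below assumes that “Matches Regex” behaves like a partial, case-insensitive match; the helper names and the domains are hypothetical.

```python
import re


def build_source_regex(domains):
    """Join domains into one alternation with literal dots escaped."""
    return "|".join(domain.replace(".", r"\.") for domain in domains)


def is_spam_source(source, pattern):
    # Assumption: GA's "Matches Regex" is a partial, case-insensitive match.
    return re.search(pattern, source, re.IGNORECASE) is not None


# Placeholder domains; use the referrers from your own reports.
pattern = build_source_regex(["spam-one.example", "spam-two.example"])
print(pattern)

# Check the pattern against a few sample Source values before using it.
for source in ["www.spam-two.example", "search.example", "(direct)"]:
    print(source, "excluded" if is_spam_source(source, pattern) else "kept")
```

Testing the pattern against a handful of legitimate sources first helps avoid excluding real traffic from the segment.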

Rinse and Repeat

It should be noted that there are many of these bots/referrers out there. After blocking the current referral spam URLs, you’ll likely start receiving referral spam from new sets of URLs, in which case you’ll want to repeat the same steps above. Now go forth and filter.

Comments (2)

  1. Dave says:

    You can’t block https referers with apache if your site is non https. HTTP_REFERER is always empty when referring from https to http for security purposes.

  2. Chris says:

    Thanks Dave, I’ve updated the post to at least show http to avoid misleading anyone. So apparently the https referrers simply pass through as direct visits in GA (assuming the landing page is http).
