Understanding and Stopping Referrer Spam

Having a great-looking, user-friendly website is essential for an effective web presence. Even more important, though, may be the ability to measure that “effectiveness”. Google Analytics is currently the most widely used tool for analyzing website traffic, and it provides enough data to glaze over the eyes of many SEO strategists. Unfortunately, just as your email inbox can be flooded with unwanted messages, so can the data collected by Google Analytics. Referrer spam leaves reports cluttered with fake visits/sessions that distort crucial metrics such as bounce rate and average session duration. Understanding how Google Analytics (GA) is being spammed, and how to stop it, is essential to measuring your website’s “effectiveness”.

Crawler Referrer Spam

Web crawlers are an integral part of the Internet and are at the heart of many web-based companies such as Google. Crawlers, also referred to as bots, are used to index the web for search engines and are a valuable asset to those navigating the web. But for every good reason a web crawler exists, there is also a bad one.

These bad bots are usually deployed by a black hat marketer whose goal is either to get more visits to their website or to set a cookie on a visitor’s machine. When a webmaster notices lots of traffic coming from an unknown referrer, curiosity will often lure them into visiting the spam site. Once the webmaster is on the site, a cookie is placed on their machine so that if they later visit an affiliated retailer’s site (Amazon, Alibaba, etc.) and make a purchase, the black hat marketer earns an affiliate commission.

Spotting Referrer Spam

The main goal of a black hat marketer deploying referral spam bots is to make sure the webmaster or person responsible for analytics sees the bad referrals. This makes finding referral spam in analytics reports pretty easy. In the image below we can see that 4webmasters.org is at the top of our referral list. It looks suspicious because there is no reason this site should be sending us so much traffic. An unaware webmaster or SEO analyst may decide to visit the site to see why it is linking to their website. Being suspicious, we instead check it against a list of known referral spam sites maintained on GitHub.

[Image: Google Analytics referral report with 4webmasters.org at the top of the referral list]

After viewing the referrer spam blacklist on GitHub we see that 4webmasters is the first one on the list.  This is a pretty good indication that this is crawler referrer spam.
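The blacklist check described above can also be done in a few lines of code instead of by eye. The sketch below is only an illustration: the three domains are a hypothetical sample rather than the full GitHub list, and `is_spam_referrer` and `flag_spam` are names made up for this example.

```python
from urllib.parse import urlparse

# Hypothetical sample of blacklisted domains; a real check would load the full published list.
SPAM_DOMAINS = {"4webmasters.org", "best-seo-offer.com", "darodar.com"}


def is_spam_referrer(referrer, blacklist=SPAM_DOMAINS):
    """Return True if the referrer's host is a blacklisted domain or a subdomain of one."""
    # urlparse only recognizes the host when a scheme or leading '//' is present,
    # so bare domains such as "4webmasters.org" get '//' prepended first.
    candidate = referrer if "//" in referrer else "//" + referrer
    host = (urlparse(candidate).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blacklist)


def flag_spam(report_rows, blacklist=SPAM_DOMAINS):
    """Return the (source, sessions) rows whose source is on the blacklist."""
    return [row for row in report_rows if is_spam_referrer(row[0], blacklist)]
```

Given made-up report rows such as `[("4webmasters.org", 312), ("google.com", 120)]`, `flag_spam` would return only the 4webmasters.org row. Matching on the exact host or its subdomains, rather than on any substring, keeps a legitimate site whose name merely contains a spam domain from being flagged.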

 

Stopping Crawler Referrer Spam

Stopping crawler referrer spam is pretty straightforward because this type of crawler is “physically” accessing the website. On an Apache server, webmasters can use the .htaccess file to block this type of spam. Typically, the .htaccess file would be updated with rules similar to the following:

## SITE REFERRER BLOCKING
RewriteEngine On
RewriteCond %{HTTP_REFERER} 4webmasters\.org [NC,OR]
RewriteCond %{HTTP_REFERER} best-seo-offer\.com [NC,OR]
RewriteCond %{HTTP_REFERER} darodar\.com [NC]
RewriteRule .* - [F]

These rules stop the spam bots from crawling the website, which prevents their visits from ever registering in Google Analytics. Another way to keep spam data out of analytics is to create filters that exclude the spam referrers’ entries. Although this method cleans up your reports, it does not keep the bots from actually visiting the site.
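To make the logic of the rules above easier to follow, here is a rough Python equivalent. It is a sketch of the matching behavior only, not how Apache implements it, and `status_for` is a name invented for this example. The [NC] flag corresponds to case-insensitive matching, the [OR] flags to alternation between the patterns, and [F] to a 403 Forbidden response.

```python
import re

# The same three referrer patterns, with dots escaped so they match a literal period.
BLOCKED_PATTERNS = [r"4webmasters\.org", r"best-seo-offer\.com", r"darodar\.com"]

# [OR] between conditions behaves like '|' in one pattern; [NC] is re.IGNORECASE.
BLOCKED_RE = re.compile("|".join(BLOCKED_PATTERNS), re.IGNORECASE)


def status_for(referer_header):
    """Return 403 if the Referer header matches a blocked pattern, otherwise 200."""
    if referer_header and BLOCKED_RE.search(referer_header):
        return 403  # what the [F] flag sends back to the client
    return 200
```

A request with a Referer of `http://4WEBMASTERS.org/` would get a 403 here, while a request with no Referer header at all passes through untouched.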

 

Ghost Referrer Spam

Black hat marketers always seem to find a way to make a dirty dollar. Now that webmasters have figured out how to block crawlers from accessing their sites, a new way of getting spam sites into Google Analytics has cropped up. Ghost referrers send Google Analytics (GA) requests directly to Google, using GA IDs (tracking numbers) that were either collected from websites or picked at random. This means that a spammer doesn’t have to crawl a website at all but can go straight to the source, making it really hard to stop spam from polluting your analytics data.

 

Stopping Ghost Referrer Spam

At this time there is no concrete way of stopping Ghost Referrer Spam because the crawler/bot is not actually visiting the website. However, it is possible to set up filters in Google Analytics to keep the spam counts from displaying in reports.

First, navigate to the view to which you would like to apply the spam referral filter. Then, select Filters from the view menu.

 

Next, click the New Filter button to begin creating the filter.

The Add Filter to View screen supplies us with many options for filtering our analytics data. First, give the filter a name, and then choose Custom for the filter type with the Exclude option selected. In the Filter Field box select Campaign Source, and then place a regex string containing the spam referrers’ domains in the Filter Pattern field. It is imperative that the regex string is correct, or you could end up excluding relevant traffic from your reports. In the regex string each domain should be separated by a pipe, which means “or”. The backslash before each period makes the period match a literal dot; an unescaped period would match any character. Also, make sure that there are no spaces and that the string does not end with a pipe character, since the empty alternative after a trailing pipe would match every source.
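Since a mistyped pattern can quietly exclude real traffic, it may be safer to generate the regex string than to type it by hand. The following is a minimal sketch that assumes the domains contain only letters, digits, hyphens, and periods; `build_filter_pattern` is a hypothetical helper, not part of any Google Analytics tooling.

```python
import re


def build_filter_pattern(domains):
    """Build a Filter Pattern string: literal dots, pipe-separated, no spaces, no trailing pipe."""
    # Drop blank entries and surrounding whitespace so no empty alternative sneaks in.
    cleaned = [d.strip().lower() for d in domains if d.strip()]
    # Escape each period so it matches a literal dot instead of any character.
    escaped = [d.replace(".", r"\.") for d in cleaned]
    pattern = "|".join(escaped)
    re.compile(pattern)  # raises re.error if the result is not a valid regex
    return pattern


print(build_filter_pattern(["4webmasters.org", "best-seo-offer.com", "darodar.com"]))
# prints 4webmasters\.org|best-seo-offer\.com|darodar\.com
```

The printed string can be pasted straight into the Filter Pattern field. Joining the cleaned list with a pipe guarantees there is never a leading or trailing pipe, which is the mistake most likely to wipe out an entire report.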


If you are hungry for more information on how to set up these filters take a look at Megalytics’s How to Filter Out Fake Referrals and Other Google Analytics Spam.

Conclusion

As with all web-related technology, the landscape of the Internet is perpetually changing. There will always be a scam artist out there trying to make a quick buck, and unfortunately the Internet offers many ways to do so. Some spammers pull right up to your front door while others try to sneak around the back. It’s up to us to stay on top of the tools needed to prevent intrusion and, in this case, to keep spammers from ransacking our precious analytics data.
