Google, Click Fraud, and Invalid Clicks

Yesterday, Andy Beal posted a detailed story on Google and click fraud, in which I was quoted as saying that Google's click fraud rate is less than 2%. Did I really say that? Not quite.

First, some background. Andy and I met during the Search Engine Strategies conference in Chicago last week, and we spent an hour talking about our systems, methods, and policies for fighting click fraud. As everyone who has ever spoken to me about this knows by now, we take this issue very seriously and have dedicated extensive resources to managing it effectively. Unfortunately, there is a great deal of misinformation on this topic (mainly from third parties with an incentive to exaggerate the issue), so we have been exploring ways to become more transparent ourselves. Our top priority is to protect advertisers, which means not disclosing any proprietary methods that would allow click fraud perpetrators to reverse-engineer our systems. However, there is still a great deal of information we can share. Others on our team and I have spent hundreds of hours communicating and sharing such information outside Google. The goal is to improve understanding of this issue and to arm everyone against the FUD out there.

Andy's story provides a great summary of some of the key facts at Google:

  • Invalid clicks and click fraud are separate but related concepts (invalid clicks simply being the clicks for which we do not charge advertisers)
  • We have a four-stage process which detects the vast majority of invalid clicks before they affect advertisers
  • The total percentage of clicks we mark as invalid in our system is consistently in the single digits

We had a limited amount of time to cover a lot of ground, and of course, some miscommunication can result when discussing an issue of this complexity. Unfortunately, the most significant fact that seems to have been misrepresented is the one in the headline. Specifically, I never said that our click fraud rate is less than 2%.

Instead, what I said is that the quantity of invalid clicks which we detect as a result of reactive investigations is a "negligible proportion" of the total number of invalid clicks. Andy asked me if that percentage is less than 2%. I told him that I was not able to provide a bound, but yes, "negligible" certainly means less than 2% of invalid clicks.

More significantly, however, this is quite different from saying that our "click fraud rate" is less than 2%. When we mark clicks as invalid because of suspected malicious activity, the vast majority of the time we do so proactively, and none of those cases are included in the reactive figure in question. We proactively discard a single-digit percentage of our revenue, primarily by filtering traffic before it impacts an advertiser's budget and, less significantly, through off-line banning of AdSense publishers, which leads to refunds to advertisers. The difference between proactive and reactive detection is the difference between the "attempted click fraud" caught by us and the click fraud that actually affects an advertiser in a way that requires their action to correct (by asking for an investigation). Obviously it is the second category that advertisers actually care about, and I think that is the spirit in which Andy wrote his headline.
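
To make the distinction concrete, here is a minimal arithmetic sketch in Python. Every number in it is hypothetical and chosen only for illustration; none of them are actual Google figures. It assumes that 8% of all clicks are marked invalid (a single-digit percentage) and that 1% of those invalid clicks are found reactively (under the 2% bound discussed above), then shows how different the reactive share looks depending on which total it is measured against.

```python
# All figures are hypothetical and for illustration only; they are NOT real data.
total_clicks = 1_000_000
invalid_clicks = 80_000      # assumed: 8% of all clicks are marked invalid
reactive_invalid = 800       # assumed: 1% of invalid clicks are found reactively

# Everything not found reactively was caught proactively (filters, publisher bans).
proactive_invalid = invalid_clicks - reactive_invalid


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Share of all clicks that are marked invalid for any reason.
invalid_rate = pct(invalid_clicks, total_clicks)

# The "less than 2%" figure: reactive detections as a share of INVALID clicks.
reactive_share_of_invalid = pct(reactive_invalid, invalid_clicks)

# The same reactive detections as a share of ALL clicks.
reactive_share_of_all = pct(reactive_invalid, total_clicks)

print(f"Invalid clicks as % of all clicks:        {invalid_rate:.2f}%")
print(f"Reactive clicks as % of invalid clicks:   {reactive_share_of_invalid:.2f}%")
print(f"Reactive clicks as % of all clicks:       {reactive_share_of_all:.2f}%")
print(f"Proactively caught invalid clicks:        {proactive_invalid}")
```

Under these assumed inputs, the reactive figure is a small slice of the invalid clicks, which are themselves a small slice of all clicks. Neither percentage is the overall rate of invalid clicks, which is why the "less than 2%" number cannot be read as a "click fraud rate".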

So what is our overall "click fraud rate"? As noted in the diagram in the story, it is virtually impossible to know the intent of every click. However, we can do a very effective job using statistical techniques to detect potentially malicious behavior, and the total number of invalid clicks we detect, whether for suspected malicious or non-malicious intent, is in the single-digit percentages. So third-party estimates that put click fraud at 15% or higher appear to be substantial exaggerations.

I gave Andy this feedback, and he was able to make a few updates and corrections, but unfortunately was not able to change the headline. With the aforementioned caveats in mind, I would invite everyone to read Andy's article, as it does provide a great overview of the basic structure of our systems and philosophies about fighting click fraud.