What is ad fraud and how to prevent it?

Fraud attempts amount to 20 to 35 percent of all ad impressions throughout the year

-White Ops Bot Baseline 2018-2019 report

The report estimates media budget losses to ad fraud at about $5.8 billion globally in 2019. However, for the first time, the success rate of anti-fraud efforts is expected to exceed the success rate of ad fraud attempts. If no fraud attempts were blocked, the projected losses to ad fraud could climb to $14 billion globally every year.
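The two figures above hint at how much attempted fraud is being stopped. The sketch below is a back-of-the-envelope calculation, not something taken from the report: it assumes that $14 billion stands for the value of all fraud attempts and $5.8 billion for the part that still got through.

```python
# Back-of-the-envelope check using the two figures quoted above.
# Assumption (ours, not the report's): 14.0 is the value of all fraud
# attempts and 5.8 is the value that slipped through, in billions of USD.
ATTEMPTED_BILLION = 14.0
LOST_BILLION = 5.8


def blocked_share(attempted: float, lost: float) -> float:
    """Return the fraction of attempted fraud (by value) that was blocked."""
    return (attempted - lost) / attempted


if __name__ == "__main__":
    share = blocked_share(ATTEMPTED_BILLION, LOST_BILLION)
    print(f"Blocked: {share:.1%}, got through: {1 - share:.1%}")
```

Under that reading, a bit more than half of the attempted fraud (by value) is blocked, which is consistent with the claim that anti-fraud efforts now succeed more often than fraud attempts do.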

In this blog post, we’ll discuss ad fraud, the problems it causes, and how to counter it in a somewhat proactive manner using tools. We’ll go over some types of ad fraud that we believe are the most common, but won’t go into too much detail considering the huge variety of ad fraud methods and how technical some of them can be.

As stated, this article will focus more on ad fraud as a whole and using tools to counter it (mostly proactively/automatically). Please read our article that explains how to audit Display placements after your ads have already appeared on them (always reactively/manually).

What is ad fraud?

Ad fraud is the umbrella term for all activities whose sole purpose is to generate revenue from advertisers’ media budgets by fraudulently representing online ad impressions or clicks. In affiliate marketing, it can also refer to conversion fraud.

Ad fraud is often referred to as invalid traffic as well. Invalid traffic can be split into two categories:

General Invalid Traffic (GIVT):
Traffic from bots that is easy to identify as bot traffic, because the bots perform actions that humans would probably never do. Not all of this traffic is intended to be fraudulent and “eat” advertising budget (e.g. search engine crawlers).

Sophisticated Invalid Traffic (SIVT):
For SIVT, fraudsters go to great lengths to hide the traffic and/or make it look as if it was generated by an actual human. As a result, this kind of invalid traffic is much harder to detect. Some techniques of SIVT include (but are not limited to) hidden ads, domain spoofing, or malware-infected browsers or apps generating pageviews in the background.
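To make the difference between the two categories more concrete, here is a minimal sketch of the kind of check that catches GIVT. It assumes that well-behaved crawlers announce themselves in their User-Agent header; the token list is a short, hypothetical sample and real filter lists are much longer.

```python
# Hypothetical, deliberately short list of tokens that crawlers tend to
# include in their User-Agent header. Real lists are far more extensive.
DECLARED_BOT_TOKENS = ("googlebot", "bingbot", "crawler", "spider")


def is_declared_bot(user_agent: str) -> bool:
    """Return True when the User-Agent string openly identifies a bot."""
    ua = user_agent.lower()
    return any(token in ua for token in DECLARED_BOT_TOKENS)


if __name__ == "__main__":
    crawler = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    browser = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/120.0 Safari/537.36"
    print(is_declared_bot(crawler))  # openly declared crawler
    print(is_declared_bot(browser))  # looks like a regular browser
```

A bot behind SIVT simply sends a regular browser User-Agent and passes this check, which is exactly why that category is so much harder to detect.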

What’s in it for the fraudsters?

Of course, fraudsters wouldn’t get up in the morning (or they’d choose a different career path) if they couldn’t make money off of advertisers’ media budgets.

For every €1 an advertiser pays, a part gets paid out to the owner of the website that serves the ad (the transparency of the fees in some media buying technologies is a different discussion). It is this part of the advertising budget that ad fraud tries to capture by generating invalid traffic through various methods.

According to a study by Integral Ad Science, in the first half of 2019, up to about 12% of the traffic was fraudulent for non-fraud-optimized campaigns for both Desktop and Mobile web display. For Video, the fraud percentage was 9.6% for desktop and 4.7% for mobile web on average.

As most studies are global or for bigger markets such as the USA or China, it’s hard to truly estimate how much the Belgian advertising landscape is impacted. But I think we can conclude this:

Different types of fraud

Given the wide range of methods used for ad fraud, we’ll only discuss the most common ones. Know that the list below is far from exhaustive.

Bots

Probably the most common example of ad fraud is bot traffic. Bots nowadays can be programmed to mimic human behavior when browsing a site, watching a video, or clicking an ad to make it harder to detect.

In a lot of cases, bots run on devices like yours and mine without the owner knowing. Often this malware is installed when the user visits a malicious link that downloads the bot. The bot can then run browsers in the background in which it visits websites, clicks ads, and views videos.

This can also happen when you download a malicious app on your smartphone. The app then performs similar activities when, for example, your phone is locked, so you’ll never notice. In some cases, these bots also visit high-quality websites to pick up cookies, so they become part of a heavily targeted audience and are served more ads. By extension, remarketing campaigns are not fraud-proof either: your device has the remarketing cookie stored, so the bots running on it will trigger those ads as well.

Just last March, it was reported that over 50 apps were detected on the Google Play Store which ran malware without the users knowing it. Up to one million downloads were generated for these apps combined. These kinds of apps usually perform as advertised (games, utility apps like flashlights, calculators, or camera apps, …) but hijack your device to send invalid traffic to the fraudsters’ ad-serving websites or apps, thus making your own device part of a bot network.

Domain Spoofing

Domain spoofing can be seen as the identity theft of the digital ad space. It is when a fraudulent domain pretends to be another (premium) domain to sell its low-quality inventory at the higher prices generally associated with premium inventory. This tricks ad buyers into thinking their media budget is going to the high-quality website, while in reality the ad is served on a domain featuring low-quality, potentially brand-unsafe content.

With domain spoofing, the impressions may be served to actual humans rather than bots, but it’s still fraudulent because the ad buyer is deceived about where their ads appear and about what price per impression or click is justified.

Aside from deceiving the ad buyer, domain spoofing also potentially steals ad revenue from the publisher with high-quality content.

Domain spoofing also has the added benefit (for the fraudsters of course) that it can bypass whitelists.

Ad Stacking

Ad stacking is the practice of serving multiple ads on top of each other in the same ad slot. This means that only the top ad will be visible to users, while advertisers do pay for the underlying ad impressions which can never be seen by a human.
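A simple geometric check illustrates the idea. The sketch below assumes you somehow have the rendered position, size, and stacking order of every ad on a page (the `AdBox` fields are hypothetical names, not any tool's actual data model) and reports the ads that are completely covered by another ad.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdBox:
    """Rendered ad rectangle; all field names are hypothetical."""
    ad_id: str
    x: int
    y: int
    width: int
    height: int
    z_index: int  # higher value = rendered on top


def covers(top: AdBox, bottom: AdBox) -> bool:
    """True if `top` is rendered above `bottom` and fully covers it."""
    return (
        top.z_index > bottom.z_index
        and top.x <= bottom.x
        and top.y <= bottom.y
        and top.x + top.width >= bottom.x + bottom.width
        and top.y + top.height >= bottom.y + bottom.height
    )


def hidden_ads(boxes: list[AdBox]) -> list[str]:
    """Return the ids of ads that are completely covered by another ad."""
    return [
        box.ad_id
        for box in boxes
        if any(covers(other, box) for other in boxes if other is not box)
    ]


if __name__ == "__main__":
    # Three 300x250 ads stacked in the same slot: only "c" is visible.
    slot = [
        AdBox("a", 0, 0, 300, 250, z_index=1),
        AdBox("b", 0, 0, 300, 250, z_index=2),
        AdBox("c", 0, 0, 300, 250, z_index=3),
    ]
    print(hidden_ads(slot))  # ['a', 'b']
```

In this example the advertisers behind ads "a" and "b" pay for impressions that no human could ever have seen.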

Combinations

Of course, combinations of several techniques are a possibility and most likely the most lucrative for the fraudsters. Imagine a domain that stacks ads, pretends to be a premium inventory site through domain spoofing and uses bots to send huge amounts of traffic to generate impressions. Easy money.

Luckily, initiatives like digitaladtrust.be or digitaladtrust.fr issue a quality label to publishers based on 5 criteria, one of which is ad fraud. When making your own whitelists or evaluating the placements your ads appear on, definitely check out these websites!

Platforms most impacted

As stated above, if ad fraud weren’t lucrative, it wouldn’t happen. As a result, fraud happens most on platforms where the party providing the ad slots gets part of the advertiser’s media budget for serving their ads.

Because of this, any channel using banners on third-party websites, video campaigns on YouTube, or campaigns run on, for example, Facebook that serve outside of facebook.com will be most impacted.

For the same reason, we believe search campaigns running specifically on Google and Bing aren’t (or are, at least, less) impacted by ad fraud, although it can’t be ruled out. Search partners, on the other hand, can be affected, as websites qualifying as search partners get compensation for clicks generated.

Another Google environment we think isn’t impacted to a high degree is Gmail. No one benefits from impressions or clicks in Gmail ads but Google. The same cannot be said for ads on YouTube, where the channel owners get compensated.

On Facebook or Instagram, it’s probably similar to ads on Google Search or Gmail: no one benefits from impressions or clicks but Facebook, so fraud is less likely. However, Facebook also gives advertisers the option to run ads on its “Audience Network”, which functions the same way as Google’s Display Network and pays publishers for ads served or clicked.

How to fight it

While ad fraud will probably always be a step ahead, detection methods have to evolve continuously with it. No single technique can realistically detect or block all forms of fraud, which is why leveraging the power of machine learning will be required to keep up with the scale and evolution of ad fraud.

At CLICKTRUST, we’ve often ‘manually’ found suspicious placements on which ads appeared. These placements showed huge CTRs for display campaigns (3% or more, compared to a benchmark of 0.20%-0.50%). Additionally, several of these placements had the exact same layout, showed the same low-quality content, and were sometimes registered to the same person/company. These placements were added to our blacklist, which is applied by default to any campaign we run.
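The CTR check described above is easy to automate on a placement report. The sketch below is an illustration rather than our actual tooling: the 3% threshold comes from the text, while the minimum impression count and the placement names are assumptions added so that tiny samples don't trigger false alarms.

```python
def suspicious_placements(stats, ctr_threshold=0.03, min_impressions=1000):
    """Return placements whose CTR is at or above the threshold.

    stats maps a placement name to an (impressions, clicks) tuple.
    min_impressions is an assumed cut-off to ignore statistically
    meaningless samples; tune it to your own campaigns.
    """
    flagged = []
    for placement, (impressions, clicks) in stats.items():
        if impressions == 0 or impressions < min_impressions:
            continue
        if clicks / impressions >= ctr_threshold:
            flagged.append(placement)
    return sorted(flagged)


if __name__ == "__main__":
    report = {
        "news.example": (100_000, 300),      # 0.30% CTR, within benchmark
        "clickfarm.example": (20_000, 900),  # 4.50% CTR, suspicious
        "tiny.example": (50, 5),             # too few impressions to judge
    }
    print(suspicious_placements(report))  # ['clickfarm.example']
```

A high CTR alone is only a signal, so flagged placements still deserve a manual look (same layout, same content, same registrant) before they go on a blacklist.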

Simple rule-based detection techniques are easily evaded, which is where machine learning can help by identifying likely fraud patterns.

As a very simplified example, a rule-based technique could be:
“If the number of clicks/impressions generated by one device/user is above X per second, it’s most likely fraud because a human likely won’t browse that fast.”

Fraudsters would then quickly change their bot to switch browsers, devices, or IP addresses so it looks like multiple users, or slow it down to mimic more human-like behavior.
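The rule and its weakness can be sketched in a few lines. This is a simplified illustration: the event format, the device ids, and the limit of 2 clicks per second are assumptions, and clicks are counted in fixed one-second buckets rather than a sliding window.

```python
from collections import defaultdict


def flag_fast_clickers(events, max_clicks_per_second=2):
    """Return the device ids that exceed the per-second click limit.

    events is an iterable of (device_id, timestamp_in_seconds) tuples.
    Clicks are grouped per device into fixed one-second buckets.
    """
    counts = defaultdict(int)
    for device_id, timestamp in events:
        counts[(device_id, int(timestamp))] += 1
    return {
        device_id
        for (device_id, _second), clicks in counts.items()
        if clicks > max_clicks_per_second
    }


if __name__ == "__main__":
    # A naive bot: 10 clicks within one second from a single device id.
    naive_bot = [("bot-1", 0.1 * i) for i in range(10)]
    # The same 10 clicks, but spread over 10 fake device ids.
    evasive_bot = [(f"dev-{i}", 0.1 * i) for i in range(10)]

    print(flag_fast_clickers(naive_bot))    # {'bot-1'}
    print(flag_fast_clickers(evasive_bot))  # set()
```

The second call returns nothing: simply rotating the device id is enough to slip under a fixed rule, even though the traffic is identical.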

Machine learning can detect ad fraud faster than humans will ever be able to, even when fraudsters develop new techniques.

Preventing ad fraud can happen in two ways: pre- and post-bid. As we said before, we believe multiple techniques should be used in combination to keep ad fraud to a minimum, but we need to acknowledge that some ad fraud will always slip through the cracks, as no tool is perfect.

Pre-bid techniques mean that, based on certain rules, ads are blocked from serving even before the bid request is processed. For example, an exclusion list will make sure that your ads can’t “participate in the bidding” on certain domains or URLs.
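As a rough illustration of an exclusion list, the sketch below decides whether a bid is allowed for a given page. The domains and the function name are hypothetical, and it assumes the page URL is known and includes its scheme (https://...); in practice, the buying platform applies such lists for you.

```python
from urllib.parse import urlsplit

# Hypothetical exclusion list; real lists contain thousands of entries.
EXCLUDED_DOMAINS = {"clickfarm.example", "spam-site.example"}


def may_bid(page_url: str, excluded=EXCLUDED_DOMAINS) -> bool:
    """Return False when the page's host is an excluded domain or one of its subdomains."""
    host = (urlsplit(page_url).hostname or "").lower()
    return not any(
        host == domain or host.endswith("." + domain)
        for domain in excluded
    )


if __name__ == "__main__":
    print(may_bid("https://news.clickfarm.example/article"))  # False
    print(may_bid("https://publisher.example/"))               # True
```

Matching on the domain plus its subdomains (rather than a plain substring) avoids accidentally excluding unrelated sites whose name merely contains an excluded one.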

Post-bid techniques happen after the bid request is already processed. For example, the user agent, browser and/or device would be scanned for bot fingerprints or brand-unsafe signals. If these are detected, the ad will still be blocked from showing. In this case, there can be a difference between post-bid blocking, which actually blocks the ad from serving, and post-bid measurement, which only reports after the impression but does not block ads from serving. With post-bid measurement only, actions would still have to be taken to optimize campaigns so the ads no longer appear on these sites.

For example, a fraudulent website could use domain spoofing to send a false domain identity to the media buyer. You win the bid, so your ad is ready to be served on the fraudulent website. A post-bid ‘scan’ would then identify that the domain is actually spoofed and, depending on the technique or tool, either block the ad from serving or report on it later.
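The difference between post-bid blocking and post-bid measurement can be shown with a small sketch. It assumes the verification tool can observe the domain the ad actually renders on (how it does so is out of scope here); the names `Mode` and `post_bid_check` are hypothetical and not taken from any specific vendor.

```python
from enum import Enum


class Mode(Enum):
    BLOCK = "block"      # post-bid blocking: stop the ad from serving
    MEASURE = "measure"  # post-bid measurement: serve, but report afterwards


def post_bid_check(declared_domain: str, observed_domain: str, mode: Mode):
    """Return (serve_ad, report_entry) for a won bid.

    declared_domain is what the bid request claimed; observed_domain is
    where the ad is actually about to render (assumed to be known).
    """
    if declared_domain.lower() == observed_domain.lower():
        return True, None  # nothing suspicious, serve as usual
    report = f"declared {declared_domain}, observed {observed_domain}"
    serve_ad = mode is Mode.MEASURE
    return serve_ad, report


if __name__ == "__main__":
    print(post_bid_check("premium.example", "lowquality.example", Mode.BLOCK))
    print(post_bid_check("premium.example", "lowquality.example", Mode.MEASURE))
```

In blocking mode the spoofed impression never serves; in measurement mode it does, and it is up to you to act on the report, for instance by adding the observed domain to an exclusion list.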

Tools or services such as ClickCease, MOAT, Integral Ad Science, and DoubleVerify can help with battling ad fraud. Which tool to choose should be determined by your needs and the functionalities of these tools (pre- or post-bid?), which we won’t discuss in detail in this article.

Conclusion

At CLICKTRUST, we realize that reducing ad fraud is a no-brainer optimization task. Here are some of the things we do and can help you with:

Fast programmatic audits that show the percentage you have lost to ad fraud.
Establishing robust whitelists for future programmatic campaigns. (To make your own, digitaladtrust.be offers a quality label for Belgian publishers based on 5 different criteria.)
Testing multiple ad verification and pre-bid tools for programmatic & Google Ads campaigns.
Specific campaign tests to capture/detect bot traffic and learn more about its behavior.
Implementing massive blacklists that are maintained automatically.
Optimizing campaigns towards KPIs which are hard(er) for bots to generate.