Click fraud is usually defined as the act of purposely clicking on ads on pay-per-click programs with no interest in the target web site. Two types of fraud are usually mentioned:
An advertiser clicking on competitors' ads to deplete their ad budgets, with fraud frequently taking place early in the morning and through multiple distribution partners: AOL, Ask.com, MSN, Google, Yahoo, etc.
A malicious distribution partner trying to increase its income, using clickbots or paid human beings to generate traffic that looks like genuine clicks.
While these are two important sources of non-converting traffic, there are many other sources of poor traffic. Some of them are sometimes referred to as invalid clicks rather than click fraud, but from the advertiser or publisher viewpoint, there is no difference. In this paper, we consider all types of non-billable or partially billable traffic, whether or not it is the result of fraud, whether or not there is intent to defraud, and whether or not there is a financial incentive to generate the traffic in question. These sources of undesirable traffic include:
Accidental fraud: a home-made robot not designed for click fraud purposes, running loose, out of control, clicking on every link, possibly because of a design flaw. An example is a robot run by spammers harvesting email addresses. This robot was not designed for click fraud purposes, but nevertheless ended up costing advertisers money.
Political activists: people with no financial incentives, but motivated by hate. This kind of clicking activity has been found against companies recruiting people in class action lawsuits, and results in artificial clicks and bogus conversions. It is a pernicious kind of click fraud because the victim thinks its PPC campaigns generate many leads, while in reality most of these leads (email addresses) are bogus.
Disgruntled individuals: this could be a recently fired employee of a PPC advertiser or a search engine, or a publisher who believes it was unjustifiably banned.
Unethical players in the PPC community: small search engines trying to make their competitors look bad by generating unqualified clicks, or shareholder fraud.
Organized criminals: spammers and other internet pirates accustomed to running bots and viruses, who found that their tools could be programmed to generate click fraud. Terrorism funding falls into this category, and is investigated by both the FBI and the SEC.
Hackers: many people now have access to home-made web robots (the source code in Perl or Java is available for free). While it is easy to fabricate traffic with a robot, it is more complicated to emulate legitimate traffic, as it requires spoofing thousands of ordinary IP addresses, which is not something any amateur can do well. Some individuals might see this as a challenge and generate high-quality emulated traffic just for the sake of it, with no financial incentive.
Traditional media: outlets losing market share to PPC advertising have an incentive to contribute to click fraud.
In this paper, we will be even more general by encompassing other sources of problems not generally labeled as click fraud, but sometimes referred to as invalid, non-billable, or low-quality clicks. These include:
Impression fraud: impressions and clicks should always be considered jointly, not separately. This can be an issue for search engines, as they need to join very large databases and match users with both impressions and clicks. In some schemes, fraudulent impressions are generated to make a competitor's CTR look low. Advanced schemes use good proxy servers (e.g. AOL) to hide the activity. When the CTR drops low enough, the competitor's ad is no longer displayed. This scheme is usually associated with self-clicking, a practice where an advertiser clicks on its own ads through proxy servers to improve its ranking, and thus its position in search result pages. This scheme targets both paid and organic traffic.
Multiple clicks: while multiple clicks are not necessarily fraudulent, they end up either (i) costing advertisers a lot of money when they are billed at the full price or (ii) costing publishers and search engines a lot of money if only the first click is charged for. Another issue is how to accurately determine that two clicks (say, five minutes apart) are attached to the same user.
Fictitious fraud: clicks that appear fraudulent, but are never charged for. These clicks can be made up by unethical click fraud companies. Or they can be the result of testing campaigns, in which case we call them click noise. A typical example is Googlebot. While Google never charges for clicks originating from its Googlebot robot, other search engines that do not have the most up-to-date list of Googlebot IP addresses might accidentally charge for these clicks. Another example of fictitious fraud further discussed in this paper is fictitious clicks. We explain what fictitious clicks are and how they can be detected.
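The multiple-click problem described above can be illustrated with a minimal sketch. It assumes that a user is identified by a hypothetical user_key (for instance IP address plus user agent, which is only an approximation of a real user) and that repeat clicks on the same ad within a five-minute window are billed only once. The field names, the window, and the billing rule are all assumptions made for illustration, not a description of how any search engine actually bills.

```python
from datetime import datetime, timedelta


def collapse_repeat_clicks(clicks, window=timedelta(minutes=5)):
    """Keep one billable click per (user_key, ad_id) within `window`.

    A click is dropped when it arrives no later than `window` after the
    last click that was kept for the same user and ad.
    """
    last_kept = {}  # (user_key, ad_id) -> time of the last kept click
    billable = []
    for click in sorted(clicks, key=lambda c: c["time"]):
        key = (click["user_key"], click["ad_id"])
        previous = last_kept.get(key)
        if previous is None or click["time"] - previous > window:
            billable.append(click)
            last_kept[key] = click["time"]
    return billable


# Hypothetical click log: user_key stands in for "IP address + user agent".
clicks = [
    {"user_key": "ip1|ua1", "ad_id": "A", "time": datetime(2007, 4, 15, 9, 0)},
    {"user_key": "ip1|ua1", "ad_id": "A", "time": datetime(2007, 4, 15, 9, 3)},
    {"user_key": "ip1|ua1", "ad_id": "A", "time": datetime(2007, 4, 15, 9, 30)},
    {"user_key": "ip2|ua1", "ad_id": "A", "time": datetime(2007, 4, 15, 9, 1)},
]
billable = collapse_repeat_clicks(clicks)
print(len(billable))  # 3: the 9:03 repeat from ip1|ua1 is collapsed
```

The hard part in practice is the choice of user_key: users behind a shared proxy such as AOL look identical under an IP-based key, so a real system would need a richer identifier than the one assumed here.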
Posted by maria valadez on 2009-04-12.
Category: Click Scoring
Scores below 425 correspond to clicks that are clearly unbillable
Spike at the very bottom and very top
50% of the traffic has good scores
In this scorecard, a drop of 50 points represents a 50% drop in conversion rate: clicks with a score of 700 convert twice as frequently as clicks with a score of 650.
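The scorecard rule above can be written as a small formula. The sketch below assumes that the halving rule compounds uniformly across the whole score range, which is an extrapolation from the single 700 versus 650 example; the function and parameter names are illustrative.

```python
def relative_conversion_rate(score, reference_score, halving_points=50):
    """Conversion rate at `score` relative to `reference_score`.

    Assumes every drop of `halving_points` halves the conversion rate.
    """
    return 2 ** ((score - reference_score) / halving_points)


print(relative_conversion_rate(700, 650))  # 2.0
print(relative_conversion_rate(600, 700))  # 0.25
```

Under this assumption, a click scored 600 would convert a quarter as often as a click scored 700.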
From http://clickscoring.blogspot.com/2007/04/typical-click-score-distribution.html - 4/15/2007 11:23:00 AM
Posted by Work At Home Net Labs on 2009-04-12.
Category: Click Scoring
Click Fraud Attacks: Emerging Trends
Click fraud attacks have become significantly more sophisticated over the last few months. At the same time, click fraud detection systems are becoming increasingly efficient at detecting smart attacks. Here, we describe three cases that were caught by Authenticlick over the last seven days.
Bogus Conversions
Over a period of several months, a single distribution partner generating well over 1% of the traffic from the leading search engine network was responsible for up to 15% of the downstream conversions. All these conversions were found to be fake. The distribution partner in question was targeting advertisers where conversions consist of filling out a web form. These advertisers are an easy target for smart fraudsters. In addition to generating bogus conversions, the culprit operated from abroad and experienced an unusually fast rate of exponential growth over the last two years.
Fraud through AOL and other "good proxies"
Another fraud case was identified last week, generating a large proportion of clicks from known good proxies including AOL. This type of scheme is more difficult to detect. Authenticlick was able to unearth the fraudulent activity thanks to advanced methodology based on network topology metrics. It is interesting to note that the fraud scheme was detected even though the data submitted by the search engine did not include any information about the user agent.
Fraud involving a symbiotic relationship between a distribution partner and an advertiser
This interesting fraud case involves a very large number of IP addresses, but a very small number of advertisers. It was first identified by Authenticlick in April 2007. It is believed that either the advertiser and the fraudster have a symbiotic relationship, or the advertiser is a victim who benefits from click fraud as the fraudster improves the victim's ROI, through a particular type of fraud described here.
Additional Notes about Adware
The last fraud case discussed in this article is particularly interesting in the sense that it almost certainly involves viruses (adware or spyware) installed and remotely controlled on thousands of computers. Two types of viruses are currently active:
The first type actually triggers Internet Explorer and is best described in Google's paper. It is an Internet Explorer parasite. This type of virus is easier to detect as it generates too many clicks per user.
The second type of hitbot does not rely on Internet Explorer to trigger clicks. Instead, it has its own code to communicate over HTTP. This type of virus, more widespread than the first, is more difficult to detect. Yet, because it relies on user agent lookup tables to generate clicks, Authenticlick has been able to identify this type of fraudulent activity: criminals have (so far) not been able to correctly replicate the expected underlying multivariate distributions. Also note that we have developed a patented solution to catch this type of fraud.
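To give a rough idea of why user agent lookup tables leave a statistical trace, here is a simplified, single-variable sketch. It is not Authenticlick's patented method, which is described above as multivariate and is not detailed in this post. It compares the observed browser mix of a traffic source against an expected mix using Pearson's chi-square statistic. The expected shares and the counts are made-up numbers, and the cutoff of 13.82 is the standard chi-square critical value for 2 degrees of freedom at the 0.001 level.

```python
def chi_square_statistic(observed_counts, expected_shares):
    """Pearson chi-square statistic of observed counts vs. expected shares.

    Assumes `expected_shares` sums to 1, has no zero entries, and covers
    every category that appears in `observed_counts`.
    """
    total = sum(observed_counts.values())
    statistic = 0.0
    for category, share in expected_shares.items():
        expected = total * share
        observed = observed_counts.get(category, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical expected browser mix for genuine traffic.
expected = {"IE": 0.6, "Firefox": 0.3, "Other": 0.1}

# Hypothetical sources: one close to the expected mix, one where a bot
# picks user agents almost uniformly from a lookup table.
genuine_source = {"IE": 590, "Firefox": 310, "Other": 100}
lookup_table_bot = {"IE": 340, "Firefox": 330, "Other": 330}

CRITICAL_VALUE = 13.82  # chi-square, 2 degrees of freedom, p = 0.001

for name, counts in [("genuine", genuine_source), ("bot", lookup_table_bot)]:
    stat = chi_square_statistic(counts, expected)
    print(name, round(stat, 2), "suspicious" if stat > CRITICAL_VALUE else "ok")
```

A real detector would look at several variables jointly (user agent, IP block, time of day, and so on) rather than one marginal distribution, since a careful fraudster can match any single distribution.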
From http://clickscoring.blogspot.com/2007/04/click-fraud-attacks-emerging-trends.html - 4/22/2007 8:06:00 PM
How Can Advertisers Benefit from Click Scoring?
Since click fraud detection is a rudimentary application of click scoring, one thinks of click scoring as a tool to eliminate unqualified traffic. Click scoring can actually do much more, such as determine the optimum pricing associated with a click, identify new sources of potentially converting traffic, measure traffic quality in the absence of conversions or in the presence of bogus conversions, and assess the quality of distribution partners, to name a few applications. Also note that scoring is not limited to clicks but can also involve impressions and metrics such as clicks per impression.
From the advertiser viewpoint, one important application of click scoring is to detect new sources of traffic to improve total revenue, in a way that cannot be accomplished through A/B/C testing, traditional ROI optimization or SEO. The idea consists of tapping into carefully selected new traffic sources rather than improving existing ones.
Let us consider a framework where we have two types of scores:
Score I: generic score computed using a pool of advertisers, possibly dozens of advertisers from the same category.
Score II: customized score specific to a particular advertiser.
What can we do when we combine these two scores? Here's the solution:
Scores I and II are good. This is usually one of the two traffic segments that advertisers are considering. Typically advertisers focus their efforts on SEO or A/B testing to further refine the quality and gain a little edge.
Score I is good and score II is bad. This traffic is usually rejected. No effort is made to understand why the good traffic is not converting. Advertisers rejecting this traffic might miss major sources of revenue.
Score I is bad and score II is good. This is the other traffic segment that advertisers are considering. Unfortunately, this situation makes advertisers happy: they are getting conversions. However, this is a red flag, indicating that the conversions might be bogus. This happens frequently when conversions consist of filling out web forms. Any attempt to improve conversions (e.g. through SEO) is counter-productive. Instead, the traffic should be seriously investigated.
Scores I and II are bad. Here, most of the time, the reaction consists of dropping the traffic source entirely and permanently. Again, this is a bad approach. By reducing the traffic using a schedule based on click scores, one can significantly lower exposure to bad traffic and at the same time not miss the opportunity when the traffic quality improves.
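The four cases above can be summarized as a simple decision table. The sketch below is only an illustration: the cutoff of 600 separating "good" from "bad" scores is a hypothetical value, and the action labels are shorthand for the recommendations given in the text.

```python
def traffic_segment_action(score_generic, score_custom, threshold=600):
    """Map Score I (generic) and Score II (advertiser-specific) to an action.

    `threshold` is a hypothetical cutoff between good and bad scores.
    """
    generic_good = score_generic >= threshold
    custom_good = score_custom >= threshold
    if generic_good and custom_good:
        return "keep and refine"
    if generic_good:
        return "investigate why good traffic does not convert"
    if custom_good:
        return "red flag: check for bogus conversions"
    return "throttle by score and keep monitoring"


print(traffic_segment_action(720, 690))  # keep and refine
print(traffic_segment_action(720, 480))  # investigate why good traffic does not convert
print(traffic_segment_action(450, 700))  # red flag: check for bogus conversions
print(traffic_segment_action(430, 410))  # throttle by score and keep monitoring
```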
This discussion illustrates how scoring can help advertisers substantially improve their revenue.
Case Study
We have applied this concept to optimize the traffic on a partner website, where conversions consist of filling out a web form to subscribe to a newsletter.
One source representing 25% of the traffic was producing negative results, even though the scores were very high. After investigating the case, we realized that the landing page was not targeted at the user segment in question. After modifying the content to better target these users, the website experienced a substantial increase in page views and visit depth, and higher revenue. Eventually we decided to increase this source to 50% of the total traffic.
Another source represented 2% of the paid clicks but 30% of the conversions from a major network. After investigation, all conversions originating from this source (most of them bogus) were discarded, but the source continued to be monitored. Without this discovery, the partner would have been sending newsletters to thousands of people who never actually subscribed, without knowing it (until complaints arrived).
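The imbalance in the second source (2% of clicks, 30% of conversions) can be expressed as a ratio of conversion share to click share. The sketch below uses hypothetical counts chosen to reproduce those two percentages; the function name and the review cutoff of 5 are assumptions, not values from the case study.

```python
def conversion_lift(source_clicks, total_clicks,
                    source_conversions, total_conversions):
    """Ratio of a source's share of conversions to its share of clicks.

    A value near 1 means the source converts like the rest of the network;
    a very large value suggests conversions worth investigating.
    """
    return (source_conversions * total_clicks) / (total_conversions * source_clicks)


# Hypothetical counts: 200 of 10,000 clicks (2%), 30 of 100 conversions (30%).
lift = conversion_lift(200, 10_000, 30, 100)
print(lift)  # 15.0

REVIEW_CUTOFF = 5  # hypothetical threshold for manual review
print("review" if lift > REVIEW_CUTOFF else "ok")  # review
```

A high ratio is not proof of fraud on its own, since a well-targeted source can legitimately convert better than average, which is why the source in the case study was investigated rather than dropped automatically.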
From http://clickscoring.blogspot.com/2007/04/how-can-advertisers-benefit-from-click.html - 4/15/2007 9:30:00 PM
Posted by maria valadez on 2009-04-12.
Category: Click Scoring
New fraud scheme on Google (phishing / click fraud)
Fraudsters send you a fake email about your AdWords account being terminated. They ask you to renew your account by logging in to a fake Google AdWords website that looks real. That's how they steal your login and password. Once your account is hijacked, they increase your daily budget and your bids for keywords that are part of their botnet system. In the process, they might also steal your credit card information or other useful information (your address for identity theft, your keyword list to feed their botnet).
Complaint received from a client:
Vincent....don't know if you'd be interested in this...but i use google ad words & just recently someone hacked into my profile & changed my daily max from $10 to $6,810, and then miraculously i received over 1,000 clicks that day at $5.50 per click....they were trying to charge me over $7K. I reported it and about a week later they admitted it was not legitimate. Have you heard of this
Email sent by fraudsters:
Renew Your Account Now !
Dear Member,
This is your official notification from Google Inc. that the service(s) listed below will be deactivated and deleted if not renewed immediately.
As the Primary Contact, you must renew the service(s) listed below or it will be deactivated and deleted.
Renew Now your Google AdWords services. [link deleted]
SERVICE: Google AdWords EXPIRATION: August, 19 2008
Thank you for using Google Inc service. We appreciate your business and the opportunity to serve you.
Google AdWords Service .
From http://clickscoring.blogspot.com/2008/08/new-fraud-scheme-on-google-phishing.html - 8/19/2008 2:57:00 AM
Invitation to join Analytic Bridge
Analytic Bridge has grown from 20 to about 400 people in just one week. We invite you to revisit our network, and sign up if you are not already a member.
In the last seven days, we have added many groups, several white papers, and dozens of useful links. Also, members have contributed to several forums, including:
Explanation of Variance Inflation Factor
Data Validation
Post your best graphs in our photo section
How to produce nice graphs with R?
Data Warehousing, ETL and Business Intelligence opportunities
Professional Certificates (chartered statistician, SAS certified, series 6, etc.)
Spatial ETL Pros Needed for Leader in Geographic Business Intelligence Solutions
Genetic Data Mining Method for the Proper Use of the Correlation Coefficient
Who makes $100K or more a year?
Companies hiring statisticians and data miners
Best books for learning data mining
Basic Introduction to Text Mining
Non-Linear ARIMA using neural nets?
Statistics handbooks now available in the links section
Jobs in Switzerland
Interesting discussions on the Web Analytics group
Data mining blog
LinkedIn, Plaxo, Facebook and other networks
Building Statistical Regression Models: Straight Data are Necessary
Domain names for sale
Career paths: switching to a different industry
Generalized Goldbach Conjecture and Integer Coverages
From http://clickscoring.blogspot.com/2008/02/invitation-to-join-analytic-bridge.html - 2/24/2008 10:29:00 PM
Massive Click Fraud Case Unearthed in our Laboratory
Here we provide specific details about a widespread botnet still operating. As many as 50% of all advertisers may be victims, albeit with a low frequency. It is connected with a particular search distribution partner on the largest search engine network. We will call it Spiralup, although its real name is different. Their brand is associated with spyware, though they have clearly added click fraud to their areas of focus.
Their traffic has been growing exponentially over the last few years, according to Alexa (see graph below). Note that Alexa can’t always discriminate between real and fake traffic. Software (AlexaBooster) is available which allows a user to artificially inflate Alexa rankings.
Note two sharp dips in early 2006 and 2007 (see graph below).
In 2006, the browser distribution was different, with more Firefox, possibly indicating a network of human beings paid to click.
In 2007, the browser distribution shifted, favoring Internet Explorer, as they employ a botnet programmed specifically for IE but not for other browsers.
They continually add new advertisers to their target list, but rarely generate more than 3 clicks per day per advertiser. Newly infected computers are assigned to advertisers recently added to their list.
Advertisers accepting clicks from foreign countries, and small advertisers, are hit hardest.
A portion of their traffic is real, a portion of it is bogus, generated by botnets (clicking agents attached to viruses), and a portion of it comes from human beings paid to click according to a pre-specified schedule.
Because they have infected so many computers, they are able to use a very large pool of IP addresses, though the traffic skews towards international, and some specific IP blocks and foreign transparent proxies are widely used.
Their traffic patterns are associated with unrealistic variances and they generate an extremely high proportion of bogus conversions.
The first click was billed at full price (even days later, the charge did not disappear). It resulted in a bogus conversion. It also triggered an HTTP request on the target page for a blank stylesheet.
This means that the botnet is a parasite of Internet Explorer, and does not have its own code to connect to the Internet, but rather relies on Internet Explorer to do so.
All four clicks have IE 6 as a user agent, as one would expect.
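The "unrealistic variances" mentioned in the observations above can be illustrated with a simple dispersion check. This is a sketch under a stated assumption, not the method used in the investigation: it assumes that genuine daily click counts for an advertiser fluctuate at least as much as a Poisson process would (variance roughly equal to or larger than the mean), whereas clicks generated on a fixed schedule of about 3 per day show almost no variance. The sample counts are hypothetical.

```python
from statistics import mean, pvariance


def dispersion_index(daily_clicks):
    """Variance-to-mean ratio of daily click counts (None if there are no clicks)."""
    average = mean(daily_clicks)
    if average == 0:
        return None
    return pvariance(daily_clicks) / average


# Hypothetical daily clicks from one source to one advertiser over a week.
scheduled_source = [3, 3, 2, 3, 3, 3, 2]  # steady, capped near 3 per day
organic_source = [1, 6, 0, 4, 9, 2, 5]    # irregular, as human traffic tends to be

print(dispersion_index(scheduled_source))
print(dispersion_index(organic_source))
```

A ratio far below 1 over many days and many advertisers would be one hint of scheduled clicking; with only a week of data for a single advertiser, as in this toy example, the estimate is too noisy to draw conclusions.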
[Graph: Spiralup's exponential traffic growth]
From http://clickscoring.blogspot.com/2007/07/massive-click-fraud-case-unearthed-in.html - 7/3/2007 6:04:00 PM
Posted by maria valadez on 2009-04-12.
Category: Click Scoring