Dealing with Frauds at Appsflyer

  • Date posted: 16-Apr-2017
  • Category: Software

Transcript of Dealing with Frauds at Appsflyer

Click Frauds: Dealing with frauds at Appsflyer

Business

Advertiser

Publisher

Click, Install

Who's the bad guy?

Advertiser, Publisher

What does the advertiser pay for?

  • Cost per impression
  • Cost per click
  • Cost per install
      • Fraud techniques of a different league
      • Fewer fraudulent installs than clicks/views, as CPI is usually much higher
  • Cost per action

Fraud methods

  • Programmatic (bots)
  • Humans

Fraud detection methods

  • Rule-based
      • Need expert knowledge of past fraud behaviour
      • Highly effective at detecting known fraud types
      • Ineffective at new types
  • Anomaly detection
      • Good for new kinds of deviations
      • Not good for known types of fraud
  • Supervised learning
      • Need examples of past fraud
      • Can be effective at detecting similar occurrences
      • Ineffective at new types of fraud

Rule-based

  • Unrecognized user agent string
      • Mozilla/4.0 (compatible; MSIE 4.5; Windows 98; )
  • Wrong IMEI
  • Too many applications installed from the same device
  • Frequent re-installs on a specific device
  • Same device installs from many different geographical locations
  • Suspiciously short time between click and install
  • iOS app install receipt can't be validated by iTunes
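
A few of the rules above can be sketched as plain predicates over an install record. This is a minimal illustration, not Appsflyer's implementation: the `Install` fields, the thresholds, and the list of accepted user agent prefixes are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical thresholds; real values would come from expert knowledge of past fraud.
MIN_CLICK_TO_INSTALL = timedelta(seconds=10)
MAX_INSTALLS_PER_DEVICE = 50
KNOWN_UA_PREFIXES = ("Mozilla/5.0", "Dalvik/")  # assumed whitelist, for illustration only


@dataclass
class Install:
    device_id: str
    user_agent: str
    click_time: datetime
    install_time: datetime


def violated_rules(install, installs_on_device):
    """Return the names of the rules this install violates (empty list = clean)."""
    reasons = []
    if not install.user_agent.startswith(KNOWN_UA_PREFIXES):
        reasons.append("unrecognized_user_agent")
    if install.install_time - install.click_time < MIN_CLICK_TO_INSTALL:
        reasons.append("click_to_install_too_short")
    if installs_on_device > MAX_INSTALLS_PER_DEVICE:
        reasons.append("too_many_installs_on_device")
    return reasons


suspect = Install(
    device_id="device-1",
    user_agent="Mozilla/4.0 (compatible; MSIE 4.5; Windows 98; )",
    click_time=datetime(2017, 4, 16, 12, 0, 0),
    install_time=datetime(2017, 4, 16, 12, 0, 2),
)
print(violated_rules(suspect, installs_on_device=3))
# ['unrecognized_user_agent', 'click_to_install_too_short']
```

Returning the list of violated rules, rather than a single boolean, keeps the reason for each decision available for later review.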

Anomaly detection: k-means clustering

k-means clustering

  • Choosing features
      • Normally distributed values (or half-normally)
  • Normalizing data
      • Custom normalizer
      • StandardScaler (Spark >= 1.4)
  • Choose number of clusters
      • Iterate on different numbers of clusters
      • Evaluate clustering score
  • Build k-means model
  • Find vectors with P(x) below a chosen threshold
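
The normalization step can be illustrated without Spark. The sketch below is a plain-Python z-score scaler in the spirit of a custom normalizer. It is not the Spark `StandardScaler` API, it uses the population standard deviation as a simplifying assumption, and the two feature columns are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Scale every column to zero mean and unit variance (z-scores)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical features per device: (installs per day, seconds from click to install)
raw = [[1.0, 300.0], [2.0, 240.0], [3.0, 360.0], [2.0, 300.0]]
scaled = standardize(raw)
for row in scaled:
    print(row)
```

Without this step, the feature with the largest numeric range (here, seconds) would dominate the Euclidean distances that k-means relies on.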

k-means clustering - parsing

k-means clustering - feature selection

k-means clustering - finding K

k-means clustering - find anomalies
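
The code for these slides did not survive in the transcript. As a stand-in, here is a small self-contained sketch of the same pipeline in plain Python (3.8+ for `math.dist`), not the Spark code from the talk. Two choices are assumptions made for this example: centroids are seeded deterministically with a farthest-first rule, and "anomalous" means "farther than a fixed distance from the nearest centroid", which replaces the P(x) criterion named on the slide. The data and the threshold are made up.

```python
import math


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    best = min(range(len(centroids)), key=distances.__getitem__)
    return best, distances[best]


def initial_centroids(points, k):
    """Farthest-first seeding: start at points[0], then add the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: nearest(p, centroids)[1]))
    return centroids


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns the final centroids."""
    centroids = initial_centroids(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)[0]].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            [sum(col) / len(cluster) for col in zip(*cluster)] if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids


def clustering_cost(points, centroids):
    """Sum of squared distances to the nearest centroid (lower is tighter)."""
    return sum(nearest(p, centroids)[1] ** 2 for p in points)


def find_anomalies(points, centroids, threshold):
    """Points farther than threshold from every centroid."""
    return [p for p in points if nearest(p, centroids)[1] > threshold]


# Two tight groups plus one point that belongs to neither (made-up data).
points = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],
    [2.5, 2.5],
]

# Finding K: compare the clustering cost for several cluster counts.
for k in (1, 2, 3):
    print(k, clustering_cost(points, kmeans(points, k)))

centroids = kmeans(points, 2)
print(find_anomalies(points, centroids, threshold=2.0))  # [[2.5, 2.5]]
```

Note that raising K far enough gives the outlier a cluster of its own, at which point a distance test no longer flags it, so the cost curve should be read together with cluster sizes.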

Supervised learning

  • Logistic regression
  • Decision tree
  • Random forests

  • Training set {x1, x2, ..., xN} -> E
  • Train the model
  • Validate, then train again..
  • Test
  • Apply!
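
The train, validate, test loop can be shown with the first model on the list. The sketch below fits a logistic regression by batch gradient descent in plain Python. The feature meanings, the labelled examples, and the hyperparameters are all invented for illustration, and a production system would more likely use a library implementation.

```python
import math


def predict_proba(weights, bias, x):
    """Probability that sample x is fraud under the current model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=2000, lr=0.1):
    """Fit logistic regression with batch gradient descent; returns (weights, bias)."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(samples, labels):
            error = predict_proba(weights, bias, x) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(weights, bias, samples, labels):
    """Fraction of samples classified correctly at a 0.5 cut-off."""
    hits = sum(
        (predict_proba(weights, bias, x) >= 0.5) == bool(y)
        for x, y in zip(samples, labels)
    )
    return hits / len(samples)


# Hypothetical features: (minutes from click to install, installs on device / 10).
# Label 1 = known fraud, 0 = legitimate.
train_x = [[0.1, 3.0], [0.2, 2.5], [0.3, 3.5], [0.2, 4.0],
           [2.0, 0.2], [3.0, 0.1], [2.5, 0.3], [4.0, 0.1]]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]
valid_x, valid_y = [[0.25, 3.0], [2.8, 0.2]], [1, 0]
test_x, test_y = [[0.15, 3.2], [3.5, 0.2]], [1, 0]

weights, bias = train(train_x, train_y)
print("validation accuracy:", accuracy(weights, bias, valid_x, valid_y))
print("test accuracy:", accuracy(weights, bias, test_x, test_y))
```

The validation set is where hyperparameters such as `epochs` and `lr` would be tuned before retraining. The test set is touched only once, at the end.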

Action items

  • Drop fraudulent requests
      • Pros:
          • Less traffic goes through the system
      • Cons:
          • False positives
          • Must capture all the frauds as they come in
  • Mark transactions that are fraud (in our opinion)
      • Pros:
          • Let customer decide what to do
          • Allows offline fraud detection
  • Mixed approach
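
The slide does not spell out the mixed approach. One possible reading, sketched below, is to drop only near-certain fraud and mark everything in the grey zone. The fraud score, both thresholds, and the `Action` names are assumptions made for this example.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    MARK = "mark"  # keep the transaction but flag it for the customer
    DROP = "drop"  # reject the request before it enters the system


# Hypothetical thresholds on a fraud score in [0, 1].
DROP_THRESHOLD = 0.95
MARK_THRESHOLD = 0.60


def decide(fraud_score):
    """Mixed approach: drop only near-certain fraud, mark the grey zone."""
    if fraud_score >= DROP_THRESHOLD:
        return Action.DROP
    if fraud_score >= MARK_THRESHOLD:
        return Action.MARK
    return Action.ACCEPT


for score in (0.10, 0.70, 0.99):
    print(score, decide(score).value)
```

A high drop threshold limits the cost of false positives, while marked transactions remain available for offline detection and for the customer's own decision.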

Thank you!