AML and Data Analytics - Rutgers...

AML and Data Analytics Presentation by Deniz Appelbaum, PhD for the 16th Fraud Seminar Dec 1st, 2017 Newark NJ with special thanks to Sunder Gee, author of “Fraud and Fraud Detection”


The AML Process

• MONEY LAUNDERING IS A FINANCIAL TRANSACTION SCHEME to conceal or attempt to conceal the identity of proceeds illegally obtained so that the proceeds appear to come from legitimate sources.

• IMF estimate: money laundering amounts to 2-5% of global GDP (www.fatf-gafi.org/faq/moneylaundering), or 800 billion to 2 trillion dollars

• Money laundering disguises the illegal origin and legitimizes the funds so they can be openly used.

• Three main stages: placement, layering, and integration

Placement: The riskiest phase

• Direct connection to money source

• Most legislation is developed to prevent/detect this stage

• Large amounts of cash are chunked or disguised to escape detection/alerts

• “Smurfing”/structuring: splitting larger amounts so as to avoid detection (below 10K)

• Physically move money

• Exchanging to alternative currencies

• Purchasing gems, bitcoins, money orders, cashiers checks
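The structuring pattern described above (several deposits, each kept under 10K) can be sketched as a simple rolling-window test. This is only an illustration: the 10,000 threshold comes from the slide, while the three-day window, the tuple layout, and the function name are assumptions, not a regulatory rule or any vendor's method.

```python
from collections import defaultdict
from datetime import date, timedelta

THRESHOLD = 10_000          # reporting threshold mentioned on the slide
WINDOW = timedelta(days=3)  # assumed look-back window (hypothetical)

def flag_structuring(deposits, threshold=THRESHOLD, window=WINDOW):
    """Return accounts whose sub-threshold cash deposits add up to at least
    the threshold within any rolling window of the given length.

    deposits: iterable of (account_id, deposit_date, amount) tuples.
    """
    by_account = defaultdict(list)
    for account, day, amount in deposits:
        if 0 < amount < threshold:  # keep only deposits that individually avoid the threshold
            by_account[account].append((day, amount))

    flagged = set()
    for account, items in by_account.items():
        items.sort()
        start = 0
        running = 0.0
        for end, (day, amount) in enumerate(items):
            running += amount
            # Drop deposits that have fallen out of the window.
            while day - items[start][0] > window:
                running -= items[start][1]
                start += 1
            # Require at least two deposits inside the window.
            if running >= threshold and end > start:
                flagged.add(account)
                break
    return flagged

deposits = [
    ("A-1", date(2017, 3, 1), 9_500),
    ("A-1", date(2017, 3, 2), 9_800),
    ("B-2", date(2017, 3, 1), 4_000),
    ("B-2", date(2017, 4, 15), 7_000),
]
print(flag_structuring(deposits))
```

A flagged account is only a lead for review; legitimate cash-heavy businesses can produce the same pattern.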

Layering: the most complex stage

• Where the origin of the money is being made difficult to trace.

• A number of transactions or layers need to be put between the original sources of the funds before they are brought back into the legal economy.

• Funds might be moved to foreign countries that usually have strong bank secrecy laws, moved into accounts in the name of others who are nominees, or moved to accounts held by offshore corporations where the beneficial ownership is hidden and the funds can be withdrawn and redeposited to a number of other accounts.

• Other layering tools and techniques include:

• Bank secrecy laws

• Offshore banks

• Tax havens

• Shell corporations

• Trade-based laundering

• Digital currencies

Integration: Re-Introduction phase

• In the integration stage, the money enters back into the legitimate economy where it appears to have come from legal and normal transactions.

• Difficult to detect, unless there is a PAPER TRAIL

• Depending on the layering stage, the return may appear to come from a sale of assets such as real estate.

• False loans from off-shore companies

• False inheritances

• False gambling winnings

• Credit cards issued by off-shore banks

• Salary from false business

• Importing/exporting/distribution

• Co-mingling of funds with legitimate businesses (high cash based businesses)

A money laundering scheme cannot be successful unless the paper trail is eliminated or made very complex!

Example:

Money Laundering Methods

• Using a Front Business to Launder Funds

• Seemingly legitimate business

• Comings and goings will not stand out

• Provides cover for delivery and transportation related to laundering activities

• Expenses from illegal activity can be concealed

• Overstating Revenues and Expenses

• Depositing, but NOT RECORDING, revenue


Money Laundering Methods: Luxury Antiques

Money Laundering Methods

• Overstating Expenses to make payoffs, buy illegal goods, other illegal investments

• Padding expense accounts

• Payments for supplies never received

• Fees to suspicious consultants

• Salaries for non-existent employees

• Basically:

• Fictitious employees

• Fictitious fees or vendors

• Inflated invoices


Money Laundering Methods

• Depositing, but not recording, revenue (cash)

• “Loan proceeds”

• “Sale of property”

• “Capital investments”

• Check documentation!

Money Laundering Methods

Characteristics of Favorite Businesses for Hiding or Laundering Money

• Revenue:

• Revenue base is difficult to measure

• Cash transactions

• Variable amounts

• Expense:

• Variable and tough to measure

• History:

• Ethnic ties

• Supplier/customer ties

Money Laundering Methods

• Bars, restaurants, and night-clubs

• High or variable prices

• Varied clientele

• Lots of traffic

• Cash

• Fast food (although lower $, mainly cash)

Examine traffic versus books

Money Laundering Methods

• Vending Machine Operations

• Highly variable

• Hard to measure volume of cash receipts and expenses

• Wholesale Distribution

• Diverse product line

• Falsified invoices

• Fake Vendors

• Fake Customers

• Expenses easy to inflate


The Real Estate Industry

• Present a broad range of options

• Multiple parties, layering is possible

• Obscure funding source

• Loan-back schemes

• Back-to-back loan schemes

• Shell entities

• Appraisal Fraud

• Monetary Instruments

• Mortgage schemes

• Obscure identity of owner

But, documentation!


ATMs

• Inexpensive

• Privately owned

• Easy to load

• ATM debits the cardholder and credits the ATM owner

• No requirement to check background of ATM owner

• No mandatory reporting procedures

• No rules for maintaining ATM sales records


Pre-paid Items

• Goods and services that are paid for in advance

• Open, closed, semi-open

• Gift cards

• Pre-paid debit cards

• Payroll cards

• Prepaid mobile phones

• Mass transit cards

• Gaming and lottery cards

Mobile Banking

• Using an account associated with a mobile account (e.g., Samsung Pay)

• Incomplete regulations

• Transactions overseas hard to trace

• Can move funds anywhere

• Phone account owner can be anonymous


Digital Currencies and Virtual Assets

Online payment service which accepts funds in a variety of ways to transfer funds to and from individuals/businesses

Exist and are traded in a digital format

• Growing in number

• Loosely regulated

• Most transactions considered final

• International person-to-person

• No required customer identification, just the random “address”

• Poor record-keeping

• Unlimited volume

• Transactions almost instant

• Liberty Reserve (May 2013): $6 billion in laundered assets since 2000


Another example:

Banks and MSBs (Money Services Businesses)

• Banks:

• Employee collusion

• Ineffective policies and controls

• In some jurisdictions, may be a front

• MSBs:

• Currency exchangers

• Check Cashers

• Issuers or redeemers of money orders, etc

• Money transmitters

• Prepaid access providers and sellers

• Loose regulations

• Lax ID requirements

Casinos!!!

• High volume and cash intensive

• Provides a broad range of financial services

• Chips for laundering:

• Hold for a while, then cash in

• Use chips as cash to purchase drugs

• Use chips to gamble, generate legit winnings


Shell Companies

• Hide ownership

• Mask financial details

• Conceal assets


Charities and Non-profits: gifts to disguise illicit assets

Data Analytics: What does it actually mean?

• Data Analytics - A process of inspecting, cleaning, transforming, and modelling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making.

• Data Mining is a particular data analysis technique that focuses on modelling and knowledge discovery for predictive rather than purely descriptive purposes.

• Business Intelligence - Covers data analysis that relies heavily on aggregation (summarization), focusing on business information.

• Predictive Analytics focuses on application of statistical or structural models for predictive forecasting or classification.

• Text Analytics - Applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a species of unstructured data.

Application and Relevance to AML Prevention and Detection

Data analysis technology enables auditors and fraud examiners to:

• Analyze business data to test the effectiveness of internal controls

• Identify transactions that may indicate money laundering activity

Data analysis also provides an effective way to fight money laundering.

• Provides the ability to test 100% of the records or any size sample

• Enables investigations professionals to focus investigation actions on those transactions that are suspicious or are identified as areas of weaknesses

Data analysis technology also enables organizations to:

• Conduct ad hoc analysis based on a report of wrongdoing or to perform repeatable automated procedures, for continuous auditing and monitoring

• Provide insight into the integrity of financial and business operations through transactional analysis.

Improves the ability to better assess and manage AML risk.

Basic Data Analytical Process: Planning to Post-Analysis Reporting

The standard process of analytics consists of the following phases / processes:

• Determine Scope & Requirements

• Understand the purpose of the analysis, define the objectives & profile potential fraud schemes

• Identify & Extract Information Sources

• Identify the relevant data sets, extract the data in a usable format (consider the legal ramifications of data integrity if conducting an investigation, and consider computer forensics) & verify the data as accurate after extraction

• Data Extraction & Preparation

• Cleanse the data sets, de-duplicate the data if relevant, reformat as required (maintaining records of all changes made to the data set), and transform the data into the required sets for import into test tools.

• Testing & Interpretation

• Determine testing regime to be applied, conduct standard and unique tests as per the objectives, record results (Note: Most commercial tools generate an audit log for this purpose)

• Interpret the results of the data tests and identify potential issues, weaknesses or suspicious transactions and activities

• Post Analysis Phase

• Prepare data testing report (if a consulting engagement, investigation, or audit)

• Determine required preventative tests to be applied on a schedule for fraud and corruption prevention activities as part of an overall data analysis program

ACFE Data Analysis Process

Planning Phase

• Understand the data

• Define examination objectives

• Build a profile of potential frauds

• Determine whether predication exists

Preparation Phase

• Identify relevant data

• Obtain, verify, cleanse, and normalize the data

Testing and Interpretation

• Analyze the data!

Post-Analysis Phase

• Response to findings

• Monitor the data

Data Sources and Extractions

• Big Data: data of high volume, high velocity, and high variety that requires new and different forms of processing to enable enhanced decision making, insight discovery, and process optimization

• Continuous information source

• Qualitative Data

• PDF documents

• Twitter/Blog feeds

• Audio and Video files

• Emails, texts, corporate minutes

• Mainframe and laptop software and logs

• GPS data

• Phone call meta-data

• Receipts

• Interview recordings/logs

Planning Phase

• Understand the Data

• Availability

• Structure

• Dictionary

• Links

• Define Examination Objectives and Scope

• Purpose of exam and structure/size/resources/thresholds/limits

• Build a Profile of Potential Frauds

• Also profile potential NON-FRAUD instances

• Determine Whether Predication Exists: BASIS OF FRAUD EXAMINATION!

• Totality of circumstances would lead a professionally trained and reasonable person to conclude that fraud might be occurring

Preparation Phase

• Identify relevant data

• Obtain the data

• Verify the data:

• Control totals

• Correct periods

• Gaps/missing fields

• Reasonableness tests

• Cleanse/Normalize Data

Testing and Interpretation Phase

• Analyze the data

• Geo-location

• Business unit

• Time period

• Dollar value

• By Unique Identifier(s)

• Issues:

• The Role of Concealment

• Addressing False Positives

• Data validity/integrity issues?

• Data merging difficulties

• Legitimate data that falls outside the norm

Post-Analysis Phase

• Respond to Analysis Findings

• Monitor the Data

• Spectrum of Analysis:

• Ad-hoc testing

• Repetitive testing

• Continuous testing

Data Mining

• Searches and explores data for previously undiscovered instances

• Can be used preventatively and for detection

• Pattern analysis, trend ratios, matches, hidden connections

• Employee/vendor

• Duplicate bank accounts

• Abnormal transaction days/times/amounts

• Round numbers

• Missing numbers

• Benford’s Analysis

Five Advantages of Using Data Analysis Software

1) allows examiner to centralize an investigation

2) assures completion and accuracy

3) bases predictions about the probability of a fraudulent situation on reliable statistical information

4) allows searches of entire data files for red flags of possible fraud

5) assists in the development of reference files for ongoing fraud detection and investigation work

Core Data Analysis Functions in Software Packages

• Sorting

• Record Selection

• Joining Files

• Multi-file processing

• Correlation Analysis

• Trend Analysis

• Time Series

• Verifying multiples of a number

• Compliance verification

• Duplicate searches

• Expressions and Equations

• Graphing

• Filter and Display criteria

• Fuzzy logic matching

• Gap tests

• Pivot tables

• Regression Analysis

• Sort and index

• Statistical analysis

• Stratification

• Date functions

• Benford’s Law analysis

Sorting

Arrange the data in a meaningful order for analysis

Record Selection

• Select specific records for analysis

• For example: NYC Office of the Mayor employees with OT pay in 2016

Joining Files

• Connects fields from two sorted input files into a third file.

• Frequently used to match invoice data with A/R files, using common identifier

Multi-file processing

• Allows the user to relate several files by defining their relationship without the use of join. For example, relate an outstanding invoice master file to A/R file using an account number. Can relate invoice numbers as well.

Correlation Analysis

• Relationships in raw data

• Examine correlations in data for deviations from expected relationships

• Pair-wise relationship between two sets of data; each x has a unique y

• The strength of this relationship is measured by the correlation coefficient

• In Excel, the CORREL(array1,array2) function returns this coefficient

• IDEA example
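For readers without Excel or IDEA at hand, the correlation coefficient described above can be computed directly. The sketch below is a plain Pearson correlation over two paired series; the function name and the sample figures are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two paired series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly figures: recorded sales vs. bank deposits.
sales = [100, 120, 140, 160, 180]
deposits = [98, 121, 139, 161, 90]
print(round(pearson(sales, deposits), 3))
```

A coefficient well below what the business relationship would suggest (here, sales against deposits) is a prompt to look at the periods that break the pattern.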

Trend Analysis

• Calculates the values of data over time and forecasts values into the future based on the assumption that the expected behavior will continue

• Beneficial for fraud examiners to benchmark future behaviors of accounts, persons, transactions types

• Seasonal data should be examined with Time Series Analysis

• Based on linear regression using the method of least squares

• Quantifies the trend of the data – which department shows a supplies expense that exceeds past trends?

• IDEA Trend Analysis demo
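The least-squares trend described above can be sketched in a few lines. This is a minimal illustration, not the IDEA implementation: periods are assumed to be evenly spaced and numbered 0, 1, 2, ..., and the 10% tolerance and function names are hypothetical.

```python
def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two periods to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def exceeds_trend(history, actual, tolerance=0.10):
    """Compare the next period's actual value with the trend forecast.

    Returns (flag, forecast); flag is True when actual is more than
    `tolerance` above the forecast.
    """
    slope, intercept = fit_trend(history)
    forecast = slope * len(history) + intercept
    return actual > forecast * (1 + tolerance), forecast

# Hypothetical supplies expense for one department over four periods.
history = [100, 110, 120, 130]
print(exceeds_trend(history, 200))
```

As the slide notes, seasonal data would need a time series method instead of a straight line.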

Time Series Analysis

• Calculates the trend of data over time with a seasonal component

• Decomposition Method of Time Series Analysis is the most useful for FINANCIAL data

• Testing based on seasonality – higher values at year end?

• IDEA example

Verifying Multiples of a Number

• Are numbers consistent with the regular or expected rate? Or, are transactions under or above the limit? Or, do they lie just below the limit?

• IDEA limits tests/IDEA stratifications

Compliance Verification

• Are controls/rules being observed?

• Feature/attribute/limits tests

• Can be scripted as apps (Excel, IDEA, ACL) to run upon command or automatically as part of a monitoring system

Duplicate Searches

• Identify duplicate values in specified fields

• Single file or joined files

• Addresses, identifiers, days, amount
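A duplicate search over chosen fields is straightforward to sketch outside IDEA or ACL. The example below groups records on a set of key fields and keeps groups with more than one member; it also includes the same-same-different variant that appears later in the deck. The field names and sample invoices are hypothetical.

```python
from collections import defaultdict

def find_duplicates(records, keys):
    """Group records sharing the same values in all `keys`; keep groups of two or more."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in keys)].append(rec)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}

def same_same_different(records, same_keys, different_key):
    """Groups that match on `same_keys` but carry more than one value in `different_key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in same_keys)].append(rec)
    return {
        key: recs
        for key, recs in groups.items()
        if len({rec[different_key] for rec in recs}) > 1
    }

records = [
    {"VENDOR": "V100", "INVOICE": "A-17", "AMOUNT": 1200.00},
    {"VENDOR": "V100", "INVOICE": "A-17", "AMOUNT": 1200.00},
    {"VENDOR": "V100", "INVOICE": "A-18", "AMOUNT": 1200.00},
    {"VENDOR": "V200", "INVOICE": "B-02", "AMOUNT": 450.00},
]
print(list(find_duplicates(records, ["VENDOR", "INVOICE", "AMOUNT"])))
print(list(same_same_different(records, ["VENDOR", "AMOUNT"], "INVOICE")))
```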

Expressions and Equations

• Fraud examiners can build expressions and equations based on their knowledge and expectations of the data

• Also used with compliance testing

Filter and Display Criteria

• Fraud examiners can create filters or queries based on specific user-defined criteria that results in only those records being displayed

• Can be deployed when loading the data or later as an extraction or criterion for another test

• In IDEA: Equation Editor box

Fuzzy Logic Matching

• Matching very similar attributes that might escape normal matching algorithms

• For example: First Street, First St, and 1st Street

• Very useful when the perpetrator has taken steps to mask their actions

• May produce an increased number of false positives
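The address example above (First Street, First St, 1st Street) can be illustrated with Python's standard library. This is a rough sketch, not how IDEA's fuzzy matching works internally: the abbreviation table is a small hypothetical sample, and `difflib.SequenceMatcher` stands in for whatever similarity measure a commercial tool uses.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real one would be far longer and
# would need care (for example, "St" can also mean "Saint").
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road", "1st": "first"}

def normalize(address):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

for candidate in ["First St", "1st Street", "Main Avenue"]:
    print(candidate, round(similarity("First Street", candidate), 2))
```

Lowering the similarity cut-off catches more disguised matches but, as noted above, also raises the number of false positives.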

Gap Tests

• Identifies items missing in expected sequences or series (check and invoice numbers)

• Finds sequences where none are expected to exist (employee government ID numbers, SSNs)
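Both directions of the gap test can be sketched briefly: one function lists missing ranges in a numbered series, the other finds runs of consecutive values where none would be expected. The function names, the minimum run length of three, and the sample numbers are assumptions for illustration.

```python
def find_gaps(numbers):
    """Return (first_missing, last_missing) ranges in a sequence of integers."""
    present = sorted(set(numbers))
    gaps = []
    for prev, nxt in zip(present, present[1:]):
        if nxt - prev > 1:
            gaps.append((prev + 1, nxt - 1))
    return gaps

def find_consecutive_runs(numbers, min_length=3):
    """Return runs of consecutive integers at least `min_length` long."""
    present = sorted(set(numbers))
    runs = []
    run = present[:1]
    for n in present[1:]:
        if n == run[-1] + 1:
            run.append(n)
        else:
            if len(run) >= min_length:
                runs.append(run)
            run = [n]
    if len(run) >= min_length:
        runs.append(run)
    return runs

check_numbers = [1001, 1002, 1005, 1006, 1010]
print(find_gaps(check_numbers))

id_numbers = [5, 17, 18, 19, 42]  # stand-ins for identifiers that should not be sequential
print(find_consecutive_runs(id_numbers))
```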

Pivot Tables

• Interactive data summarization tool found in Excel and also in IDEA

• It is used to sort, count, total, or give the average of specified data

• Assists in providing the “big picture”

Regression Analysis

• Statistical method that uses a series of records to create a model relationship between a dependent variable and one or more independent variables

• Ex: Regression could be used to determine the number of widgets manufactured based on materials and labor numbers

• Periods where sales of widgets are higher or lower than expected would require analysis

Sort and Index

• Arranges the data in a manner that assists analysis – ascending, descending

• Depending on the field type, could be alphabetically or numerically

Statistical Analysis/Descriptive Statistics

• Calculating statistics such as averages, mins and maxes and absolute values

• IDEA Field Statistics

Stratification

• Breaks the data down into intervals or strata

• Very useful for limits testing!
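A stratification can be sketched as counting and totaling amounts per band. The band boundaries below are hypothetical and chosen so that one band sits just under a 10,000 limit, which is where a limits test would look for a concentration.

```python
from bisect import bisect_right

def stratify(amounts, boundaries):
    """Count and total amounts per band.

    Each band covers low <= amount < high; None marks an open-ended side.
    """
    edges = sorted(boundaries)
    counts = [0] * (len(edges) + 1)
    totals = [0.0] * (len(edges) + 1)
    for amount in amounts:
        band = bisect_right(edges, amount)  # index of the band containing amount
        counts[band] += 1
        totals[band] += amount
    lows = [None] + edges
    highs = edges + [None]
    return [
        {"from": low, "to": high, "count": count, "total": total}
        for low, high, count, total in zip(lows, highs, counts, totals)
    ]

deposits = [2_500, 9_000, 9_500, 9_900, 9_990, 10_000, 15_000]
strata = stratify(deposits, [5_000, 9_000, 10_000])
for band in strata:
    print(band)
```

In this made-up sample, four of seven deposits fall in the 9,000 to 10,000 band, the kind of clustering just below a limit that warrants a closer look.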

Date Functions

Aging analysis:

Benford’s Test

• Founded on counterintuitive observation that individual digits of multidigit numbers are not random, but follow a pattern

• Describes expected frequencies of digits in numbers

• UNTAMPERED NATURALLY OCCURRING NUMBERS!

• Posits that distribution of first digits is positively skewed, or more heavily weighted toward smaller numbers

• Number series must follow a geometric sequence

• Each successive number calculated as a fixed percentage increase over previous number

• Applications

• Net income

• Earnings per share

• Income tax

• Fraud detection

Benford’s Test: Expected Digital Frequencies

First Digit Test

• Compares the first-digit profile of a data set to Benford’s first digit profile
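Benford's expected first-digit proportions follow the formula P(d) = log10(1 + 1/d), so digit 1 is expected about 30.1% of the time and digit 9 about 4.6%. The sketch below computes the expected profile and compares it with the observed profile of a data set; the function names are illustrative, and judging whether a deviation is significant is left to the examiner.

```python
from collections import Counter
from math import log10

def benford_expected():
    """Expected first-digit proportions: P(d) = log10(1 + 1/d) for d = 1..9."""
    return {d: log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(amount):
    """Leading significant digit of a positive number."""
    # Scientific notation puts the leading digit first, e.g. 0.0042 -> '4.2...e-03'.
    return int(f"{amount:.15e}"[0])

def first_digit_profile(amounts):
    """Observed first-digit proportions, ignoring zero and negative amounts."""
    digits = [first_digit(a) for a in amounts if a > 0]
    if not digits:
        raise ValueError("no positive amounts to profile")
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

def deviations(amounts):
    """Observed minus expected proportion for each first digit."""
    expected = benford_expected()
    observed = first_digit_profile(amounts)
    return {d: observed[d] - expected[d] for d in range(1, 10)}

for digit, proportion in benford_expected().items():
    print(digit, f"{proportion:.3f}")
```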

First-Two-Digits Test

• Compares the first two digits of a data set with Benford’s profile for the first two digits (purchases at $300 threshold)

Last-Two-Digits Test

• Compares the last two digits of a data set with the expected profile for the last two digits, which is approximately uniform (each ending from 00 to 99 about 1% of the time)

Points regarding Benford’s Tests

• Data should describe the sizes of similar events ($ of purchases)

• No built in Mins or Maxes in the data

• Only positive data should be analyzed

• Numbers should occur naturally, not be assigned

• Smaller amounts occurring more frequently

• Handy for identifying shell company schemes (AML or non-AML)

• Fictitious sales/checks/transactions

• Bid-splitting and other schemes involving limits (identifies concentrations)

Graphics

• Invaluable for exploratory and explanatory analysis

• Graphics are more exciting than spread sheets!

Examples of AML data analysis queries

• Benford’s Law tests can be used to highlight abnormal duplications. These duplications may be the result of making up expense numbers to offset illicit funds recorded in revenue to avoid paying tax on the excess revenues.

• The duplications may also be in the made-up revenue recorded in the sales register.

• The first two digits test, the last two digits test, and the numbers duplication test of Benford’s Law can be utilized.

• The relative size factor test can flag transactions in sales or expenses that are out of line for each customer or vendor.

• The same-same-same test and the same-same-different test can output specific duplications within selected fields, and those duplications with a selected difference field.

• The even amounts test/round numbers: payments paid in exactly even thousands or hundreds of thousands.
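The relative size factor (RSF) test mentioned above divides each vendor's (or customer's) largest amount by its second-largest amount; a high ratio marks a transaction that is out of line for that party. A minimal sketch follows, with made-up vendors and amounts.

```python
from collections import defaultdict

def relative_size_factors(transactions):
    """RSF per party: largest amount divided by second-largest amount.

    transactions: iterable of (party, amount); only positive amounts are used,
    and parties with fewer than two such amounts are skipped.
    """
    by_party = defaultdict(list)
    for party, amount in transactions:
        if amount > 0:
            by_party[party].append(amount)

    factors = {}
    for party, amounts in by_party.items():
        if len(amounts) < 2:
            continue
        largest, second = sorted(amounts, reverse=True)[:2]
        factors[party] = largest / second
    return factors

transactions = [
    ("V1", 100), ("V1", 110), ("V1", 5000),
    ("V2", 200), ("V2", 210),
    ("V3", 50),
]
rsf = relative_size_factors(transactions)
for vendor, factor in sorted(rsf.items(), key=lambda item: item[1], reverse=True):
    print(vendor, round(factor, 2))
```

What counts as a high factor depends on the data; sorting descending and reviewing the top of the list is a common starting point.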

Examples of AML data analysis queries

• Extract and review cash transactions from the payment register.

• Extract from sales or accounts receivable files high amounts paid with cash.

• Compare bank deposits with sales by joining electronic bank statement records with accounts receivable credits.

• Summarize sales from source categories for each year, join, and chart to determine unusual increases in revenue.

• Extract from the asset register significant additions and disposals and review.

• Test if transactions were at fair market value.

• Extract from the asset register items that are not normally associated with the nature of the business, such as works of art, precious metals, and so on.

• Extract from the liabilities loan accounts and review for unusual arrangements.

• Extract high-interest payments made and review.

Examples of AML data analysis queries

• Extract related-party transactions from purchase and sales.

• Extract from the customer master file new additions and join to sales and summarize

• Extract from the vendor master file new additions and join to purchases and summarize

• Summarize sales by unit item. Summarize costs of goods sold by unit item and join to the summarized sales file. Calculate the gross margin and extract those with unusually high margins.

• Summarize sales by unit item and by customer and extract those customers who were charged significantly more than normal. The Z-score test would be appropriate here.

• Extract transactions with offshore entities.

• Create a list or file containing countries that are considered high risk for money laundering and extract transactions with those countries.
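The Z-score test suggested above for customers charged significantly more than normal can be sketched as follows. The customer names and prices are invented, the cut-off of 2.0 is an assumption, and with very small groups a single outlier inflates the standard deviation, so the cut-off should be chosen with the group size in mind.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Z-score of each value against the group's mean and population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]

def flag_outliers(price_by_customer, cutoff=2.0):
    """Customers whose average unit price is more than `cutoff` deviations above the mean."""
    names = list(price_by_customer)
    scores = z_scores([price_by_customer[name] for name in names])
    return {name: round(z, 2) for name, z in zip(names, scores) if z > cutoff}

# Hypothetical average unit price charged to each customer for one item.
prices = {
    "C1": 10.0, "C2": 10.2, "C3": 9.9, "C4": 10.1,
    "C5": 10.0, "C6": 9.8, "C7": 10.0, "C8": 18.0,
}
print(flag_outliers(prices))
```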

Cash manipulation and AML

• Misappropriation of incoming cash and cash equivalents

• Check washing: using chemicals to erase data from checks such as the payee name, the date, and the check amount

For AML:

• Altering amount of check received

• Altering amount of expenses to offset enhanced revenue

Cash Manipulation

• Case study: “Sample – Detailed Sales” and “Sample – Detailed Previous Year” (2015 and 2014, respectively) in the sample project

• Append an 11th field called MONTH: @Month(INV_DATE), to isolate month in each dataset

• Summarize Sales Representative field and total on the sales before taxes

• Set view vertically to display files side-by-side

• Combine both files using JOIN feature, create a new joined file called “2014_2015 by sales rep”

• Visualize your results. Visualize the number of sales records per sales rep for the two years.

• Pivot Table: for 2015, create pivot tables by sales rep, on month and then on month/custno.

Customer and Billing Schemes

• Submission of a false or an altered invoice

• Payables fraud (to shelter income):

• Fake vendors (collusion required for goods, less so for services)

• Altering and/or double paying non-complicit vendor’s statements

• Making personal purchases with company funds (i.e., procurement cards, etc.)

• Dummy or shell companies:

• Post office box

• No phone number

• Duplication of employee data: addresses, names, phone numbers, bank accounts

Billing Schemes

• Case Study: using OK data set

• Please extract 10,000 records from the dataset

• Create a NEW PROJECT in IDEA: OK Vendor Payments_2015

• Upload your 10,000 row Excel file into this project, please name the dataset “State Vendor Payments”

• Open the file and perform field statistics

• Payment amount and transaction type are most interesting

• B—The voucher type for all the records is JRNL with PAYMENT_AMOUNT as zero; it seems that these are journal entries

• C—Contains both positive and negative amounts in the PAYMENT_AMOUNT field

• H—Contains negative amounts and are noted as Regular Voucher

• P—Paid amounts

• R—Refunds

• W—Negative amounts

Billing Schemes

• Case Study, cont.:

• Extract all records that are paid to new file name: “Payments trans type P”, by using the equation: TRANSACTION_TYPE=“P”

• Using this new paid file, perform Benford’s Tests (pg 138) on the payment amounts.

• Benford’s First Digit test

• Create a 3D bar chart of the first two digit tests

• Create a 3D bar chart of the last two digit tests

• Using the same paid dataset, extract a new file called “Even Thousand Amounts” using this equation (pg 141):

(PYMNT_AMT % 1000) = 0 .AND. PYMNT_AMT <> 0
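For comparison, the same even-thousands extraction can be sketched in Python. This mirrors the logic of the IDEA equation above (remainder of zero when divided by 1000, excluding zero amounts); the function name and sample payments are hypothetical, and `Decimal` is used only to avoid floating-point surprises in the remainder.

```python
from decimal import Decimal

def even_thousands(payments):
    """Payments that are non-zero exact multiples of 1,000."""
    return [
        payment
        for payment in payments
        if payment != 0 and Decimal(str(payment)) % 1000 == 0
    ]

payments = [5000, 12500.50, 0, 3000.00, 999.99, -2000]
print(even_thousands(payments))
```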

Check-Tampering Schemes

• The sheer volume of business payments still made by check today will maintain this as the preferred method of payment.

• The traditional check-tampering fraud schemes will continue to exist as long as check payments exist.

• Electronic-payment systems open the door to new types of fraud that must be guarded against.

• Many organizations use both traditional checks and electronic transfer payments.

• It is not unusual that an organization would use electronic direct deposits for their payroll and checks as payments for everything else.

• It is also not unusual for a business to use a hybrid system for receiving payments.

Check Tampering: The Payee

• Checks can be made out in favor of the fraudster, an accomplice, shell company, or even cash. They can also be made out to legitimate vendors to pay for personal items. Checks made payable to the fraudster, while easy to cash, are also easier to detect.

• If checks are already prepared, the payee name can be altered and replaced with the fraudster’s name. Amounts can be also changed. Modification of the existing name by adding additional letters to the end of the payee line or setting up shell companies with similar names of legitimate vendors facilitate the conversion of checks to be cashed.

• If the fraudster has access to the payments system in updating or changing vendor names, this can be done just prior to a check being issued and then changed back afterward.

• Addresses may also be changed at the same time to divert the check to the fraudster or an accomplice.

• If the check is made out to a third party, then the fraudster would have to forge the endorsement also. Having matching identification may be an issue for the fraudster.

Case Study: Check Tampering

Open the Samples Project and use the Payments file for these tests.

1) Normalize the inconsistent data in the AUTH field by APPENDING another field – name this field AUTHORIZED with the equation of @Strip(AUTH).

2) Next, test for separation of duties. Perform a direct extraction, creating a new dataset called SEP_OF_DUTIES, with the equation:

AUTHORIZED==POSTED_BY

3) Last, test for missing entries by creating a new dataset called BLANK_ENTRIES, with the equation:

AUTHORIZED == " " .OR. POSTED_BY == " "
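The three steps above can be approximated outside IDEA. In the sketch below, `strip_field` is only a rough stand-in for the normalization step (it drops whitespace and punctuation and, as an added assumption, uppercases the value); it is not claimed to match @Strip exactly. Unlike the equation in step 2, blank values are excluded from the separation-of-duties result and reported only by the blank-entry test. The record layout is hypothetical.

```python
import string

_DROP = set(string.whitespace + string.punctuation)

def strip_field(value):
    """Approximate normalization: remove whitespace and punctuation, uppercase."""
    return "".join(ch for ch in (value or "") if ch not in _DROP).upper()

def sep_of_duties(payments):
    """Payments where the same (non-blank) person authorized and posted."""
    return [
        p for p in payments
        if strip_field(p["AUTH"]) and strip_field(p["AUTH"]) == strip_field(p["POSTED_BY"])
    ]

def blank_entries(payments):
    """Payments missing an authorizer or a poster."""
    return [
        p for p in payments
        if not strip_field(p["AUTH"]) or not strip_field(p["POSTED_BY"])
    ]

payments = [
    {"ID": 1, "AUTH": "J. Smith", "POSTED_BY": "JSMITH"},
    {"ID": 2, "AUTH": "A Jones", "POSTED_BY": "BLEE"},
    {"ID": 3, "AUTH": " ", "POSTED_BY": "BLEE"},
]
print([p["ID"] for p in sep_of_duties(payments)])
print([p["ID"] for p in blank_entries(payments)])
```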

Payroll Fraud and AML

• Ghost Employees

• Falsified or Excess Overtime

• Fraud related to commissioned earnings

Case Study: Payroll Fraud (test for employees with the same address)

Test for payments made after termination dates
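The payments-after-termination test amounts to joining the payroll register to the employee master file and comparing dates. A minimal sketch, assuming hypothetical field names (EMP_NO, TERM_DATE, PAY_DATE, NET_PAY) and made-up records:

```python
from datetime import date

def paid_after_termination(employees, payments):
    """Payroll payments dated after the employee's termination date."""
    termination = {
        e["EMP_NO"]: e["TERM_DATE"] for e in employees if e["TERM_DATE"] is not None
    }
    return [
        p for p in payments
        if p["EMP_NO"] in termination and p["PAY_DATE"] > termination[p["EMP_NO"]]
    ]

employees = [
    {"EMP_NO": 101, "TERM_DATE": date(2016, 6, 30)},
    {"EMP_NO": 102, "TERM_DATE": None},  # still employed
]
payments = [
    {"EMP_NO": 101, "PAY_DATE": date(2016, 6, 15), "NET_PAY": 2100.00},
    {"EMP_NO": 101, "PAY_DATE": date(2016, 8, 15), "NET_PAY": 2100.00},
    {"EMP_NO": 102, "PAY_DATE": date(2016, 8, 15), "NET_PAY": 1900.00},
]
for hit in paid_after_termination(employees, payments):
    print(hit)
```

A final paycheck or severance can legitimately fall after the termination date, so hits need review rather than automatic escalation.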

Expense Reimbursement Schemes and AML

Travel and Entertainment expenses

Procurement Cards

• Personal items

• Expenses that never materialized or were subsequently canceled

• Fake or Altered Receipts

Improper Expense claims include:

• Overstated Expense Reimbursements

• Mischaracterized Expense Reimbursements

• Multiple Reimbursements

• Fictitious Expense Reimbursements

Fraudulent Expense reimbursements:

Falsified Travel Expenses and AML

Load the provided excel travel expenses file into IDEA as a managed project called Employee Travel Expenses_Ch 11

Days Traveled Test: Create a field called DATE_DIFF with the equation:

@Age(END_DATE, START_DATE)

Same Day Traveled with Accommodation Charges:

DATE_DIFF == 0 .AND. ACCOMMODATION > 0

Same Day Traveled with Flight Charges:

DATE_DIFF == 0 .AND. AIR_FARE > 0

Same Day Traveled with Both Flight and Accommodation Charges:

DATE_DIFF == 0 .AND. AIR_FARE > 0 .AND. ACCOMMODATION > 0

Travel Expenses: Traveled with Flight but No Accommodation Charges:

DATE_DIFF > 0 .AND. AIR_FARE > 0 .AND. ACCOMMODATION == 0
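The travel expense tests above translate directly into a few date and amount comparisons. The sketch below reuses the field names from the IDEA equations (START_DATE, END_DATE, AIR_FARE, ACCOMMODATION); the function name, flag wording, and sample claims are hypothetical.

```python
from datetime import date

def classify_claim(claim):
    """Return the list of travel-expense red flags raised by one claim."""
    days = (claim["END_DATE"] - claim["START_DATE"]).days  # plays the role of DATE_DIFF
    flags = []
    if days == 0 and claim["ACCOMMODATION"] > 0:
        flags.append("same-day trip with accommodation charges")
    if days == 0 and claim["AIR_FARE"] > 0:
        flags.append("same-day trip with flight charges")
    if days > 0 and claim["AIR_FARE"] > 0 and claim["ACCOMMODATION"] == 0:
        flags.append("flight but no accommodation charges")
    return flags

claims = [
    {"EMPLOYEE_NO": 7, "START_DATE": date(2015, 5, 4), "END_DATE": date(2015, 5, 4),
     "AIR_FARE": 0.0, "ACCOMMODATION": 180.0},
    {"EMPLOYEE_NO": 8, "START_DATE": date(2015, 5, 4), "END_DATE": date(2015, 5, 7),
     "AIR_FARE": 400.0, "ACCOMMODATION": 0.0},
    {"EMPLOYEE_NO": 9, "START_DATE": date(2015, 5, 4), "END_DATE": date(2015, 5, 6),
     "AIR_FARE": 400.0, "ACCOMMODATION": 300.0},
]
for claim in claims:
    print(claim["EMPLOYEE_NO"], classify_claim(claim))
```

Each flag has innocent explanations (staying with family, a day trip by air), so the output is a review list, not a finding.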

SAME-SAME-SAME (Duplicate Tests)

Using the main Travel Expenses data set, test for duplicates on START_DATE and for EMPLOYEE_NO.

Using the main Travel Expenses data set, test for duplicates on EMPLOYEE_NO and AIR_FARE, where AIR_FARE is greater than 0.

Likewise, test for duplicates on EMPLOYEE_NO and ACCOMMODATION, where ACCOMMODATION > 0.

Extraction based on Audit Unit: Please extract the following, creating a new data set called Ass.Dep.Min.:

POSITION == "Assistant Deputy Minister". Please display the result in a graphic format that you feel is most appropriate.

Types of Non-Cash Misappropriations Misuse and Abuse

Unconcealed Misappropriations

In plain view

Suspicion only

Poor management/employee relationships

No whistleblowing process

Perpetrator holds a management position

Lack of desire to get involved

Transfer of Assets

Proprietary Information

Concealment of Non-Cash Misappropriations for AML

• Falsifying Sales or Purchase records

• Remove sales documents from files before shipping

• Overstate COGS and ship to an accomplice for billing adjustments

• Charging small sales to a customer with a large A/R balance

• Charging a larger sale into smaller chunks spread across several customers

• Discount/write-off the false sale to bad debt

Create false sales and shipping docs

Inventory is increased

Fake sale occurs

Write-off subsequent A/R if needed

Example: Round Dollar Payments

Data Analytics: Data Files

Ensure data validity

Consider data format and structure

Compare the cost and benefit of potential analysis

Consider the spectrum of distinct levels of aggregation at which fraud monitoring is required

BEGIN WITH THE END IN MIND (will this algorithm hold up under intense scrutiny of a court case???)

Other Analytics for AML Schemes

• Social media/web-scraping

• Summarize per employee/vendor for links

• Analyze all bid/purchase data for reasonableness

• Match bid data to originals

• Look at successful bid trends

• Run duplicate tests for addresses, etc.

Concealments: Look at fields such as “Consulting fees”, and “legal fees”

Look for personal relationships, family connections –more qualitative examination/investigation

CONCLUSION: Controls, right-to-audit, identify red-flag transactions

Textual Analytics

• Social media Posts

• Instant Messages

• Videos

• Voice/audio files

• User Documents

• Mobile software apps

• News feeds

• Sales and marketing materials

• Presentations!

Enhanced Text Mining:

• Weighted fraud indicators

• Emotive tone

• Unethical behavior

• Entity Extraction

• Text link Analysis

• Social Network Analysis

• Fraud Triangle Analytics

Suggested Fraud Keywords for AML

Pressure: deadline, quota, trouble, short, excessive, overage, problem, alert, concern, limits

Opportunity: override, inflate, revenue inflation, expense padding, adjust, reserves, new vendors, consulting fee, legal expense, incentive payment, donation, goodwill payment

Rationalization: reasonable, deserve, temporary
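A keyword scan along the fraud triangle can be sketched with the word lists above. This is a deliberately naive illustration: it matches whole words and phrases exactly (so "inflate" will not match "inflated"), it does no weighting or tone analysis, and the function name and sample sentence are made up.

```python
import re

# Keyword lists taken from the slide, grouped by fraud triangle leg.
KEYWORDS = {
    "pressure": ["deadline", "quota", "trouble", "short", "excessive", "overage",
                 "problem", "alert", "concern", "limits"],
    "opportunity": ["override", "inflate", "revenue inflation", "expense padding",
                    "adjust", "reserves", "new vendors", "consulting fee",
                    "legal expense", "incentive payment", "donation",
                    "goodwill payment"],
    "rationalization": ["reasonable", "deserve", "temporary"],
}

def score_text(text):
    """Return the keywords found in `text`, grouped by fraud triangle category."""
    lowered = text.lower()
    hits = {}
    for category, words in KEYWORDS.items():
        found = [
            word for word in words
            if re.search(r"\b" + re.escape(word) + r"\b", lowered)
        ]
        if found:
            hits[category] = found
    return hits

message = "We need to adjust reserves before the deadline; it is only temporary."
print(score_text(message))
```

A message that hits all three categories is a candidate for closer reading, not evidence on its own.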

Additional Visual Analytics: Tree Maps

Graphical representation of the data by rectangular spaces, size, and colors/intensities

Additional Visual Analytics: Link Analysis with SAS Visual Investigator

• Creates visual representations of links between people, social networks, and indirect relationships

• Beneficial for AML: can track the placement, layering, and integration of money as it moves around unexpected sources

• Can also: associate communications, uncover indirect relationships, show discreet connections, and demonstrate complex networks

Geo-Spatial Analysis

• Displays geo-locational data along with other attributes, can reveal intersections and clusters of behavior

Clustering (with WEKA)

As defined by Sharma & Panigrahi (2013), data mining “is known as gaining insights and identifying interesting patterns from the data stored in large databases in such a way that the patterns and insights are statistically reliable, previously unknown, and actionable [3].”

• Cluster analysis as a data mining technique helps find similar objects in data.

• Kaufman & Rousseeuw (2009) have defined cluster analysis as “the art of finding groups in data.”

“Birds of a Feather Flock Together”

Legacy costs and budget maneuvers explain 58.42% of the point variability...

Artificial Intelligence and AML

• Artificial intelligence (AI) allows IT systems to imitate the cognitive abilities of humans: “problem solving”, “reasoning”, “planning” and “learning”

• AI enabled systems possess inbuilt intelligence to sift through, aggregate, blend, and identify patterns and relationships that are buried within mountains of data - a large number and types of data sources.

• Customer onboarding

• Link analysis

• Customer segmentation

• Screening

• Risk management

• Transaction monitoring

• Alert investigation, reporting and case management

• HSBC is partnering with Ayasdi, FinCEN has been using its own AI system FAIS

Blockchain and AML

• Bitcoin as digital currency is highly suspect, not on official books

“It essentially provides users with a digital public record of Bitcoin transactions (the digital currency through which these transactions are conducted) that have been executed by a particular entity. It is inherently difficult for hackers to manipulate”

• Blockchain – semi-private? Peers? Impossible to change values without consensus….plus changes are recorded

Evaluating Data Analysis Software

• Data import/export capabilities

• Data visualization

• Suite of tools?

• Tailoring:

• Performance

• Functionality

• Usability

• Support for additions

Possible Data Mining and Analysis Software

• Excel

• IDEA/CaseWare

• ACL

• ActiveData for Excel

• Thomson Reuters

• Tableau

• Python

• R

• WEKA

• Oversight

• SAS

• Oracle

• IBM Watson and IBM Blockchain

Thank You! Questions?
