Slide 1
Continuity Equations: Analytical Monitoring of Business Processes in Continuous Auditing
Michael G. Alles, Alexander Kogan, Miklos A. Vasarhelyi, Jia Wu
12th World Continuous Auditing Symposium, Nov 3-4, 2006
Slide 2
IT-enabled Business Processes (BPs)
• A business organization consists of a variety of business processes.
• A business process is "a set of logically related tasks performed to achieve a defined business outcome" (Davenport and Short, 1990).
• Modern information technology makes it possible to measure and monitor business processes at an unprecedented level of detail (disaggregation) on a real-time basis, but BP control monitoring is currently lacking.
• Continuous auditing (CA) methodology can exploit this IT capability to capture BP data at the source, in disaggregated and unfiltered form, to achieve a more efficient, effective, and timely audit.
Slide 3
Comparison between Conventional Analytical Procedures and CA Analytical Monitoring

Conventional Analytical Procedures
• Focus on financial data.
• Audit data are summarized and aggregated.
• Analytical modeling is based on relationships between financial accounts.
• Ratio analysis, trend analysis, reasonableness tests.

CA Analytical Monitoring
• Focus on business process data.
• Audit data are unfiltered and disaggregated.
• Analytical modeling is based on relationships between business processes.
• Continuity equation models.
Slide 4
Reengineering of Substantive Testing in CA
• Analytical procedures (APs) can be used in the planning, substantive testing, and review stages of an audit. We focus on APs in substantive testing.
• Conventional auditing:
– First, apply analytical procedures to identify potential problems.
– Then, focus detailed transaction testing on the identified problem areas.
• CA – the sequence is reversed:
– First, apply automated general transaction tests to all transactions and screen out identified exceptions for resolution.
– Then, apply automated analytical procedures to the transaction stream to identify unforeseen problems.
– Finally, alarm human auditors to investigate anomalies (targeted transaction tests).
Slide 5
Enterprise System Landscape
[Diagram: enterprise modules (Ordering, Accounts Payable, Materials Management, Sales, Accounts Receivable, Human Resources) feed a Business Data Warehouse. The data-oriented continuous auditing system runs Automatic Transaction Verification, which raises Exception Alarms, and Automatic Analytical Monitoring via Continuity Equations, which raises Anomaly Alarms; alarms go to responsible enterprise personnel.]
Slide 6
Data-oriented CA: Automation of Substantive Testing
• Automation of transaction testing:
– Formalization of BP rules as transaction integrity and validity constraints.
– Verification of transaction integrity and validity → detection of exceptions → generation of alarms.
• Automation of analytical procedures:
– Selection of critical BP metrics and development of stable business flow (continuity) equations.
– Monitoring of continuity equation residuals → detection of anomalies → generation of alarms.
• This presentation focuses on the automation of APs.
Slide 7
Advanced Analytics in CA: BP Modeling Using Continuity Equations
• Continuity equations:
– Statistical models capturing relationships between various business processes rather than financial accounts.
– Can be used as expectation models in the analytical procedures of continuous auditing.
– Originated in the physical sciences (various conservation laws: e.g., mass, momentum, charge).
• Continuity equations are developed using the statistical methodologies of:
– Linear regression modeling (LRM);
– Simultaneous equation modeling (SEM);
– Multivariate time series modeling (MTSM): Vector Autoregressive models (VAR), Subset-VAR, Bayesian VAR (BVAR).
Slide 8
Basic Procurement Cycle
[Diagram: P.O. (t1) → Receive (t2) → Voucher (t3), with lags t2 − t1 and t3 − t2 between the steps.]
Slide 9
Inferred Analytical Model (Subset-VAR) of Procurement

P.O.(t) = 0.24·P.O.(t−4) + 0.25·P.O.(t−14) + 0.56·Receive(t−15) + ε_PO
Receive(t) = 0.26·P.O.(t−4) + 0.21·P.O.(t−6) + 0.60·Voucher(t−10) + ε_R
Voucher(t) = 0.54·Receive(t−1) − 0.17·P.O.(t−9) + 0.22·P.O.(t−17) + 0.24·Receive(t−17) + ε_V
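The three estimated equations can be read directly as one-step expectation functions. The sketch below evaluates them on made-up metric histories; only the coefficients and lags come from the slide.

```python
# One-step expectations from the inferred Subset-VAR model of procurement.
# Coefficients and lags are from the slide; the series values below are
# made-up illustrations (one observation per period, index = time t).

def expected_po(po, receive, t):
    return 0.24 * po[t - 4] + 0.25 * po[t - 14] + 0.56 * receive[t - 15]

def expected_receive(po, voucher, t):
    return 0.26 * po[t - 4] + 0.21 * po[t - 6] + 0.60 * voucher[t - 10]

def expected_voucher(po, receive, t):
    return (0.54 * receive[t - 1] - 0.17 * po[t - 9]
            + 0.22 * po[t - 17] + 0.24 * receive[t - 17])

# Hypothetical flat histories, long enough to cover the deepest lag (17):
po, receive, voucher = [100.0] * 20, [90.0] * 20, [85.0] * 20

residual_po = po[18] - expected_po(po, receive, 18)   # the realized eps_PO term
```

In monitoring, these residuals (the realized ε terms) are what get compared against the acceptable-variance thresholds.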
Slide 10
Steps of Analytical Modeling and Monitoring Using Continuity Equations
• Choose essential business processes to model (purchasing, payments, etc.).
• Define (physical, financial, etc.) metrics to represent each process: e.g., dollar amount of purchase orders, quantity of items received, number of payment vouchers processed.
• Choose the levels of aggregation of metrics:
– by time (hourly, daily, weekly), by business unit, by customer or vendor, by type of product or service, etc.
Slide 11
Steps of Analytical Modeling and Monitoring Using Continuity Equations – II
• Identify and estimate stable statistical relationships between business process metrics – continuity equations (CEs).
• Define acceptable thresholds of variance from the expected relationships.
• If the variances (residuals) exceed the acceptable levels, alarm human auditors to investigate the anomaly (i.e., the relevant sub-population of transactions).
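The last two steps reduce to a residual check. The slides do not specify the threshold rule, so the sketch below assumes one common choice: flag a residual that deviates from the historical residual mean by more than k standard deviations.

```python
from statistics import mean, stdev

# Sketch of the monitoring step: flag observations whose continuity-equation
# residual deviates from the residual mean by more than k standard deviations.
# The k-sigma rule is an assumption; the slides leave the threshold open.

def anomaly_flags(actual, predicted, k=2.0):
    residuals = [a - p for a, p in zip(actual, predicted)]
    mu, sigma = mean(residuals), stdev(residuals)
    return [abs(r - mu) > k * sigma for r in residuals]
```

Each True flag corresponds to an anomaly alarm routed to a human auditor; tightening k trades Type II errors for Type I errors, a trade-off the deck returns to later.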
Slide 12
How Do We Evaluate CE Models?
• The linear regression model is the classical benchmark for comparison.
• Models are compared on two aspects:
– prediction accuracy, and
– anomaly detection capability.
Slide 13
Prediction Accuracy Comparison: Results Analysis
• Mean Absolute Percentage Error (MAPE) is used to measure prediction accuracy.
• Prediction accuracy comparison results:
– multivariate time series (best);
– linear regression (middle);
– simultaneous equations (worst).
• The differences are small (<2%).
• Noise in our data sets may pollute the results.
• Prediction accuracy is relatively good for all continuity equation models:
– there are studies in which MAPE exceeds 100%.
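MAPE, the accuracy measure used above, is straightforward to compute on a holdout sample:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Undefined when an
    actual value is zero, so zero observations must be handled upstream."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))

# Example: predictions off by 10% on each of two observations -> MAPE = 10%.
```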
Slide 14
Simulating an Error Stream: The Ultimate Test of CA Analytics
• Seed errors of various magnitudes into a randomly chosen subset of the holdout sample.
• Identify anomalies as those observations in the holdout sample for which the variance exceeds the acceptable threshold.
• Test whether the anomalies are the observations with seeded errors, and count the numbers of false positives (Type I errors) and false negatives (Type II errors).
• Repeat this simulation several times, choosing different random subsets to seed errors into.
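The seeding step can be sketched as follows. The function name and signature are illustrative, with magnitude expressed as a fraction of the observed value (e.g. 0.5 for the 50% point on the charts that follow).

```python
import random

# Sketch of error seeding: corrupt a random subset of holdout observations
# by a given relative magnitude. Names and signature are illustrative.

def seed_errors(holdout, magnitude, n_errors, rng):
    """Return a corrupted copy of the holdout series and the seeded indices."""
    corrupted = list(holdout)
    seeded = rng.sample(range(len(holdout)), n_errors)
    for i in seeded:
        corrupted[i] += magnitude * holdout[i]
    return corrupted, set(seeded)

corrupted, seeded = seed_errors([100.0] * 10, 0.5, 3, random.Random(42))
```

Re-running with fresh random seeds gives the repeated simulations called for in the last bullet.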
Slide 15
Measuring Anomaly Detection
• False positive error (false alarm, Type I error): a non-anomaly mistakenly flagged by the model as an anomaly. Decreases efficiency.
• False negative error (Type II error): an anomaly the model fails to detect. Decreases effectiveness.
• A good analytical model is expected to have strong anomaly detection capability: low false negative and low false positive error rates.
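Given the set of flagged observations and the set of seeded ones, the two rates fall out of simple set arithmetic. This is a sketch; the denominators chosen here (clean observations for Type I, seeded observations for Type II) are one standard convention, not stated on the slide.

```python
def error_rates(flagged, seeded, n):
    """flagged, seeded: sets of holdout indices; n: holdout sample size."""
    false_pos = len(flagged - seeded)        # false alarms: decrease efficiency
    false_neg = len(seeded - flagged)        # missed errors: decrease effectiveness
    type_i = false_pos / (n - len(seeded))   # rate over clean observations
    type_ii = false_neg / len(seeded)        # rate over seeded observations
    return type_i, type_ii
```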
Slide 16
Simulated Real-time Error Correction
• CA makes it possible to investigate a detected anomaly in (nearly) real time.
• Anomaly investigation can likely correct a detected problem in (nearly) real time.
• Real-time problem correction means that analytical BP models use the actual (not erroneous) values for future predictions.
• Real-time error correction is likely to make subsequent anomaly detection more accurate, and the magnitude of this benefit can be evaluated by simulation.
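The benefit can be sketched with a rolling monitor that, when an alarm fires, substitutes the model expectation for the erroneous value (a stand-in for the true corrected value). The naive lag-1 expectation used in the demo is purely illustrative, not one of the CE models.

```python
# Sketch: rolling anomaly monitoring with optional real-time correction.
# With correction on, a confirmed alarm replaces the erroneous value, so
# later expectations are not contaminated by the error.

def monitor(series, expect, threshold, correct=True):
    series = list(series)
    alarms = []
    for t in range(1, len(series)):
        e = expect(series, t)
        if abs(series[t] - e) > threshold:
            alarms.append(t)
            if correct:
                series[t] = e   # stand-in for the true corrected value
    return alarms

lag1 = lambda s, t: s[t - 1]   # illustrative expectation model
with_fix = monitor([10, 10, 30, 10], lag1, threshold=5, correct=True)
without = monitor([10, 10, 30, 10], lag1, threshold=5, correct=False)
# Without correction, the error at t=2 also triggers a spurious alarm at t=3.
```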
Slide 17
[Chart: Subset-VAR Model Comparison, α = 0.1. Type I and II error rates (0 to 1) plotted against seeded error magnitude (0% to 450%). Series: False Negative: Error Correction; False Negative: Non-Correction; False Positive: Error Correction; False Positive: Non-Correction.]
Slide 18
[Chart: Model Error Detection Comparison, α = 0.1. Type II error rate (0 to 1) plotted against seeded error rate (0% to 450%). Series: Subset-VAR, BVAR, SEM, Regression.]
Slide 19
Error Detection: Aggregated Data vs. Disaggregated Data
• In CA, disaggregated data are available. Can disaggregated data boost anomaly detection performance?
• Dimensions for aggregation and disaggregation: temporal and geographic.
• A comparative simulation study of error detection vs. BP metric aggregation has to examine different aggregation patterns of seeded errors:
– best case – aggregated error (e.g., the total weekly error is seeded into a single day);
– worst case – disaggregated error (e.g., the total weekly error is partitioned equally among the days of the week);
– intermediate case – somewhat disaggregated error.
Slide 20
[Chart: Subset-VAR Model Comparison, α = 0.05. Type I and II error rates (0 to 1) plotted against seeded error magnitude (0% to 450%). Series: False Negative: Weekly; False Negative: Best Case Daily; False Negative: Worst Case Daily; False Positive: Weekly; False Positive: Best Case Daily; False Positive: Worst Case Daily.]
Slide 21
[Chart: Subset-VAR Model Comparison, α = 0.05. Type I and II error rates (0 to 1) plotted against seeded error magnitude (0% to 450%). Series: False Negative: Entire Company; False Negative: Subunit Best Case; False Negative: Subunit Worst Case; False Positive: Entire Company; False Positive: Subunit Best Case; False Positive: Subunit Worst Case.]
Slide 22
Results and Conclusions from the Simulation Studies
• Various statistical methods can be used to derive expectation models of acceptable quality:
– linear regression is often adequate;
– multivariate time series methodology can provide somewhat more accurate models.
• Real-time error correction significantly improves the error detection capabilities of all models.
• More disaggregated models are not always better: weekly data can be more stable than daily data.
• Alarms have to be managed – there is a trade-off between Type I and Type II errors.
Slide 23
Concluding Remarks
• A new CA-enabled analytical audit methodology: simultaneous relationships between highly disaggregated BP metrics.
• How to automate the inference and estimation of numerous CE models?
• How to identify and remove outliers from the historical data so as to estimate statistically valid CEs (step-wise re-estimation of CEs)?
• How to choose the confidence level for generating alarms (trade-off between Type I and Type II errors: efficiency vs. effectiveness)?
• How to make it worthwhile (is it worth the cost)?