Business Rule Computation in Distributed Organizations


George Dimitoglou

Research Advisor: Prof. Shmuel Rotenstreich

Outline

• Introduction
• Related Work
• Problem Description
• Solution & Results
• Conclusions
• Future Work

Environment

• Complex, dynamic, distributed organizations
• Activities
  – Acquisition of services, resources
  – Collaboration, team formation
• Rules & Constraints
  – Business Rules are statements of “How to do business”
  – Faster, systematic decisions

Objectives

• To investigate Business Rule
  – Management
  – Distribution
  – Execution
• Develop a general computational model for constraint computation


Related Work: Organization Theory

• Organizational Design
  – Formalization (standardization): number of rules, documentation and policy manuals
  – Theories
    • Bureaucracy: order, uniformity, consistency, well-defined hierarchy, record-keeping, rules
    • Systems Theory: inputs, outputs, transformations and prescribed interactions
• Organization Structure
  – Complexity
    • Horizontal (specialization), Vertical (hierarchy), Spatial (geographic)
  – Centralization/decentralization
    • Decision-making locations (not spatial) within the organization
  – Theories
    • Closed vs. Open Systems
      – Closed: simple, static, introverted environments
      – Open: complex, dynamic, environment-interacting

Related Work: Computer Science

Related Work: Data-Centric Approach

Related Work: Process-Centric Approach

Problem Description

• Managing constraints in distributed environments
• Existing approaches
  – Small, static, contained, monolithic systems
• Shortcomings
  – Centralization
  – Lack of operation capability in dynamic environments
    • Application-only interaction
  – Lack of distributed operation
  – Lack of dynamic rule support
    • No real-time rule changes


Methodology

• Business Rules
  – Formulation
  – Developed/extended a simple rule type classification
• Software prototype
  – Devised methods for rule processing
  – Developed architecture for multiple rule engines and repositories
  – Developed distributed and non-distributed processing algorithms
  – Added features to enhance distributed processing

Business Rule Formulation

• Declarative statements that constrain, validate, compute, approve or perform any operation.

• Example rule (Stimulus/Response):

R: if P1 ∧ P2 ∧ … ∧ Pn then A1 ∧ A2 ∧ … ∧ Ak

where
P1 ∧ … ∧ Pn, n ≥ 1, is a conjunction of n predicates (conditions)
A1 ∧ … ∧ Ak, k ≥ 1, is a conjunction of k results (actions)

Predicates (Pi) and results (Aj) express values of operands using relational operators from the set S = {>, <, ≠, =, ≤, ≥}
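The rule form above can be sketched in a few lines of Python. This is only an illustration, not the prototype's implementation: the class names `Predicate` and `Rule`, the field names, and the sample task parameters are all assumptions, and "≠", "≤", "≥" are spelled `!=`, `<=`, `>=`.

```python
import operator
from dataclasses import dataclass

# Relational operators from the set S = {>, <, !=, =, <=, >=}.
OPS = {">": operator.gt, "<": operator.lt, "!=": operator.ne,
       "=": operator.eq, "<=": operator.le, ">=": operator.ge}

@dataclass(frozen=True)
class Predicate:
    operand: str   # name of a task parameter (hypothetical field name)
    op: str        # one of the keys of OPS
    value: object

    def holds(self, task: dict) -> bool:
        # A missing operand makes the predicate false instead of raising.
        return self.operand in task and OPS[self.op](task[self.operand], self.value)

@dataclass(frozen=True)
class Rule:
    predicates: tuple  # P1 ... Pn, n >= 1
    actions: tuple     # A1 ... Ak, k >= 1 (plain labels in this sketch)

    def fire(self, task: dict) -> tuple:
        # The rule fires only if the conjunction of all predicates holds.
        return self.actions if all(p.holds(task) for p in self.predicates) else ()

rule = Rule(
    predicates=(Predicate("has_account", "=", True), Predicate("balance", ">=", 100)),
    actions=("check_balance", "print_statement"),
)
print(rule.fire({"has_account": True, "balance": 250}))
print(rule.fire({"has_account": True, "balance": 50}))
```

The first call returns both actions; the second returns an empty tuple because one predicate of the conjunction fails.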

Formulation Example

• Each Business Rule is described using XML

Business Rule Elements

<ruleid>value</ruleid> (Unique rule identifier)

<task_type>value</task_type> (Matching applicable rules to tasks)

LHS

RHS

<creation-date>, <effect-date>, <expiry-date>, <lifetime>, <owner>, <status> (“Administrative” elements)

Business Rule Types: Stimulus/Response

• Define what conditions must be met before an activity can legally take place:

RSR: if P1 ∧ P2 ∧ ... ∧ Pn then A1 ∧ A2 ∧ ... ∧ Ak
or,
RSR': when P1 ∧ P2 ∧ ... ∧ Pn then A1 ∧ A2 ∧ ... ∧ Ak

• Example:
  – Task: check account balance, print a statement.
    RSR: if (client has an account) then (check account balance) (print statement)
  – Task: add quarterly interest dividends to account.
    RSR': when (end of quarter) then (calculate interest) (add interest to balance)

Business Rule Types: Operation Constraint Rules

• Define constraints that must hold before and/or after an activity.

ROC: if P1 ∧ P2 ∧ ... ∧ Pn then [Q] A1 ∧ A2 ∧ ... ∧ Ak [S]

where
Q: pre-condition stating properties that must hold when the activity is to be performed
S: post-condition stating properties that will hold after the activity is completed.

• Example:
  – Task: withdrawing funds from a bank account
    ROC: if (client has an account) then [check account has funds] (withdraw amount) (print statement) [balance ≥ 0]
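A minimal sketch of how such a rule could be enforced, assuming the post-condition reads "balance ≥ 0" and that a failed condition is reported by raising an exception. The function name, the `ConstraintViolation` class and the dictionary layout of an account are hypothetical.

```python
class ConstraintViolation(Exception):
    """Raised when a pre- or post-condition does not hold."""

def withdraw(account: dict, amount: float) -> dict:
    # Predicate P: the client has an account; otherwise the rule does not apply.
    if not account.get("exists", False):
        return account
    # Pre-condition Q: the account has sufficient funds.
    if account["balance"] < amount:
        raise ConstraintViolation("pre-condition failed: insufficient funds")
    # Action A1: withdraw the amount (A2, printing a statement, is omitted).
    updated = {**account, "balance": account["balance"] - amount}
    # Post-condition S: the balance is not negative. Given Q it cannot fail
    # here, but it is kept to mirror the [Q] ... [S] shape of the rule.
    if updated["balance"] < 0:
        raise ConstraintViolation("post-condition failed: negative balance")
    return updated

print(withdraw({"exists": True, "balance": 100.0}, 30.0))
```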

Business Rule Types: Structure Constraint Rules

• Specify constraints on tasks, which must never be violated.
• Compare class and loop invariants in Software Engineering:
  – Class invariant: assertion describing a property that holds for all class instances.
  – Loop invariant: assertion that is satisfied prior to the first loop execution, is preserved at each iteration, and still holds on loop termination.
• Rule R has an invariance property for task Ti:
  – if R is valid in every computation state ("must always hold") during processing of Ti.

RSC: it must always hold that P1 ∧ P2 ∧ ... ∧ Pn
or,
RSC: it must always hold that if P1 ∧ P2 ∧ ... ∧ Pn then A1 ∧ A2 ∧ ... ∧ Ak

• Example:
RSC: it must always hold that (minimum balance = $100)
RSC: it must always hold that if (account = savings) then (minimum balance = $500)

Business Rule Types: Computation

• Describe algorithm processing or equations.
• Extension (or general case) of Stimulus/Response Rules.
• Predicates are TRUE and the activity is an algorithm execution or a computation.

RCR: y = f(x)

• Example:
COMPUTE interest AS interest = principal * years * rate/100

• Implementation: instantiate class Computation, which parses the parameters of the rule XML and calls the relevant methods for execution.
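As a rough illustration of a computation rule driven by XML parameters, the sketch below parses a rule and evaluates the interest formula. Only `<ruleid>` and `<task_type>` come from the slides; the `<compute>` and `<param>` elements, the rule id and the numbers are made up for the example, and the slides' Computation class is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule layout: <compute> and <param> are not from the slides.
RULE_XML = """
<rule>
  <ruleid>R-42</ruleid>
  <task_type>compute_interest</task_type>
  <compute target="interest">
    <param name="principal">1000</param>
    <param name="years">2</param>
    <param name="rate">5</param>
  </compute>
</rule>
"""

def simple_interest(principal, years, rate):
    # interest = principal * years * rate / 100, with rate in percent
    return principal * years * rate / 100

def run_computation(xml_text):
    root = ET.fromstring(xml_text)
    compute = root.find("compute")
    # Collect the named parameters and convert them to numbers.
    params = {p.get("name"): float(p.text) for p in compute.findall("param")}
    return compute.get("target"), simple_interest(**params)

print(run_computation(RULE_XML))  # ('interest', 100.0)
```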

Business Rule Lifecycle

• Rule execution can be formally expressed as a DFA R by the tuple:

R = (Q, Σ, q0, F, δ)

where
Q: finite set of states
Σ: finite set of input events e. Σ = {activate, trigger, execute, deactivate}
F: set of final states. F = {new, active, dormant}
δ: transition function from Q × Σ to Q so that δ(q, e) is a state for each state q and input event e.
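The slides give the event set and the final states but not the transition table, so the sketch below fills it in with guesses. The `triggered` state and every entry of `DELTA` are assumptions made only to show the mechanics. Pairs missing from `DELTA` are treated as errors here, whereas a strict DFA would send them to a trap state.

```python
# Hypothetical transition function delta: (state, event) -> next state.
DELTA = {
    ("new", "activate"): "active",
    ("active", "trigger"): "triggered",
    ("triggered", "execute"): "active",
    ("active", "deactivate"): "dormant",
    ("dormant", "activate"): "active",
}
FINAL_STATES = {"new", "active", "dormant"}  # F, as listed on the slide

def run(events, start="new"):
    """Feed events to the automaton; return (last state, is it final?)."""
    state = start
    for event in events:
        if (state, event) not in DELTA:
            raise ValueError(f"event {event!r} not allowed in state {state!r}")
        state = DELTA[(state, event)]
    return state, state in FINAL_STATES

print(run(["activate", "trigger", "execute", "deactivate"]))
```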

Task Processing

The Business Rule Engine (BRE)

• Components
  – FIFO Queue
  – Engine
  – Repository (RuleBase)
  – Interfaces
• Functionality
  – Rule storage
  – Search
  – Enforcement

Task Processing: Search-Match-Process-Enforce

[Figure: processing stages Search, Match/Collect, Enforce (modify task)]

1) Create “candidate set”
2) Resolve any conflicts
3) Create execution sequence

Rule Processing Algorithm

ProcessTask
  – ParseTask
    • Parameters
  – RetrieveRules
    • Search Repository
    • Store in candidate set
  – CategorizeRules
    • Type-specific processing
  – ResolveConflicts
    • Check for conflicts
    • Resolve
    • Create execution sequence
  – ApplyRules
    • Modify task
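The steps above can be sketched as a small pipeline. This is a simplified illustration under several assumptions: rules are plain dictionaries with hypothetical `condition` and `action` callables, CategorizeRules is reduced to an applicability check, and ResolveConflicts is reduced to ordering by a numeric `priority`.

```python
from collections import defaultdict

class Repository:
    """Hash-based rule store keyed by task type."""
    def __init__(self):
        self._by_type = defaultdict(list)

    def add(self, rule):
        self._by_type[rule["task_type"]].append(rule)

    def search(self, task_type):
        return list(self._by_type.get(task_type, []))

def process_task(task, repo):
    params = dict(task["params"])                                   # ParseTask
    candidates = repo.search(task["type"])                          # RetrieveRules
    applicable = [r for r in candidates if r["condition"](params)]  # CategorizeRules (simplified)
    sequence = sorted(applicable, key=lambda r: r["priority"])      # ResolveConflicts (simplified)
    for rule in sequence:                                           # ApplyRules
        rule["action"](params)
    return params

repo = Repository()
repo.add({"task_type": "travel", "priority": 2,
          "condition": lambda p: p["cost"] > 1000,
          "action": lambda p: p.update(approval="dean")})
repo.add({"task_type": "travel", "priority": 1,
          "condition": lambda p: True,
          "action": lambda p: p.update(approval="chair")})
print(process_task({"type": "travel", "params": {"cost": 1500}}, repo))
```

Because rules run in ascending priority order in this sketch, the later rule overwrites the `approval` parameter set by the earlier one.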

Rule Processing Algorithm: Time-Complexity Analysis

• Methods
  – Time-complexity analysis
    • Hash-based repository operations (search, retrieval) perform in constant time, O(1).
    • PARSETASK, RETRIEVERULES, CATEGORIZERULES, APPLYRULES perform in linear time, O(n).
    • Overall complexity depends on RESOLVECONFLICTS: quick sort with O(n log n) (average case)
  – Empirical data
• Overall
  – O(n log n)

Conflict Resolution

• Conflict categories
  – Sequence Conflicts. Affect rule execution sequence.
    • Two applicable rules, one accepting the task and the other rejecting it.
  – Parameter Conflicts.
    • Two or more rules modify the same task parameter, but compute its value using different formulas and produce different results.
      a) Rule A modifies a parameter; Rule B modifies it with different values
      b) Rule A removes a parameter; Rule B adds it
      c) Rule A modifies a parameter; Rule B eliminates it
      d) Rules A and B compute the same parameter using different formulas
• Resolution
  – Dynamic (each time) while in the candidate set
  – Execution sequence
  – Resolution factors
    • Rule Type
    • Priority
    • Timestamp
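One way to picture the resolution factors is as a composite sort key over the candidate set, together with a check for parameter conflicts. The sketch below is an assumption-laden illustration: the precedence among rule types in `TYPE_RANK`, the rule fields and the sample rules are invented, and the slides do not say how the three factors are combined.

```python
from collections import defaultdict

# Assumed precedence among rule types (lower runs first); not from the slides.
TYPE_RANK = {"structure": 0, "operation": 1, "stimulus": 2, "computation": 3}

def execution_sequence(candidates):
    """Order candidate rules by rule type, then priority, then timestamp."""
    return sorted(candidates,
                  key=lambda r: (TYPE_RANK[r["type"]], r["priority"], r["timestamp"]))

def parameter_conflicts(candidates):
    """Return {parameter: [rule ids]} for parameters written by more than one rule."""
    writers = defaultdict(list)
    for rule in candidates:
        for param in rule["writes"]:
            writers[param].append(rule["id"])
    return {p: ids for p, ids in writers.items() if len(ids) > 1}

candidates = [
    {"id": "R1", "type": "computation", "priority": 1, "timestamp": 20, "writes": ["fee"]},
    {"id": "R2", "type": "structure", "priority": 5, "timestamp": 10, "writes": []},
    {"id": "R3", "type": "computation", "priority": 1, "timestamp": 5, "writes": ["fee"]},
]
print([r["id"] for r in execution_sequence(candidates)])  # ['R2', 'R3', 'R1']
print(parameter_conflicts(candidates))                    # {'fee': ['R1', 'R3']}
```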

Distributed Rule Processing

Overall Environment
• Multiple units of one or more organizations
• Some units collaborate, others don’t
• BREs “scattered” throughout
• BRE Locality

BRE infrastructure
• “Underlying” network of peer BREs
• BREs: uniquely identifiable and addressable
• Shared and interconnected rule-sets

Example:
IF operand=value THEN <Rule Engine>.<Rule number>

Distributed Rule Processing Cont’d.

• Ad hoc team formation (TTL=2)
  – BRE0: Team Leader
  – Hash-table sharing. All rule results go to team leader BRE0
• Membership
  – New members
  – Existing members removed
  – Leader failure dealt with via LEA
• Memory (Caching)
• Optimistic Fault Tolerance

[Figure: Team Formation, showing the Team Leader]

Distributed Rule Processing Algorithm

• Additional Functionality
  – Interaction with Resource Discovery mechanisms
  – Query Propagation
  – TTL for control
  – Ad hoc team maintains storage of shared rule sets
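TTL-controlled query propagation can be pictured as a breadth-first walk over the peer network that stops forwarding once the hop budget is spent. The slides do not describe the actual propagation scheme, so the flooding strategy, the function name and the sample topology below are assumptions.

```python
from collections import deque

def propagate(peers, origin, ttl):
    """Return {engine: hops} for every engine reached within `ttl` hops."""
    reached = {origin: 0}
    queue = deque([origin])
    while queue:
        engine = queue.popleft()
        if reached[engine] == ttl:
            continue  # TTL exhausted: do not forward the query further
        for neighbour in peers.get(engine, ()):
            if neighbour not in reached:
                reached[neighbour] = reached[engine] + 1
                queue.append(neighbour)
    return reached

# Hypothetical peer links between rule engines.
peers = {"BRE0": ["BRE1", "BRE2"], "BRE1": ["BRE3"], "BRE3": ["BRE4"]}
print(propagate(peers, "BRE0", ttl=2))
```

With TTL=2 the query reaches BRE1, BRE2 and BRE3 but never BRE4, which is three hops away.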

Distributed Rule Processing: Time-Complexity Analysis

• Multiple P2P-connected BREs
• Add network and communication costs
• Performance of processing a single task:

where
k is the number of rule engines
m represents a communication cost constant

Overall: O(n log n · m)

“Nested” Rules

• Single BRE
  IF operand=value THEN <Rule number>*
  – Problem: Termination
  – Solution:
    • Forbid loops during formulation (smart interface)
    • Max occurrence limits in candidate set
• Multiple BREs
  IF operand=value THEN <Rule Engine>.<Rule number>
  – Problem: Termination, cascading rules, no global control
  – Solution:
    • Max cycle occurrence in candidate set
    • Max depth-limit enforcement

* All rule types are subject to “nesting”

Procedures

• Ordered rule sequences
• Executed/enforced on task(s) as a single rule
• Precedence of execution over individual rules
• Advantages
  – Additional, useful construct, controlling the order of rule execution
  – For repetitive tasks, performance improvement (no conflict resolution)
• Disadvantages
  – Black box
  – User must ensure no conflicts

Procedures: Implementation

[Figure: (a) Sample procedure expression using XML and (b) the corresponding XML tree]

Global Variables

• Distributed System View
• Variables with “scope” over organizational segments
• Lack of global control facilities
• No monitoring or controlling of activities
• Local & “Global” inter-dependent constraints
• Rule processing is unaware of such variables
• GVAR types:
  – Counters (establish limits, tasks accepted/rejected)
  – Advisories (express advice or a suggestion; a BRE may enforce or ignore them)

Global Variables: Implementation

• Distributed GVAR Network
  – Master (read/write)
  – Proxies (read)
• “Replicate everywhere” algorithm
• Proxies
  – faster GVAR lookups
  – no bottlenecks
• Master
  – consistency

GLOBAL VARIABLE       VALUE  APPLICABLE TASKS        TYPE      SCOPE
simultaneous_travel   5      faculty_travel-request  counter   department
wireless_connections  100    wireless_net_access     counter   subnet
min_rescue_crew       3      rescue_operation (all)  advisory  universal
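A minimal sketch of the master/proxy arrangement, assuming the master pushes each write synchronously to every proxy. The class and method names are hypothetical, and the slides do not specify how replication or failure handling actually works. The counter check uses the `simultaneous_travel` entry from the table above.

```python
class Proxy:
    """Read-only replica of the global variables."""
    def __init__(self):
        self._values = {}

    def read(self, name):
        return self._values[name]

    def _replicate(self, name, value):
        # Called only by the master; clients of a proxy never write.
        self._values[name] = value

class Master:
    """Single writer; pushes every update to all proxies."""
    def __init__(self, proxies):
        self._values = {}
        self._proxies = list(proxies)

    def write(self, name, value):
        self._values[name] = value
        for proxy in self._proxies:  # "replicate everywhere"
            proxy._replicate(name, value)

    def read(self, name):
        return self._values[name]

def accept_task(proxy, limit_name, current_count):
    """Counter-type GVAR: accept a task only while the count is below the limit."""
    return current_count < proxy.read(limit_name)

proxies = [Proxy(), Proxy()]
master = Master(proxies)
master.write("simultaneous_travel", 5)
print(accept_task(proxies[0], "simultaneous_travel", 4))  # True
print(accept_task(proxies[1], "simultaneous_travel", 5))  # False
```

Routing all writes through one master keeps the replicas consistent, while lookups stay local to whichever proxy a BRE uses.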

Environment Snapshot

Task Temporal Relations

• BRE “Memory”
  – Organizational Memory
  – Learning
• Task Temporal Sequences
  – T1 happens before T2 (T1 T2)
  – T1 overlaps T2
  – Time delay
• Task Causal Sequences
  – T1 causes a task T2 (T1 T2)
  – anti-symmetric
  – irreflexive

Task Temporal Relations: Implementation

TEMPORAL SEQUENCES DEFINITION

Def_ID  Scope            Task1              Relation                     Length  Unit of Time  Task2
243     CSDept, EEDept.  advising           before                                             registration
335     LawSchl.         pay_loans          start during/complete after                        graduation
567     Global           doctoral proposal  delay                        6       months        doctoral defense
:       :                :                  :                            :       :             :
809     Global           probation          causes                                             send letter

REAL-TIME SEQUENCES

Seq_ID  Owner      Task1        Timestamp1    Status1  Task2        Timestamp2   Status2
243     stdnt.181  meetAdvsr    1/3/01 12:15  Y        RegForClass  2/3/01 9:30  Y
243     stdnt.344  meetAdvsr    6/3/01 11:45  Y
809     stdnt.676
:       :          :            :             :        :
567     stdnt.852  onProbation

• Distributed TSEQ Network
  – Proxies contain both tables: read-only
  – Master contains both tables: read/write
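Checking a recorded sequence against its definition could look like the sketch below. It covers only the "before" and "delay" relations, reads the table's timestamps as day/month/year, and approximates "6 months" as 182 days. All three choices, along with the function names, are assumptions.

```python
from datetime import datetime, timedelta

def before(t1_done: datetime, t2_start: datetime) -> bool:
    """'T1 before T2': T1 must be completed before T2 starts."""
    return t1_done < t2_start

def delay_satisfied(t1_done: datetime, t2_start: datetime, min_gap: timedelta) -> bool:
    """'T1 delay <length> T2': at least `min_gap` must separate the two tasks."""
    return t2_start - t1_done >= min_gap

# Sequence 243 for stdnt.181: meetAdvsr, then RegForClass.
meet_advisor = datetime(2001, 3, 1, 12, 15)
register = datetime(2001, 3, 2, 9, 30)
print(before(meet_advisor, register))  # True

# Definition 567: six months between proposal and defense (dates made up).
proposal = datetime(2001, 1, 10)
defense = datetime(2001, 5, 10)
print(delay_satisfied(proposal, defense, timedelta(days=182)))  # False
```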

Performance Analysis

• Measurements
  – Capacity
  – Response Time (latency)
  – Throughput
• Methodology
  – Creation of multiple synthetic rule sets
  – For Capacity Testing: multiple, variable-size rule loads
  – For Latency and Throughput: multiple, variable-size rule loads with variable percentages of applicable rules: 1%, 10%, 25%, 75% and 100%

Performance Analysis: Capacity Testing

• Repository type
• Synthetic data set (1 x rule = 1342b)
• 55K (Java tuning required)
• > 105K, SAX limitation
• Memory-bound, CPU: idle

ID  Number of Rules  BRE Memory Footprint (K)  Memory Usage (K)  Physical Memory Available (K)  Physical Memory File Cache (K)
1   1                17,084                    117,820           425,196                        26,292
2   100              17,992                    118,092           426,888                        24,188
3   1,000            18,896                    120,124           432,500                        22,348
4   2,000            19,876                    119,576           433,480                        21,100
5   5,000            22,684                    122,404           432,816                        19,920
6   10,000           28,104                    135,408           425,732                        19,848
7   15,000           33,584                    135,860           422,324                        19,504
8   20,000           39,116                    150,676           415,144                        19,668
9   25,000           44,432                    149,736           411,268                        19,584
10  30,000           48,812                    149,144           406,264                        19,604
11  35,000           55,568                    172,224           404,600                        17,488
12  40,000           60,240                    172,748           398,696                        17,868
13  45,000           66,140                    174,784           394,152                        17,632
14  50,000           71,076                    173,616           388,280                        17,524
15  55,000           75,052                    180,760           385,588                        17,664
16  60,000           80,796                    180,696           378,948                        17,540
17  65,000           86,688                    206,996           354,352                        23,676
18  100,000          125,264                   255,248           327,812                        22,048
19  105,000          163,840                   303,500           301,272                        20,420
20  150,000          179,816                   344,808           280,540                        18,064
21  200,000          232,084                   351,844           224,688                        20,952
22  250,000          292,332                   501,508           172,380                        14,408
23  300,000          337,144                   502,924           123,364                        18,548
24  350,000          389,432                   499,180           70,604                         18,156
25  400,000          425,196                   531,164           37,284                         19,516
26  450,000          464,156                   582,704           5,488                          14,508
27  460,000          457,116                   581,348           5,116                          11,876

Performance Analysis: Response Time (Latency) Testing

• Objective: BRE time to process a single task with variable loads.
• Methodology: load synthetic rule sets with variable applicable rule loads and send a task.
• Results: strong relation between latency and the percentage of applicable rules
  – In one case, a 10-fold increase (from 200 to 2000 rules) just doubles the latency
  – In the other, a 10-fold increase (from 200 to 2000 rules) increases latency by a factor of ~11

Performance Analysis: Response Time (Latency) Results

• Task latency has little variation with the same number of applicable rules, even with variable total loads.

[Figure: Single Task Latency (mm:ss.SSS) for the same number of applicable rules; bars labeled 10K/100K, 10K/40K, 10K/20K, 100/100, 100/1K and 100/10K; data labels 00:01.172, 00:01.578, 00:01.375, 00:08.750, 00:08.156 and 00:08.905; axis from 00:00.000 to 00:10.368]

Performance Analysis: Throughput

• Objective: Determine max number of tasks that can be processed during a given period of time (sec, min, hr).

• Methodology: Same synthetic rule sets as with latency experiments.

• Results: throughput affected by % of applicable rules AND total rules

Performance Analysis: Queuing Theory

• Objective: BRE response times as a function of utilization (“stress test”)

• Methodology: used Throughput results (service rate).

• Results:

Performance Analysis: Queuing Theory Results

• Result: a consistent way of identifying optimal BRE utilization at any rule load

[Figure: response time versus BRE utilization; utilization stability condition ρ < 1; max throughput and saturation at utilization 1.0, beyond which the system is “unstable”]
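The slides do not name the queuing model used. As a hedged illustration only, the sketch below assumes a single-server M/M/1 queue, where utilization is ρ = λ/μ and the mean response time is 1/(μ − λ) for ρ < 1. The service rate is a made-up number, not one of the measured throughput results.

```python
def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue: R = 1 / (mu - lambda)."""
    utilization = arrival_rate / service_rate
    if utilization >= 1:
        # Stability condition violated: the queue grows without bound.
        raise ValueError("unstable: utilization must be below 1")
    return 1 / (service_rate - arrival_rate)

service_rate = 10.0  # tasks per second (illustrative value)
for arrival_rate in (1.0, 5.0, 9.0):
    rho = arrival_rate / service_rate
    print(f"utilization={rho:.1f} response={mm1_response_time(arrival_rate, service_rate):.3f}s")
```

Under this assumption, response time climbs steeply as utilization approaches 1, which matches the saturation behaviour the figure describes.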


Conclusions & Research Contributions

• Original approach to the computation of distributed constraints.
• Software prototype as proof-of-concept of novel ideas (in the context of BR):
  – “Minimalist” expression of constraints in XML
    • flexible rule formulation, minimal semantic specification
  – Lightweight standalone/distributed rule engines
  – Fast rule-searching algorithm
  – Rule Management, Task Processing, Conflict Resolution
  – Conflict resolution algorithm
  – Building “local caches” for fault tolerance
  – Managing global variables and ensuring consistency of temporal sequences
• Development of a general computational model to solve problems in domains with similar computation and distribution requirements.
• Tool-set to understand dynamic environments.

Future Work

• Variable rule enforcement

• Task generation

• Business rules and context: distributed BREs as context awareness processors

• Business rules and policy

• Automatic "bottom-up" and "top-down" business rule generation
  – Bottom-up: local computations contribute to creating rules with global scope
  – Top-down: automatic distribution of global rules and variables to local rule engines


Performance Analysis: Compared to What?

[Figure: bar chart of Repository Capacity (number of rules), axis from 0 to 120,000, comparing the BlazeAdvisor Repository with the GW BRE (single)]

Environment
• BlazeAdvisor
  – System: Unknown
  – Software footprint: > 500K
• GW BRE
  – System: DELL 1.1 GHz, 512 MB RAM
  – Software footprint: > 400K

Rule Processing as a Computational Model

• Objects
  – Contain data, access methods, resources (i.e. BRE)
  – Multiple interfaces
• Operations
  – Service primitives with method handles
• Activities
  – Collections of operations treated as a unit of work for the objects
• Environment
  – Distributed, P2P (and client-server)
• Constraints

Applying the Computational Model

Model elements applied to three domains: Military, Nature/Ecology (bees and ants) and Biology.

• Objects (contain data, access methods, resources and have multiple interfaces)
  – Military: Personnel (soldiers, units), assets or resources (ships, planes, radars); they all interface differently.
  – Nature/Ecology (bees): Bees (guard, colony bees), assets (wings, stingers), interfacing differently (sting, fetch nectar, mate).
  – Nature/Ecology (ants): Ants (queen, female workers, males)
  – Biology: Cells (store and retrieve signals) transmitted via assets (nerve cells)
• Operations (service primitives with method handles)
  – Military: Personnel acquiring and using resources. Assets providing feedback and reporting.
  – Nature/Ecology (bees): Flying to flowers. Returning and reporting to hive.
  – Nature/Ecology (ants): Carry food, mate.
  – Biology: Chemical reactions.
• Activities, also procedures (collections of operations treated as a unit of work for the objects)
  – Military: Personnel preparing a plane for mission. Assets participating in an exercise.
  – Nature/Ecology (bees): Build hives, collect nectar.
  – Nature/Ecology (ants): Build chambers, store food.
  – Biology: Sequence of operations (chain reactions)
• Environment (distributed, P2P, c/s)
  – Military: Distributed (geographically), P2P (multi-national forces), c/s within own hierarchies.
  – Nature/Ecology (bees): Distributed (where the nectar is), centralized hive.
  – Nature/Ecology (ants): Distributed (multiple ant colonies) or centralized.
  – Biology: Distributed or centralized
• Constraints
  – Military: Military code, context-specific rules, guidelines according to operational plans.
  – Nature/Ecology (bees): Rules of flight, rules of nectar load, rules of hierarchy and reproduction.
  – Nature/Ecology (ants): Rules for avoiding obstacles, chamber usage.
  – Biology: Chemical and biological rules.

Complex Systems

Distributed “nested rule” example

Starting from BRE001:
BRE001: 001.123: IF x=4 THEN 003.977
BRE003: 003.977: IF y=7 THEN 019.789
BRE019: 019.789: IF k=2 THEN 678.778
BRE678: 678.778: IF v=1 THEN 001.123

a) Depth=3 does not mean stopping at BRE019.

b) Depth=3 means the above sequence occurs 3 times in the candidate set.
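The depth limit in this example can be sketched as a cap on how often each rule may enter the candidate set. The sketch assumes every condition (x=4, y=7, ...) is true, treats the two spellings 003.977 and 003.997 as the same rule, and uses hypothetical names throughout. It is not the prototype's algorithm.

```python
# Rule id -> id of the rule it invokes (all conditions assumed to hold).
RULES = {
    "001.123": "003.977",
    "003.977": "019.789",
    "019.789": "678.778",
    "678.778": "001.123",
}

def expand(start, max_occurrences):
    """Follow nested rule references until a rule would exceed its occurrence limit."""
    candidate_set = []
    counts = {}
    rule = start
    while rule is not None and counts.get(rule, 0) < max_occurrences:
        candidate_set.append(rule)
        counts[rule] = counts.get(rule, 0) + 1
        rule = RULES.get(rule)  # None if the rule invokes nothing further
    return candidate_set

sequence = expand("001.123", max_occurrences=3)
print(len(sequence))  # 12
print(sequence[:4])
```

With a limit of 3, the four-rule cycle is walked three times, so the candidate set holds 12 entries before expansion stops back at BRE001.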
