Load Balancing and Intelligent Load Balancing
Jesús González, Escalation Engineer


Objectives

• Deep understanding of load balancing architecture

• Troubleshooting techniques

• Identify the root cause of any LB problem

Agenda

• Architecture
  • Data Collector
  • Updates - Examples
  • Performance Data Helper
  • Citrix and the PDH

• Where can it go wrong?
  • Black hole effect (Load Throttling - Intelligent Load Balancing)
  • Some servers get all the connections
  • Full load after installing an MUI

• Troubleshooting

Architecture


Data Collector (DC) - Dynamic Store (DS)

LmsSS.dll - IMA Subsystem

MFRules.dll - XA counters

LMS20Rules.dll - System counters

PDH.dll - Performance Data Helper

Data Collector Updates

• The Data Collector is updated every 30 s
  • Only when the load evaluator value has changed by more than 5%
  • Every 5 minutes a full update is sent

• The Data Collector is also updated at every connection logon or logoff
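The update rules above can be sketched as a small decision function. This is only an illustration, not the IMA implementation: the names `UpdateState`, `Update` and `decideUpdate` are hypothetical, and the 5% threshold is assumed to mean 5% of the maximum load of 10000 (500 points), which the slide does not spell out.

```cpp
#include <cstdlib>

// Hypothetical types and names; only the 30 s / 5% / 5 min figures come from the slide.
struct UpdateState {
    int lastSentLoad;      // load last reported to the Data Collector (0..10000)
    int secondsSinceFull;  // seconds elapsed since the last full update
};

enum class Update { None, Delta, Full };

constexpr int kMaxLoad = 10000;
constexpr int kFullUpdateSeconds = 300;               // "every 5 minutes"
constexpr int kChangeThreshold = kMaxLoad * 5 / 100;  // assumed meaning of "bigger than 5%"

// Called on each 30 s evaluation tick, or immediately on a logon/logoff.
Update decideUpdate(const UpdateState& s, int currentLoad, bool sessionEvent) {
    if (s.secondsSinceFull >= kFullUpdateSeconds) return Update::Full;
    if (sessionEvent) return Update::Delta;
    if (std::abs(currentLoad - s.lastSentLoad) > kChangeThreshold) return Update::Delta;
    return Update::None;
}
```

With these assumptions, a server that last reported 1000 stays silent at 1400 and sends an update at 1600, unless a logon, a logoff or the 5-minute timer forces one earlier.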

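Data Collector Updates
Server load - CPU Load example

[Figure: CPU samples taken on the XenApp server along a time axis (30s, 60s, ...) over a 5-minute window; the resulting Server Load % goes to the Data Collector. X marks a sample that has dropped out of the window.]

CPU samples in the window            Server Load %
15 20 23 25 23 50 98 34 45 56        38
 X 20 23 25 23 50 98 34 45 56 40     41
 X 23 25 23 50 98 34 45 56 40 12     40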
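The three Server Load % values in this example (38, 41, 40) match a truncated average over the last ten CPU samples. The sketch below reproduces that arithmetic; the class name `CpuLoadWindow` is hypothetical, and the assumption that the load is a plain sliding-window average is inferred from the numbers on the slide rather than stated there.

```cpp
#include <cstddef>
#include <deque>
#include <numeric>

// Hypothetical helper: keeps the most recent `capacity` CPU samples.
class CpuLoadWindow {
public:
    explicit CpuLoadWindow(std::size_t capacity) : capacity_(capacity) {}

    // Add the newest sample, dropping the oldest one once the window is full.
    void push(int cpuPercent) {
        if (capacity_ == 0) return;  // degenerate window: nothing to store
        if (samples_.size() == capacity_) samples_.pop_front();
        samples_.push_back(cpuPercent);
    }

    // Truncated integer average of the samples in the window (0 if empty).
    int average() const {
        if (samples_.empty()) return 0;
        int sum = std::accumulate(samples_.begin(), samples_.end(), 0);
        return sum / static_cast<int>(samples_.size());
    }

private:
    std::size_t capacity_;
    std::deque<int> samples_;
};
```

Pushing 15, 20, 23, 25, 23, 50, 98, 34, 45, 56 into a ten-sample window gives 389 / 10 = 38. Pushing 40 next drops the 15 and gives 41, and pushing 12 after that drops the 20 and gives 40.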

Data Collector Updates
XA CPU Load != Task Manager CPU Load

• CPU load
  • XA = 25%
  • Task Manager = 1%

Ctxnotif.dll

MFSrvSS.dll

LMSSS.DLL

Data Collector Updates
At Logon time - Default load evaluator

[Diagram labels: ICA Client, WI/XML, Data Collector with Dynamic Store, Server A, Server B, IMA, BIAS]

Performance Data Helper – API

The performance data helper interface calls the registry interface to retrieve performance data

Uses the PDH.DLL to access the PDH API

http://msdn2.microsoft.com/en-us/library/aa373083.aspx

Performance Data Helper – C++ Example

// Add the "% Processor Time" counter of CPU 0 to an already open query (hQuery)
pdhStatus = PdhAddCounter(hQuery,
                          "\\Processor(0)\\% Processor Time",
                          0,
                          &hCounter);

YOU MUST CALL THE PERFORMANCE COUNTER BY ITS NAME

Citrix and the PDH

• \\Processor(_Total)\\% Processor Time

• \\System\\Context Switches/sec

• \\Memory\\% Committed Bytes In Use

• \\Memory\\Page Faults/sec

• \\Memory\\Pages/sec

• \\PhysicalDisk(_Total)\\Disk Bytes/sec

• \\PhysicalDisk(_Total)\\Disk Reads/sec

• \\PhysicalDisk(_Total)\\Disk Writes/sec

Performance Data Helper - Registry

Performance Data Helper – File system

English: Perfc009.dat, Perfh009.dat

German: Perfc007.dat, Perfh007.dat

Performance Data Helper - perfmon

Performance Data Helper - IMA

IMA (at start-up)

Citrix and the PDH

• We also provide our own performance counters

• HKLM\SYSTEM\CurrentControlSet\Services\IMAService\Performance

Where can it go wrong?

Black hole effect

Problem

• At peak logon times, a recently booted server gets all the connections and might become unresponsive

• Cause: the server is unable to update the DC

Solution

• Load Throttling

• Intelligent load balancing

HKLM\SOFTWARE\Citrix\IMA\LMS\

(DWORD) UseILB = 1

(DWORD) ILBMultiplier = 2

Before Intelligent Load Biasing

Sessions    Current Load    Max Load
0           0               10000
1           100             10000
2           200             10000

Each new session adds the default BIAS = 10000/100 = 100

Intelligent Load Biasing

Sessions    Current Load    Max Load    ILBMultiplier
0           0               10000       2
1           5000            10000       2
2           7500            10000       2

New Load = [(Max Load – Current Load) / ILBMultiplier] + Current Load
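The two biasing schemes can be compared with a couple of one-line functions. This is a sketch of the arithmetic shown on the slides, not Citrix code: the function names are hypothetical, and capping the default bias at the maximum load is an assumption.

```cpp
#include <cassert>

constexpr int kMaxLoad = 10000;

// ILB off: every logon adds the fixed default bias of 10000/100 = 100.
// Capping at kMaxLoad is an assumption, not something the slides state.
int biasDefault(int currentLoad) {
    int next = currentLoad + kMaxLoad / 100;
    return next > kMaxLoad ? kMaxLoad : next;
}

// ILB on: every logon closes 1/multiplier of the remaining gap to full load.
int biasIlb(int currentLoad, int multiplier) {
    assert(multiplier > 0);
    return (kMaxLoad - currentLoad) / multiplier + currentLoad;
}
```

Starting from 0, `biasDefault` yields 100 and then 200, while `biasIlb` with a multiplier of 2 yields 5000, 7500 and then 8750, matching the session counts shown in the two diagrams.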

Load Throttling - Low logon rate

[Diagram labels: ICA Client, Data Collector with Dynamic Store, Server A, Server B]

Server      Load    BIAS with ILB              Default BIAS
Server B    0       (10000-0)/2+0 = 5000       100
Server A    0       (10000-0)/2+0 = 5000       100

Load Throttling - Low logon rate

[Chart: Server load if low logon rate; y-axis server load from 0 to 6000; series ILB=1 and ILB=0]

Load Throttling - Black hole effect

[Diagram labels: ICA Client, Data Collector with Dynamic Store, Server A, Server B]

Server Load and BIAS:

• Server with load 0: (10000-0)/2+0 = 5000, then (10000-5000)/2+5000 = 7500, then (10000-7500)/2+7500 = 8750

• Server with load 6000: (10000-6000)/2+6000 = 8000
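The sequence 5000, 7500, 8000, 8750 follows if the Data Collector always hands the next logon to the server with the lowest load it knows about and then biases that load. The sketch below assumes exactly that, with no real load update arriving in between. `assignLogon` is a hypothetical name, and breaking ties in favour of the first server is an arbitrary choice.

```cpp
#include <array>
#include <cstddef>

constexpr int kMaxLoad = 10000;

// ILB bias step from the slide: close 1/multiplier of the gap to full load.
int biasIlb(int load, int multiplier) {
    return (kMaxLoad - load) / multiplier + load;
}

// Hypothetical model of one logon: choose the least loaded server
// (first one wins a tie), bias its load, and return its index.
template <std::size_t N>
std::size_t assignLogon(std::array<int, N>& loads, int multiplier) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < N; ++i) {
        if (loads[i] < loads[best]) best = i;
    }
    loads[best] = biasIlb(loads[best], multiplier);
    return best;
}
```

Starting from loads {0, 6000} with a multiplier of 2, four logons go to servers 0, 0, 1, 0, leaving loads of 8750 and 8000. With the default bias of 100 instead, the idle server would keep winning for roughly 60 consecutive logons before its recorded load caught up with 6000, which is the black hole effect.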

Load Throttling

[Chart: Server load at peak logon times; x-axis from 1 to 99, y-axis server load from 0 to 12000; series ILBMultiplier = 2, ILBMultiplier = 10, ILBMultiplier = 20 and ILB=0]
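To get a feel for how the multiplier shapes these curves, one can count how many back-to-back logons a server absorbs before its biased load crosses a threshold. The sketch below assumes the server never manages to report its real load during the burst and uses integer arithmetic. `logonsUntil` is a hypothetical name, and a multiplier of 0 stands for ILB=0.

```cpp
constexpr int kMaxLoad = 10000;

// Number of consecutive logons before the biased load reaches `threshold`.
// multiplier == 0 models ILB off (fixed bias of 100 per logon).
// Returns -1 if the load stops growing before the threshold is reached.
int logonsUntil(int threshold, int multiplier) {
    int load = 0;
    for (int logons = 0;; ++logons) {
        if (load >= threshold) return logons;
        int step = multiplier > 0 ? (kMaxLoad - load) / multiplier
                                  : kMaxLoad / 100;
        if (step == 0) return -1;  // integer division has rounded the bias to 0
        load += step;
    }
}
```

Under these assumptions a server reaches half load (5000) after 50 logons with ILB off, but after 1, 7 and 14 logons with multipliers of 2, 10 and 20. A smaller multiplier therefore throttles a freshly booted server harder.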

Upgrade to W2K3 causes full load

Problem

• English W2K: \\System\\% Total Processor Time

• English W2K3: \\Processor(_Total)\\% Processor Time

Solution

• In W2K3, use \\Processor(_Total)\\% Processor Time

Some servers get all the connections

Problem

• One server gets all the connections

• Cause: a failure to read the performance counters causes Load = 0

Solution

• A failure to read the performance counters now causes Load = 10000

Full load after installing an MUI

Problem

• After installing an MUI, advanced load evaluators result in Load = 10000

• Cause: the counter name files differ per language: perfc007.dat (German), perfc009.dat (English)

Solution

• HKLM\Software\Citrix\IMA\LMS\

EnableTranslation=1 (dword)

Full load after installing an MUI

• Jesús González in a German restaurant

English Menu (perfc009.dat): 22. Vegetable Soup

German Menu (perfc007.dat): 22. Gemüsesuppe
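The restaurant analogy suggests that the same item keeps its number across languages, so a name can be translated by going through that number. The sketch below illustrates that idea only. Whether EnableTranslation works this way internally is an assumption, and `NameTable` and `translate` are hypothetical names standing in for the contents of perfc009.dat and perfc007.dat.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory view of a perfcXXX.dat file: index -> name.
using NameTable = std::map<int, std::string>;

// Find the index of `english` in the English table, then return the name
// stored under the same index in the local table (empty if not found).
std::string translate(const std::string& english,
                      const NameTable& englishTable,
                      const NameTable& localTable) {
    for (const auto& entry : englishTable) {
        if (entry.second == english) {
            auto it = localTable.find(entry.first);
            return it != localTable.end() ? it->second : std::string();
        }
    }
    return std::string();
}
```

With the menus from the slide, an English table {22: "Vegetable Soup"} and a German table {22: "Gemüsesuppe"}, asking for "Vegetable Soup" returns "Gemüsesuppe". A name missing from either table yields an empty string, which corresponds to the counter lookup failing.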

Troubleshooting

[Diagram: chain of Data Collector (DC), LmsSS.dll, LMS20Rules.dll, PDH.dll and the counter \\Processor(_Total)\\% Processor Time; when the counter cannot be read (X), the result is Full Load]

Troubleshooting
Failing to read performance counters => full load

• Use procmon (filemon) while restarting IMA. Make sure the correct perfcXXX.dat file can be accessed.

• Consider rebuilding the performance counters: http://support.microsoft.com/kb/300956/en-us

• Check that no performance counters are disabled:
  [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\PerfDisk\Performance]
  "Disable Performance Counters"=dword:00000001

• Use CDFControl to gather CDF traces: http://support.citrix.com/article/CTX111961
  • LMSRuleDll_Interface::CreatePDHQuery() ERROR!!!!! rc = -1073738823
  • -1073738823 -> FFFFFFFFC0000BB9 (HEX)
  • http://msdn2.microsoft.com/en-us/library/aa373046.aspx: “The specified counter could not be found”
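The conversion from the decimal rc in the trace to the hexadecimal status code can be reproduced with a short helper. This is a sketch with hypothetical names. The constant is the value 0xC0000BB9, which the PDH error code list documents as PDH_CSTATUS_NO_COUNTER, and the leading FFFFFFFF on the slide is just the sign extension of the same 32-bit value to 64 bits.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Value documented for PDH_CSTATUS_NO_COUNTER ("The specified counter could not be found").
constexpr std::uint32_t kPdhNoCounter = 0xC0000BB9u;

// Reinterpret the signed rc from the CDF trace as an unsigned 32-bit status
// and format it as hexadecimal, e.g. for looking it up in the PDH error list.
std::string statusToHex(std::int32_t rc) {
    char buf[11];  // "0x" + 8 hex digits + terminating NUL
    std::snprintf(buf, sizeof buf, "0x%08X",
                  static_cast<unsigned int>(static_cast<std::uint32_t>(rc)));
    return buf;
}
```

Calling `statusToHex(-1073738823)` gives "0xC0000BB9", the code to search for in the PDH error code list.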

Troubleshooting
CDF Traces example

Summary

• Deep understanding of load balancing
  • BIAS, PDH…

• Understanding Intelligent load balancing
  • Load Throttling

• Typical scenarios
  • Full load versus 0 load, MUI…

• Troubleshooting techniques
  • So far, these tips have let us resolve every problem that has reached EMEA tech support

Before you leave…

• Session surveys are available online at www.citrixsynergy.com starting Thursday, 7 October

• Provide your feedback and pick up a complimentary gift card at the registration desk

• Download presentations starting Friday, 15 October, from your My Organiser Tool located in your My Synergy Microsite event account