QPS R2E Mitigation and Consolidation Strategies for LS1

TE-MPE-EP, RD, 13-Sep-2012 1 QPS R2E Mitigation and Consolidation Strategies for LS1 R. Denz, TE-MPE-EP Acknowledgements: present and past QPS teams & supporters, R2E community


Transcript of QPS R2E Mitigation and Consolidation Strategies for LS1

Page 1: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


QPS R2E Mitigation and Consolidation Strategies for LS1

R. Denz, TE-MPE-EP

Acknowledgements: present and past QPS teams & supporters, R2E community

Page 2: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Outline

Introduction
QPS – protection of superconducting elements in the LHC
  – Protection strategies
Radiation induced faults in 2012 (latest news)
Mitigation and consolidation measures
  – Relocation of equipment
  – DAQ systems & field-bus couplers
  – DSP based systems (fast digital detection systems)
  – High precision systems

Summary

Page 3: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Introduction – a short reminder …

Due to functional requirements, a significant amount of QPS and EE equipment is exposed to radiation during LHC operation
  – Radiation load depends on location and LHC exploitation
QPS and EE equipment locations
  – LHC tunnel
    • Main magnet protection, nQPS, some 13 kA EE systems (e.g. point 3)
  – Partly shielded areas (RR13, 17, 53, 57, 73, 77; UJ14, 16, 56)
    • IPQ, IPD, IT, 600 A protection, EE 600 A, EE 13 kA
  – Protected areas (UA23, 27, 43, 47, 63, 67, 83, 87; UJ33)
    • IPQ, IPD, IT, 600 A protection, EE 600 A, EE 13 kA
LHC exploitation and expected radiation load
  – t < LS1: radiation load still below design levels, but effects noticeable
  – LS1 < t < LS2: radiation load at design levels; preparation has to start now
  – t > LS2: radiation load above design levels
    • … left to the reader as homework ;-)

Slide annotations:
  – Relocation during LS1
  – (LS1 < t < LS2) / (t < LS1) ≈ 10

Page 4: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


QPS - protection of superconducting elements in the LHC

Circuit type | Quantity
Main bends and quads | 24
Inner triplets | 8
Insertion region magnets | 94
Corrector circuits 600 A | 418
Total | 544

Protection system type | Quantity
Quench detection systems | 7568
Quench heater discharge power supplies | 6076
Energy extraction systems 13 kA | 32
Energy extraction systems 600 A | 202
Data acquisition systems | 2532 (~0.5 TB/week)
System interlocks (hardwired) | 13722

• The dependability of the system is critical for LHC performance.
• Due to the sheer size of the system, reliability, availability and maintainability are a major challenge.
• Mitigation and consolidation measures are normally not straightforward to implement.

Page 5: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


QPS protection strategies - overview

Main circuits (main dipoles and quads)
  – Analog & digital quench detection systems for main magnets, quench heaters, cold by-pass diodes and energy extraction systems
  – Dedicated bus-bar splice protection
Insertion region magnets, inner triplets and corresponding bus-bars
  – Global protection of magnet and bus-bar by digital quench detector, quench heaters
Corrector magnet circuits
  – Global protection of magnet and bus-bar by digital quench detector
  – Energy extraction systems where necessary
HTS hybrid current leads (all types of circuits and ratings)
  – Individual protection by digital protection system
Supervision
  – Field-bus based data acquisition systems

Page 6: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Radiation induced faults

Fault analysis has to be done very carefully, as not all problems are related to radiation
  – Equipment faults, EMC, bad connections, circuit breakers, real triggers (very rare but not excluded)
    • In addition, there remain some doubtful cases where the exact cause of the trip cannot be determined easily
    • E.g. dedicated test in CNRAD on fast radiation induced transients in PhotoMOS devices
Radiation induced faults are responsible for most of the QPS triggers in stable beam conditions
  – So far only non-destructive errors have been observed.
Confirmed radiation induced faults are transmitted regularly to the R2E project to be included in their statistics
  – Radiation to electronics related problems are discussed as well in the RADWG

Page 7: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Radiation induced fault statistics 2012

ΣSEU_2012 / ΣSEU_2011 = 0.37
SEU_DUMP_2011 = 3.5 / [fb^-1]; SEU_DUMP_2012 = 1.6 / [fb^-1]

RADTOL detection systems!
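The two luminosity-normalised dump rates quoted on this slide can be compared directly. The following is a minimal sketch that uses only those two figures; the dictionary and function names are illustrative and not part of any QPS or R2E tool.

```python
# Beam dumps caused by radiation induced faults, normalised to integrated
# luminosity (values as quoted on the slide, in dumps per fb^-1).
seu_dumps_per_fb = {2011: 3.5, 2012: 1.6}


def dump_rate_ratio(rates, year, reference_year):
    """Ratio of the normalised dump rate in `year` to that in `reference_year`."""
    return rates[year] / rates[reference_year]


ratio = dump_rate_ratio(seu_dumps_per_fb, 2012, 2011)
print(f"2012 / 2011 dump rate per fb^-1: {ratio:.2f}")
```

Per unit of integrated luminosity, the 2012 dump rate is therefore a little under half of the 2011 value.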

Page 8: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Radiation induced fault statistics 2012 – arc distribution

Page 9: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Mitigation and consolidation measures – relocation

Relocation of equipment into protected areas is the preferred solution for R2E consolidation
  – Solves the cause of the problem; gives a lot of flexibility for existing and future designs
  – With modern electronics, very long instrumentation cables may eventually become acceptable in combination with deported (analog and digital) I/O
During LS1, all QPS equipment currently installed in UJ14, UJ16 and UJ56 will be relocated to protected areas
  – In 2012 so far, 9/23 beam dumps out of SB (stable beam) conditions were provoked by equipment installed in these areas!
  – IT protection no longer required to be radiation tolerant; this is very important with respect to future upgrades of the triplets, which imply more sophisticated protection equipment (Nb3Sn)
Relocation as a consolidation measure needs to be studied further for LHC operation after LS2
  – Dedicated manpower, e.g. a doctoral student (thésard), would be welcome …

Page 10: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Mitigation and consolidation measures – DAQ systems

Notorious “ISO150” problem causing permanent trigger on DAQ systems type DQAMCMB and DQAMCMQ (main magnet protection)
Firmware upgrade for DQAMCMB and DQAMCMQ as first mitigation measure
  – Deployment concluded in 2011; significant reduction in down time
  – Includes a few other upgrades to simplify operation
  – 103/103 transparent in 2012, 91/116 in 2011!
Change from level to falling edge trigger
  – Prevents DAQ system from stalling and avoids access
  – Fault indicated by a status flag (ST_DQAMC_BUS) not part of the QPS_OK signal
Add secondary software trigger to keep post mortem functionality
  – Trigger associated to U_HDS_1 signal (< 800 V)
Full consolidation requires new hardware design replacing incriminated chip and optimized routing of trigger signal
  – Done within spare part production; 200 boards delivered recently
  – Type tests finished, 2 systems already deployed in LHC (no problem so far)
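The difference between a level trigger and a falling edge trigger can be illustrated with a short sketch. It assumes an active-low trigger line sampled at discrete times; the polarity, the function names and the sample sequence are illustrative assumptions, not taken from the DQAMC firmware.

```python
def level_trigger(samples):
    """Level-sensitive: asserted for every sample where the line is low."""
    return [not s for s in samples]


def falling_edge_trigger(samples, initial=True):
    """Edge-sensitive: asserted only on a high -> low transition."""
    fired = []
    previous = initial
    for s in samples:
        fired.append(previous and not s)
        previous = s
    return fired


# Hypothetical line that drops once and then stays stuck low.
stuck_low = [True, True, False, False, False, False]

print(sum(level_trigger(stuck_low)))         # trigger stays asserted
print(sum(falling_edge_trigger(stuck_low)))  # trigger fires once
```

With the level-sensitive version a stuck line keeps the trigger permanently asserted, whereas the edge-sensitive version fires once and then leaves the system free to continue, which is the behaviour the firmware change aims for.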

Page 11: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Mitigation and consolidation measures – field-bus coupler

Loss of fieldbus communication of DAQ systems type DQAMC
  – Failure of the fieldbus coupler chip (MicroFip™)
    • 7 cases observed in 2011, 6 in 2012 so far
  – Radiation tests performed by QPS in CNRAD in 2009 and 2010 showed this kind of problem with both versions of the chip
  – Fault state can be cured by a power cycle; an auto power cycle option has already been successfully tested in CNRAD in 2009, 2011 and 2012

Version | Technology | QPS equipment | Qty
VLSI Technology 9838 N363921 / Philips 0246 Y42350Y1 | 0.6 µm | DQAMC, DQAMG, DQAMS | 2098 (1624 inside LHC tunnel)
ON Semiconductor 0907LNP 15016-530 / AMI Semiconductor 0839LXT 15016-530 | 0.5 µm | DQAMGS (nQPS) | 436 (all inside LHC tunnel)

Page 12: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Mitigation and consolidation measures – field-bus coupler II

Recent investigations show that an auto power cycle option could be added to the installed devices without destructive interventions
  – Microcontroller detects absence of fieldbus communication and reboots
  – Similar approach already implemented for nQPS remote power cycle
  – Additional circuit could be attached to the existing connector used for programming the device
  – Type tests to be completed and prototypes to be built
    • Field tests still possible in 2012/13
  – Implementation of the “Magic Token” on selected boards to be studied as well
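The auto power cycle idea (detect absence of fieldbus communication, then reboot) is essentially a communication watchdog. The sketch below shows the principle only; the class name, the timeout value and the callback are hypothetical, and time is passed in explicitly so the example stays deterministic.

```python
class FieldbusWatchdog:
    """Requests a power cycle when no fieldbus traffic is seen for `timeout_s`."""

    def __init__(self, timeout_s, power_cycle):
        self.timeout_s = timeout_s
        self.power_cycle = power_cycle  # callback that power-cycles the coupler
        self.last_seen_s = 0.0

    def on_fieldbus_frame(self, now_s):
        """Call whenever fieldbus communication is observed."""
        self.last_seen_s = now_s

    def poll(self, now_s):
        """Call periodically; returns True if a power cycle was requested."""
        if now_s - self.last_seen_s > self.timeout_s:
            self.power_cycle()
            self.last_seen_s = now_s  # restart the timeout after the reboot
            return True
        return False


cycles = []
watchdog = FieldbusWatchdog(timeout_s=10.0, power_cycle=lambda: cycles.append("cycle"))
watchdog.on_fieldbus_frame(now_s=1.0)
print(watchdog.poll(now_s=5.0))   # traffic seen recently, nothing happens
print(watchdog.poll(now_s=12.0))  # silence longer than the timeout, power cycle
```

Restarting the timeout after a requested power cycle avoids cycling the device again on every poll while it is still rebooting.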

Page 13: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Mitigation and consolidation measures – field-bus coupler III

In the long term, all MicroFip™ based devices need to be replaced, as this chip will no longer be produced
  – In general, discontinued electronic components are becoming more and more of a problem for the maintenance of QPS systems
In the case of the MicroFip™, the CERN-wide solution for replacement is the in-house developed NanoFip chip
  – NanoFip is a FLASH FPGA based implementation of the fieldbus protocol
    • FPGA type already in use by QPS (symmetric quench detection)
    • No problems so far in LHC, confirming the good radiation test results
  – Unfortunately the new chip is “nano-compatible” in software and hardware with the old version
    • 1st QPS hardware prototype available, but further development postponed due to lack of manpower
    • QPS fieldbus network to be reconfigured in order to allow a smooth transition, i.e. not to be forced to update all devices in one go
  – First deployment foreseen for QPS during LS1 in the DS areas?

Page 14: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Mitigation and consolidation measures – DSP based systems

The TI TMS320C6211™ based general purpose detection board was developed for the protection of insertion region magnets, inner triplets and corrector magnet circuits.
The design is not radiation tolerant, as this was not required at the time of development
The DSP based approach turned out to be crucial for the commissioning of the corrector magnet circuits
  – On-target development, flexibility, fast firmware development cycles
Apart from the inherent sensitivity to radiation induced faults, the performance of the system is excellent with respect to reliability and availability
The high dynamic range of the current reading requires a fast high resolution ADC (not available at the beginning of the century …) or a complex digital-to-analog feedback loop
  – Future devices will use fast high resolution 24 bit ΣΔ ADCs

• The DSP based boards will be replaced by FPGA based radiation tolerant systems (Jens will give you all the details)
• The task has been accomplished for the insertion region magnets and inner triplets; field test on RQ6.R1 during TS#3 and run afterwards.
• The algorithm for the 600 A protection is currently being ported to an FPGA in the framework of a radiation tolerant design; not such an easy piece of work
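To give a feeling for what a 24 bit converter offers in terms of dynamic range, the following sketch evaluates the textbook figure for an ideal N-bit converter, 20·log10(2^N) dB. This is an upper bound only; the effective resolution of a real ΣΔ ADC depends on data rate and noise and is lower. The function name is illustrative.

```python
import math


def ideal_dynamic_range_db(bits):
    """Ideal dynamic range of an N-bit converter: 20 * log10(2**N) in dB."""
    return 20.0 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits} bit: {ideal_dynamic_range_db(bits):.1f} dB")
```

Each additional bit adds about 6 dB, so going from 16 to 24 bits adds roughly 48 dB of ideal dynamic range.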

Page 15: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


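Mitigation and consolidation measures – 600 A protection

[Figure: circuit schematic with power converter, energy extraction system, HTS leads and s.c. magnet(s), n = 1 … 154, with R_PARALLEL; measured signals U_DIFF and I_CIRCUIT]

[Equation reconstructed from extraction residue; operators and signs assumed]
$U_{RES} = U_{DIFF} - L(I_{CIRCUIT}) \cdot \frac{d}{dt}\left(I_{CIRCUIT} - \frac{U_{DIFF}}{R_{Parallel}}\right)$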

The big advantage of the concept is that it only requires the lead instrumentation
  – Only reasonable solution for circuits with large family size; to be avoided for single magnets
Most complex detection system used by QPS
  – Very tedious commissioning; still some open issues, e.g. tune feedback compatibility at higher energy
Algorithm is currently being ported to an FPGA in the framework of a radiation tolerant design; not such an easy piece of work
Higher thresholds will make this task easier; (wish)list of circuits sent to GURU(s)

• Type tests must be completed prior to LS1, and a field test must be performed during the powering tests in February.
• As an additional / alternative measure, the change of evaluation logic is still possible: (1 && 1) || (1 && 1) instead of (1 || 1). More equipment to be installed; DQGPU type A to be revised, but otherwise almost transparent.
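The principle behind this detection scheme, compensating the inductive voltage so that only the resistive part remains, can be sketched numerically. The sketch assumes the relation U_RES = U_DIFF - L(I) · d/dt(I_CIRCUIT - U_DIFF / R_PARALLEL) with a simple backward difference; the sign convention, the function name and all numerical values are assumptions for illustration and do not describe the actual QPS algorithm.

```python
def residual_voltage(u_diff, i_circuit, dt, inductance, r_parallel):
    """Return U_RES for each sample after the first.

    u_diff, i_circuit: equally spaced samples (V, A); dt: sample period (s);
    inductance: callable L(I) in henry; r_parallel: parallel resistance (ohm).
    """
    # Current actually flowing through the magnets: circuit current minus
    # the share bypassing them through the parallel resistor.
    i_magnet = [i - u / r_parallel for i, u in zip(i_circuit, u_diff)]
    u_res = []
    for k in range(1, len(i_magnet)):
        di_dt = (i_magnet[k] - i_magnet[k - 1]) / dt  # backward difference
        u_res.append(u_diff[k] - inductance(i_circuit[k]) * di_dt)
    return u_res


# Hypothetical circuit: constant 0.1 H, 0.5 ohm parallel resistor, 1 A/s ramp.
L_H, R_OHM, DT, RAMP = 0.1, 0.5, 0.01, 1.0
i_mag = [RAMP * DT * k for k in range(6)]


def simulate(u_resistive):
    """Build consistent U_DIFF and I_CIRCUIT samples for a given resistive voltage."""
    u = [L_H * RAMP + u_resistive] * len(i_mag)
    i = [im + uk / R_OHM for im, uk in zip(i_mag, u)]
    return residual_voltage(u, i, DT, lambda _i: L_H, R_OHM)


print(simulate(0.0))   # purely inductive voltage: residual close to 0 V
print(simulate(0.05))  # additional resistive voltage: residual close to 0.05 V
```

During a ramp without any resistive voltage the residual stays near zero, so a threshold on U_RES responds to the resistive part only.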

Page 16: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Mitigation and consolidation measures – high precision digital protection systems (nQPS splice protection and HTS lead protection)

Firmware upgrade
  – Triplication of digital filters and other modifications
  – Expected to cure a significant amount but not all faults (seems to work …)
  – Partial deployment in half cells 8 to 11 around IP1, 2, 5, and 8 for splice protection
Hardware upgrade (required for some hot zones only)
  – Technology evaluated: two possible options
    • FPGA based version using high resolution ADC
      – ADS1281 results promising
    • Standard technology with optimised firmware and modified evaluation logic
      – Using four instead of two redundant processors and majority voting
        » (1 && 1) || (1 && 1)
      – Implementation study showing feasibility completed
  – Design in 2012; installation in hot zones during LS1

• Radiation levels measured by R2E (J. Mekki) during the 2012 run in half cells like 11L1, 9R5, 8R5 indicate that this hardware upgrade will become necessary during LS1 in DS areas around points 1, 5, and probably 7
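The effect of changing the evaluation logic from (1 || 1) to (1 && 1) || (1 && 1) can be shown with a small truth-table sketch. How the four processors are grouped into pairs is an assumption here, and the function names are illustrative.

```python
def one_out_of_two(a, b):
    """Present logic (1 || 1): either detector channel alone trips."""
    return a or b


def two_pairs(a1, a2, b1, b2):
    """Modified logic (1 && 1) || (1 && 1): a trip needs both channels of a pair."""
    return (a1 and a2) or (b1 and b2)


# A single spurious flag (e.g. a radiation induced upset in one processor):
print(one_out_of_two(True, False))          # trips
print(two_pairs(True, False, False, False)) # does not trip

# A real event seen by all channels still trips with both logics:
print(one_out_of_two(True, True))
print(two_pairs(True, True, True, True))

# One silent channel in one pair: the other pair still provides protection.
print(two_pairs(False, True, True, True))
```

The modified logic suppresses single spurious flags while remaining redundant against one silent channel; the price is additional hardware, and a silent channel in each pair at the same time would block the trip.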

Page 17: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


New devices to be implemented during LS1

For new devices, the radiation tolerance is taken into account according to the required levels
  – The new devices developed by the EE section follow the same strategy
Radiation tolerance normally inherited from previous projects

Device | Purpose | Exposed | Interlocking | Beam critical | QPS_OK
DQHSU | Enhanced quench heater supervision | YES | NO | NO | YES
DQQDE | Earth voltage measurement during fast discharge | YES | NO | NO | NO
DQCSU | Crate supervision unit for nDQLPU | YES | NO | (YES) requires fault on 2 different boards | YES

Device | Expected tolerance | Comment
DQHSU | As DQQDS or nDQQDI | Good enough for DS
DQQDE | As good or slightly better than DQQBS | Minor risk for DS areas (supervision only)
DQCSU | As good as DQAMC w/o uFIP | Good enough for DS as well

Page 18: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Summary I

During LHC exploitation in 2010, 2011 and 2012, the protection system for the superconducting circuits of the LHC demonstrated its reliability and its capability to ensure the integrity of the protected superconducting elements.
None of the observed faults caused a total loss of magnet and/or circuit protection.
  – Redundancy of the protection systems is essential
While there have been no magnet quenches above injection current so far, there were some real triggers of current lead protection systems.
The system availability has been significantly improved over the last years; the outstanding problems are in most cases related to radiation induced faults.
While most of the radiation induced faults are transparent to LHC operation, the number of beam dumps caused by spurious triggers is an issue.
  – Mitigation and consolidation measures applied so far have kept the fault rate reasonable despite increasing luminosity
    • TCLs, additional shielding, firmware upgrades

Page 19: QPS  R2E Mitigation  and  Consolidation Strategies for LS1


Summary II

The main problem is the DSP based quench detectors, originally developed for radiation free areas
  – Consolidation work was already launched in 2008
  – The symmetric quench detection board is the first result of these efforts
    • Fully satisfying performance during LHC operation
During the upcoming stop of the LHC in 2013/2014, the system upgrades will focus on:
  – Completion of radiation to electronics consolidation
  – Enhanced supervision capabilities, especially for the quench heater circuits of the main dipoles
  – Full implementation of the redundant UPS powering scheme
Present mitigation & consolidation measures concern the LHC run up to LS2; operation after LS2 will be a different story and will definitely require more effort