CoverSheet - Los Alamos National Laboratory · LA-UR-16-28155 Approved for public release;...

LA-UR-16-28155Approved for public release; distribution is unlimited.

Title: D-wave Quantum Computer as an Efficient Classical Sampler

Author(s): Chertkov, MichaelHagberg, Aric ArildLokhov, AndreyMisiakiewicz, TheodorMisra, SidhantVuffray, Marc Denis

Intended for: Distribution of slides from the ISTI D-wave rapid response call

Issued: 2016-10-26 (Draft)

Disclaimer:Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the Los Alamos National Security, LLC forthe National Nuclear Security Administration of the U.S. Department of Energy under contract DE-AC52-06NA25396. By approving thisarticle, the publisher recognizes that the U.S. Government retains nonexclusive, royalty-free license to publish or reproduce the publishedform of this contribution, or to allow others to do so, for U.S. Government purposes. Los Alamos National Laboratory requests that thepublisher identify this article as work performed under the auspices of the U.S. Department of Energy. Los Alamos National Laboratorystrongly supports academic freedom and a researcher's right to publish; as an institution, however, the Laboratory does not endorse theviewpoint of a publication or guarantee its technical correctness.

D-wave Quantum Computer as an EfficientClassical Sampler

M. Chertkov1,2 (PI), A. Hagberg3, A. Lokhov1,2 (co-PI),T. Misiakiewicz1, S. Misra3, M. Vuffray2

1Center for Nonlinear Studies2Theoretical Division T-43Theoretical Division T-5

D-wave Quantum Computing Efforts Debrief

Introduction: D-wave as an efficient sampler

Theoretical and experimental evidence that D-wave can approximatelysample from a Boltzmann distribution at some effective temperature

Ronnow et al., Science (2014)Amin, Phys. Rev. A (2015)

Perdomo-Ortiz et al., Sci. Rep. (2016)Benedetti et al., Phys. Rev. A (2016)

Ending up in excited states due to noise, "freeze-out", etc.

Introduction: D-wave as an efficient sampler

Disadvantage for optimization turned into advantage for numerousapplications:

Ending up in excited states due to noise, "freeze-out", etc.

! Restricted Boltzmann Machines (blocks for Deep Learning)Denil & Freitas, NIPS (2011); Dumoulin et al., AAAI Artificial Intelligence (2015);Benedetti et al., Phys. Rev. A (2016); Amin et al., “Quantum Boltzmann Machine” (2016)

! Producting samples in hard glassy modelsKatzgraber et al., Phys. Rev. X (2014 & 2015); Martin-Mayor & Hen, Sci. Rep. (2015);Venturelli et al., Phys. Rev. X (2015); Zhu et al., Phys. Rev. A (2016)

! Accurate calibration of the D-wave machineKing & McGeoch (2014) “Algorithm engineering for a quantum annealing platform”;Perdomo-Ortiz et al., Sci. Rep. (2016); Raymond et al., “Global warming: temperatureestimation in annealers” (2016); Also example in this debrief!

Relation between input and effective Hamiltonians in D-wave

Input Hamiltonian

H =∑〈i ,j〉

Jijσiσj +∑i∈V

Hiσi

Effective Hamiltonian in D-wave

Heff =∑〈i ,j〉

J ′ijσiσj +∑i∈V

H ′iσi

Let us write J ′ij = β(Jij + ∆Jij), H ′i = β(Hi + ∆Hi ), where

T = 1/β : effective temperature∆Jij , ∆Hi : possible biases

Correspondence H ↔ Heff by solving the reconstruction problem oflearning β, ∆Jij , ∆Hi from samples produced by D-wave with Heff

Reconstruction problem in D-wave

Given M independent samples (configurations), reconstruct Heff

𝑘 =1 𝑘 =2 … 𝑘 =M

𝜎1 +1 −1 … +1

𝜎2 −1 −1 … −1

⋮ ⋮ ⋮ … ⋮

𝜎N +1 +1 … −1

Task known as Inverse Ising Problem. The optimal algorithm forsolving this task is the LANL-developed “Screening method”

Vuffray, Misra, Lokhov, Chertkov, NIPS (2016)Lokhov, Vuffray, Misra, Chertkov, submitted to Nature Physics (2016)

How does Screening method work?

For each spin, minimize thepotential Si (Ji ,Hi ) which appliescounter-interactions (P ∝ e−H):

(Ji , Hi ) = argmin(Ji ,Hi )

(Si (Ji ,Hi ) + λ‖Ji‖1

)Si (Ji ,Hi ) = 〈exp(

∑j 6=i

Jijσiσj + Hiσi )〉M

Vuffray, Misra, Lokhov, Chertkov, NIPS (2016)Lokhov, Vuffray, Misra, Chertkov, submitted toNature Physics (2016)

〈Si(Ji,Hi)〉

〈Si(Ji*,Hi*)〉

〈Si(Ji=0,Hi=0)〉=1

Ji

Hi i

i

i

First outcome of this project: development of an efficientalgorithmic implementation using advanced first-order optimizationmethods (∼ N2 times faster, to appear on GitHub)

Effective temperatureWhere does the effective temperature come from? Let us look at theannealing procedure with τ = t/tannealing:

H(τ) = A(τ)

(−∑i∈V

σxi

)+ B(τ)

∑〈i,j〉

Jijσzi σ

zj +

∑i∈V

Hiσzi

Monotonic functions A and B satisfy A(0)� B(0) and A(1)� B(1).

The “freeze-out” phenomenon: theevolution stops at the point τfreeze:

Teff = TD-waveB(1)

B(τfreeze)

Benedetti et al., Phys. Rev. A (2016)Raymond et al., “Global warming: temperatureestimation in annealers” (2016)

! No unique Teff: β is the function of the input Hamiltonian

! “Single qubit freeze-out”: τfreeze can vary for different spins

Effective temperatureWhere does the effective temperature come from? Let us look at theannealing procedure with τ = t/tannealing:

H(τ) = A(τ)

(−∑i∈V

σxi

)+ B(τ)

∑〈i,j〉

Jijσzi σ

zj +

∑i∈V

Hiσzi

Monotonic functions A and B satisfy A(0)� B(0) and A(1)� B(1).

The “freeze-out” phenomenon: theevolution stops at the point τfreeze:

Teff = TD-waveB(1)

B(τfreeze)

Benedetti et al., Phys. Rev. A (2016)Raymond et al., “Global warming: temperatureestimation in annealers” (2016)

B(τ), B(1)=9.40 GHzA(τ)

τ τfreeze

B(τfreeze)

! No unique Teff: β is the function of the input Hamiltonian

! “Single qubit freeze-out”: τfreeze can vary for different spins

Illustration: estimating the effective temperatureData set (from Marcus Daniels): embedded closed circles of N = 22spins with different values of Ji,i+1 and Hi = 0 (diverse realizations,tannealing, etc.). Example for M = 7250 and Ji,i+1 = −0.0625 ∀(i , i + 1).

0.5 0.4 0.3 0.2 0.1 0.0 0.1

Jij

10 0

10 1

10 2

10 3

Non-zero couplings(define the topology)

Zero couplings(edges absent)

2

3

4

520

21

22

1

...

Refined {J ′ij ,H ′i }. Neglecting H ′i and biases, βeff ≈ 7 since J ′ = −0.44.

0.5 0.4 0.3 0.2 0.1 0.0

Jij

100

101

102

103

0.05 0.00

Hi

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Illustration: estimating the effective temperature

Example for M = 7250 and Ji,i+1 = −0.4375 ∀(i , i + 1).

2.5 2.0 1.5 1.0 0.5 0.0 0.5

Jij

10 0

10 1

10 2

10 3

2

3

4

520

21

22

1

...

Refined {J ′ij ,H ′i }. Neglecting H ′i and biases, βeff ≈ 4.2 since J ′ = −1.84.

2.5 2.0 1.5 1.0 0.5 0.0

Jij

100

101

102

103

0.2 0.1 0.0 0.1 0.2 0.3

Hi

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Illustration: estimating the effective temperature

Example for M = 7250 and Ji,i+1 = −0.75 ∀(i , i + 1).

5 4 3 2 1 0 1

Jij

10 0

10 1

10 2

10 3

2

3

4

520

21

22

1

...

Refined {J ′ij ,H ′i }. Neglecting H ′i and biases, βeff ≈ 3.72 since J ′ = −2.79.

4 3 2 1 0

Jij

100

101

102

103

0.8 0.6 0.4 0.2 0.0 0.2 0.4 0.6 0.8 1.0

Hi

0

1

2

3

4

5

Illustration: estimating the effective temperatureIn the case of Ji,i+1 = −1.0 ∀(i , i + 1), M = 7250 is insufficient: thetopology can not be correctly recovered.

14 12 10 8 6 4 2 0 2 4 6 8 10

Jij

100

101

102

103

Dependence between J ′ on J:

- 0.5 - 0.4 - 0.3 - 0.2 - 0.1 0.0

- 1.8

- 1.6

- 1.4

- 1.2

- 1.0

- 0.8

- 0.6

What about biases?

Simple test: if P(σ) ∝ e−H(σ)/(αT ), then α2T ∂∂α〈H〉 = 〈H2〉− 〈H〉2

0 20 40 60

050

0010

000

1500

020

000

2500

0

α

Var(E)α2T Mean(E)'α

Checkerboard pattern with magnetic fields

Finite sample-size erroramplified by α2

0 20 40 60

020

0040

0060

0080

0010

000

1200

0

α

Var(E)α2 T Mean(E)'α

Checkerboard pattern without magnetic fields

Indications ofα-dependent bias

Example found by Carleton Coffrin, see next talk!

Illustration: detecting and correcting biases

Example of the input H = 0 over the entire Chimera graphAlthough D-wave comes with a software for correcting biases, they arestill present and persist. Example from the Burnaby machine on Sep 15:

0.10 0.05 0.00 0.05

J(bias)ij

100

101

102

103

0.15 0.10 0.05 0.00 0.05 0.10 0.15

H(bias)i

0

20

40

60

80

100

120

-0.1 0.1Jij=0


September 15 October 4

-0.1 0.1Jij=0


Jij=0.05 Hi=0

Jij=0Hi=0

Jij=0.05 Hi=0

Jij=0Hi=0

βJ = 5.919

0.1 0.0 0.1 0.2 0.3

J(bias)ij

100

101

102

103

104

βJ = 5.841

0.10 0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35

J(bias)ij

100

101

102

103

104

Illustration: detecting and correcting biasesJij=0.025 Hi=0

Jij=0Hi=0

Jij=0Hi=0.05

Jij=0Hi=0

βJ = 5.800

0.10 0.05 0.00 0.05 0.10 0.15 0.20

J(bias)ij

100

101

102

103

βH = 6.484

0.20 0.10 0.00 0.10 0.20 0.30 0.40 0.50

H(bias)i

0

50

100

150

200

250

300


Corrections: inputting H = − 1βJ

∑〈i ,j〉

J(bias)ij σiσj − 1

βH

∑i∈V

H(bias)i σi

Heff before corrections

0.10 0.05 0.00 0.05

J(bias)ij

100

101

102

103

0.20 0.15 0.10 0.05 0.00 0.05 0.10 0.15

H(bias)i

0

40

80

120

160

Heff after corrections

0.05 0.00 0.05

J(bias)ij

100

101

102

103

0.20 0.10 0.00 0.10 0.20

H(bias)i

0

40

80

120

160

Symmetrized and more squeezed distributions with a single iteration

Path forward: efficient calibration of the D-wave machineThe calibration issue addressed in several recent papers withheuristic methods: King & McGeoch (2014); Perdomo-Ortiz et al., Sci. Rep. (2016); . . .

As shown in the previous examples, we can do much better!

! Iteratively correcting the biases for the target HT :

i)HT

β−→ HT + ∆(HT )

ii)HT −∆(HT )

β−→ HT −∆(HT ) + ∆(HT −∆(HT ))

≈ HT −∆′(HT )∆(HT )

iii)HT −∆(HT ) + ∆′(HT )∆(HT )

β−→ . . .

! Machine learning task: learn the functional form of ∆(HT )with the linear response theory; start directly at the point (ii)

! Include the higher-order interaction terms in thereconstruction problem to capture the effect of inactive spins

Acknowledgements and questions

Many thanks to Marcus Daniels for the data set used in the firstpart of the work, and to Carleton Coffrin for the insight and datacontributions in the second part!

Questions?

CoverSheet - Los Alamos National Laboratory · LA-UR-16-28155 Approved for public release;...

Documents

Transcript of CoverSheet - Los Alamos National Laboratory · LA-UR-16-28155 Approved for public release;...