relaxed gauss–newton methods with
applications to electrical impedance
tomography
Jyrki Jauhiainen∗ Petri Kuusela∗ Aku Seppänen∗ Tuomo Valkonen†
2020–02–26
Abstract. As second-order methods, Gauss–Newton-type methods can be more effective
than first-order methods for the solution of nonsmooth optimization problems with
expensive-to-evaluate smooth components. Such methods, however, often do not converge.
Motivated by nonlinear inverse problems with nonsmooth regularization, we propose a
new Gauss–Newton-type method with inexact relaxed steps. We prove that the method
converges to a set of connected critical points given that the linearisation of the forward
operator for the inverse problem is sufficiently precise. We extensively evaluate the
performance of the method on electrical impedance tomography (EIT).
1 introduction
The classical Gauss–Newton method can be used for the iterative solution of nonlinear least
squares problems $\min_x \frac{1}{2}\|A(x)\|^2$. It works by successive linearisation of the nonlinear operator
$A \in C^1(V; \mathbb{R}^M)$ defined on $V \subset \mathbb{R}^n$. Often, not least in inverse problems and data science,
one wishes to combine such a least squares fitting with a nonsmooth but convex regularization
term F : V → R incorporating prior information of a good approximate solution to the ill-posed
problem A(x) = 0. We thus wish to solve
$$\min_x J(x) := \frac{1}{2}\|A(x)\|^2 + F(x). \tag{1.1}$$
One readily extends the idea behind the Gauss–Newton method to this problem: linearise A,
solve the resulting convex nonsmooth problem to high accuracy, repeat. Unfortunately, such a
basic approach rarely converges, especially in inverse problems where A and its differentials
almost by definition are not injective. In this work, after several relaxations of the approach,
we prove the convergence of a variant of the Gauss–Newton method for (1.1), concentrating on
applications to electrical impedance tomography (EIT).
∗Department of Applied Physics, University of Eastern Finland, Kuopio, Finland.
†ModeMat, Escuela Politécnica Nacional, Quito, Ecuador and Department of Mathematics and Statistics, University of Helsinki, Finland.
nonsmooth nonconvex optimization methods
If F and A are sufficiently smooth, (1.1) can frequently be solved with Newton’s method. A small
degree of nonsmoothness can be dealt with by semismooth Newton methods [31, 36, 37]. If F is
nonsmooth, nonlinear primal-dual proximal splitting (NL-PDPS) [44, 12] is one possibility; see
[47] for an overview. As a first-order method, NL-PDPS usually requires thousands of iterations
to converge. If the iterations are computationally costly, the method becomes impractical. This
can be the case when A is the solution operator of a partial differential equation (PDE). We are
thus led to Gauss–Newton-type methods that combine both worlds. However, they often fail to
converge [44].
Convergence analysis of the classical Gauss–Newton method for the nonlinear least squares problem
$\min_x \frac{1}{2}\|T(x)\|^2$, with $T$ Lipschitz-continuously differentiable, may be found, for example, in [33].
In [34] merely locally Lipschitz $T$ is considered. Several works have also studied extensions
of the Gauss–Newton method to the general composite minimization problem $\min_x h(T(x))$;
see, for example, [7, 15, 27]. These works generally assume that the set of minima $C$ of $h$ is
“weakly sharp”, and that the inclusion $T(x) \in C$ has some “regular points”. In our setting, writing
$h(x, y) = G(x) + F(y)$ for $T(x) = (A(x), x)$, the existence of a “regular point” would reduce to the
injectivity of the differential $A'(x)$ at a minimiser $x$ of $J$. Since, in inverse problems, the range of
A is generally much smaller than the domain, such a condition cannot be expected to hold. The
assumption of “weak sharp minima” amounts to strong metric subregularity of the objective at
the solution set. According to [1], this is a local form of strong convexity.
In [40] the Gauss–Newton method is studied for problems of the specific form (1.1). There also,
A′(x) has to be injective, and the sub-problem solutions exact. In this case, linear convergence
is proved. However, we want to avoid such injectivity assumptions, and also allow the subproblems
to be solved inexactly. To be able to do this, and still obtain convergence, we will
introduce a relaxation term into our subproblems, and a relaxation step between the Gauss–Newton
steps. The former connects our approach to the classical Levenberg–Marquardt method which,
indeed, can be seen as a proximal Gauss–Newton method for nonlinear least squares [22, 19].
We also will not require the sub-problems to be solved exactly, merely to obtain sufficient decrease
following a condition akin to what has been employed in a different context in [6, 3]. With this,
in Section 2, we will show the convergence of iterates of the proposed Relaxed Inexact Proximal
Gauss–Newton method (RIPGN) to connected components of critical points. In particular, if the
critical points are isolated, we will obtain convergence.
electrical impedance tomography
We will evaluate the proposed method on image (conductivity) reconstruction in Electrical
Impedance Tomography (EIT). This is a large-scale nonlinear PDE-constrained inverse problem.
EIT is an imaging technique in which electric conductivity in a target domain is reconstructed
from boundary measurements. The relationship between the boundary measurements and the
electrical potential and conductivity within the domain are governed by a nonlinear elliptic
partial differential equation. In general, the underlying inverse problem of EIT, which is also
known as Calderón’s problem [8], is ill-posed in the sense that its solution does not depend continuously
on the boundary data. However, by assuming certain bounds on the conductivity, it is possible to
show an optimal logarithmic modulus of continuity [39]. This, of course, means that even large
changes in the conductivity can cause only small changes in the boundary values. Cases of nonsmooth
conductivities in two dimensions are considered in [2]. For cases of piecewise analytic
and smooth conductivities in three dimensions, we refer to [23, 24] and [42], respectively.
Theoretical work on the inverse problem of EIT has introduced several direct methods
for reconstructing the conductivity. In recent years, the so-called D-bar method, which utilizes
complex geometrical optics solutions to the Schrödinger formulation of the inverse conductivity
problem, has undergone considerable progress [43, 32]. In the present work, however, we formulate
the inverse conductivity problem as a least squares minimization problem between the boundary
values from the PDE and measurement data. Optimization- and Tikhonov-regularization-based
approaches offer several benefits over the direct methods. It is easier to include physically
more accurate boundary conditions, domain shapes and regularization functions. Moreover,
in a Bayesian framework, the optimization-based solution can be considered as a maximum a
posteriori estimate with a certain prior distribution [21]. With further analysis, error estimates
may also be obtained [4]. The underlying optimization problem is, however, often tricky to
solve, as the boundary currents depend nonlinearly on the conductivity. This means that the
optimization problem is nonconvex. Moreover, total variation type regularization, which helps to
reconstruct the boundaries of different materials within the target domain, makes the problem
nonsmooth.
organization
The rest of this paper is organized as follows: first, in Section 2, we examine the convergence
of the relaxed inexact proximal Gauss–Newton method. For a certain relaxation parameter,
we show that the algorithm converges to a connected set of Clarke critical points, given that
the linearisation of the operator A sufficiently well approximates the original operator. In
Section 3, we provide a more detailed description of the algorithm and explain how to reliably
solve linearised nonsmooth subproblems in the Gauss–Newton scheme. In Sections 4 and 5,
by using EIT as an example, we study numerically and experimentally whether the relaxed
Gauss–Newton method improves the computational efficiency of the image reconstructions
compared to alternative optimization methods.
2 convergence properties of the relaxed inexact proximal
gauss–newton method
We intend to solve problem (1.1) by successive linearisations of $A$: for some $z^k$ we take
$$A_k(x) := A_{z^k}(x) \quad\text{with}\quad A_y(x) := A(y) + \nabla A(y)^*(x - y).$$
A standard Gauss–Newton-type approach would then solve on each iteration the linearised,
convex problem
$$\min_x J_k(x) := \frac{1}{2}\|A_k(x)\|^2 + F(x) \tag{2.1}$$
and update $z^{k+1} := x^k$ to form the linearisation point of the next iteration. As we have remarked
in the introduction, such a method seldom converges. Our plan, to obtain a convergent method,
is to solve for some proximal parameter $\beta > 0$ the modified problem
$$\min_x \tilde J_k(x) := \frac{1}{2}\|A_k(x)\|^2 + F(x) + \frac{\beta}{2}\|x - z^k\|^2 = J_k(x) + \frac{\beta}{2}\|x - z^k\|^2. \tag{2.2}$$
Then we take the linearisation point $z^{k+1}$ as an interpolation between $x^k$ and $z^k$, precisely
$$z^{k+1} := (1 - w)z^k + w x^k$$
for a sufficiently small relaxation parameter $w \in (0, 1]$. Furthermore, we allow $x^k$ to be solved
inexactly from (2.2). This yields our outline method of Algorithm 2.1, the relaxed inexact proximal
Gauss–Newton method (RIPGN).
Algorithm 2.1 Outline of relaxed inexact proximal Gauss–Newton method (RIPGN).
Require: Convex, proper, lower semicontinuous $F : \mathbb{R}^N \to \overline{\mathbb{R}}$ and $A \in C^1(\operatorname{dom} F; \mathbb{R}^M)$.
Require: Relaxation parameter $w > 0$.
1: Choose an initial iterate $z^0 \in \operatorname{dom} F$.
2: for all $k \ge 0$ do
3: Find an approximate solution $x^k$ to (2.2).
4: Update $z^{k+1} := (1 - w)z^k + w x^k$.
5: end for
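The outer loop of Algorithm 2.1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the special case $F = 0$, where the proximal subproblem (2.2) is a strongly convex quadratic that can be solved exactly by a linear system (a Levenberg–Marquardt-type step). The function name `ripgn_smooth`, the toy operator `A_toy`, and the parameter values are hypothetical choices made for illustration only.

```python
import numpy as np

def ripgn_smooth(A, jac, z0, w=0.5, beta=0.1, iters=100):
    """Relaxed proximal Gauss-Newton outer loop, sketched for F = 0 (assumption)."""
    z = np.asarray(z0, dtype=float)
    history = [0.5 * float(np.dot(A(z), A(z)))]  # J(z^0)
    for _ in range(iters):
        r = A(z)      # A(z^k)
        Jm = jac(z)   # Jacobian of A at z^k, shape (M, n); plays the role of grad A(z^k)^*
        # With F = 0, the minimiser x^k of (2.2) solves
        # (Jm^T Jm + beta I)(x - z) = -Jm^T r.
        step = np.linalg.solve(Jm.T @ Jm + beta * np.eye(z.size), -Jm.T @ r)
        x = z + step
        # Relaxation step: interpolate between z^k and x^k.
        z = (1.0 - w) * z + w * x
        history.append(0.5 * float(np.dot(A(z), A(z))))
    return z, history

# Hypothetical toy operator with a zero-residual solution at (sqrt(2), 1).
def A_toy(x):
    return np.array([x[0] ** 2 - 2.0, x[1] - 1.0])

def jac_toy(x):
    return np.array([[2.0 * x[0], 0.0], [0.0, 1.0]])

z_final, J_hist = ripgn_smooth(A_toy, jac_toy, z0=[3.0, -1.0])
```

For a nonsmooth $F$ the linear solve would be replaced by an inexact inner solver for (2.2); only the relaxation step stays the same.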
We now prove the convergence of the method with β > 0. In Appendix A we show that
it is possible to take β = 0 under strong metric subregularity. We need assumptions that
guarantee that the solutions of the linearised subproblems stay in a bounded set, and we need
the linearisations Ay to locally approximate A sufficiently well:
Assumption 2.1. $F : \mathbb{R}^N \to \overline{\mathbb{R}}$ is convex, proper, and lower semicontinuous, the operator
$A \in C^1(\operatorname{dom} F; \mathbb{R}^M)$, and
$$J(x) := \frac{1}{2}\|A(x)\|^2 + F(x).$$
Given an initial iterate $z^0 \in \mathbb{R}^N$, the sublevel set $\operatorname{lev}_{J(z^0)} J$ is bounded, $\inf F > -\infty$, and
$A_{\max} := \sup_{z \in \operatorname{dom} F} \|A(z)\| < \infty$. Moreover, for some $d, C > 0$, the linearisation error satisfies
$$\|A(x) - A_y(x)\| \le C\|x - y\|^2 \quad (x \in \operatorname{cl} B(y, d),\ y \in \operatorname{lev}_{J(z^0)} J).$$
Here $B(x, r)$ is the open ball of radius $r$ at $x$ while $\operatorname{cl} B(x, r)$ is its closure. We write
$\operatorname{dom} F := \{x \in \mathbb{R}^N \mid F(x) < \infty\}$ for the effective domain of $F$ and
$\operatorname{lev}_c J := \{x \in \mathbb{R}^N \mid J(x) \le c\}$ for the $c$-sublevel set of $J$.
We will also write $\partial J_k(x)$ for the subdifferential of the convex function $J_k$ at $x$,
and, moreover, denote by $\partial_C J(x)$ the Clarke subdifferential of the non-convex function $J$
at $x$, as defined in [11]. We call a point $x$ satisfying $0 \in \partial_C J(x)$ Clarke-critical. Then we have:
Theorem 2.1. Suppose Assumption 2.1 holds and, for some $\beta, \varepsilon > 0$,
$$0 < w \le \min\left\{1,\ \frac{d}{\sqrt{2\beta^{-1}(J(z^0) - \inf F)}},\ \frac{\beta - \varepsilon}{2 C A_{\max}}\right\}. \tag{2.3}$$
On line 3 of Algorithm 2.1, find an approximate minimiser $x^k$ to (2.2), whose objective we denote by $\tilde J_k$, specifically satisfying
1. For some $e^k \in \partial \tilde J_k(x^k)$ we have $e^k \to 0$ as $k \to \infty$, and
2. either $\tilde J_k(z^k) \ge \tilde J_k(x^k)$ with $x^k \ne z^k$, or $x^k = z^k \in [\partial \tilde J_k]^{-1}(0)$.
Then the iterates satisfy:
(i) $J(z^k)$ is monotonically decreasing; indeed, $J(z^k) \searrow L$ for some $L \in \mathbb{R}$.
(ii) Any accumulation point $x$ of $\{z^k\}_{k \in \mathbb{N}}$ is Clarke-critical and satisfies $J(x) = L$.
(iii) Indeed, $\operatorname{dist}(z^k, U) \to 0$ for a connected component $U$ of $V_L := \{x \in V \mid 0 \in \partial_C J(x),\ J(x) = L\}$.
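As a small illustration of the bound (2.3), the following Python sketch evaluates the largest relaxation parameter it admits. The constants $d$, $C$, $A_{\max}$ and $\inf F$ are rarely known exactly in practice, so the numbers below are hypothetical and the helper name `max_relaxation` is ours.

```python
import math

def max_relaxation(beta, eps, d, C, A_max, J0, inf_F):
    """Largest w allowed by (2.3), given estimates of the problem constants."""
    if not (0.0 < eps < beta):
        raise ValueError("need 0 < eps < beta for the third bound to be positive")
    # r bounds the distance ||x^k - z^k|| between subproblem solution and iterate.
    r = math.sqrt(2.0 / beta * (J0 - inf_F))
    second = d / r if r > 0.0 else math.inf  # vacuous if J(z^0) = inf F
    third = (beta - eps) / (2.0 * C * A_max)
    return min(1.0, second, third)

# Hypothetical constants: r = 2, so the three bounds are 1, 0.5 and 0.25.
w_max = max_relaxation(beta=1.0, eps=0.5, d=1.0, C=1.0, A_max=1.0, J0=2.0, inf_F=0.0)
```

A larger proximal parameter $\beta$ loosens the second and third bounds, at the price of shorter subproblem steps.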
Proof. Suppose first that $x^k = z^k \in [\partial \tilde J_k]^{-1}(0)$ for some $k \in \mathbb{N}$. Since
$\partial \tilde J_k(z^k) = \partial J_k(z^k) = \partial_C J(z^k)$, we obtain $z^{k+1} = z^k$, so that there is nothing left to prove: the algorithm has converged
to a critical point in a finite number of iterations.
Otherwise, by assumption, $\tilde J_k(z^k) \ge \tilde J_k(x^k)$ with $x^k \ne z^k$ for all $k \in \mathbb{N}$. Using (2.2) and $J(z^k) = J_k(z^k) = \tilde J_k(z^k)$, we now obtain
$$J(z^k) - J_k(x^k) = \tilde J_k(z^k) - \tilde J_k(x^k) + \frac{\beta}{2}\|x^k - z^k\|^2 \ge \frac{\beta}{2}\|x^k - z^k\|^2 > 0. \tag{2.4}$$
Since $w \le 1$, from the convexity of $J_k$ we have
$$J(z^k) - J_k(z^{k+1}) \ge J(z^k) - \big((1 - w)J_k(z^k) + w J_k(x^k)\big) = w\big(J(z^k) - J_k(x^k)\big). \tag{2.5}$$
Consequently, by (2.4),
$$J(z^k) - J_k(z^{k+1}) \ge \frac{w\beta}{2}\|x^k - z^k\|^2 > 0.$$
Now we show by induction that
$$J(z^0) \ge J(z^k) \quad (k \ge 0). \tag{2.6}$$
As a by-product, we will verify (i), and obtain useful estimates for (ii) and (iii).
Induction base: Obviously $J(z^0) \ge J(z^k)$ holds for $k = 0$.
Induction step: Suppose $J(z^0) \ge J(z^k)$. We show $J(z^0) \ge J(z^{k+1})$. From (2.4) we have
$$J(z^0) - J_k(x^k) \ge J(z^k) - J_k(x^k) \ge \frac{\beta}{2}\|x^k - z^k\|^2.$$
Since $J_k(x^k) \ge \inf F$, we have
$$\|x^k - z^k\| \le \sqrt{2\beta^{-1}(J(z^0) - \inf F)} =: r,$$
and since $w \le d/r$ by (2.3), it follows
$$\|z^{k+1} - z^k\| = w\|x^k - z^k\| \le w r \le \frac{d}{r} r = d, \tag{2.7}$$
thus $z^{k+1} \in \operatorname{cl} B(z^k, d)$. From Assumption 2.1 with $h := z^{k+1} - z^k$,
$$\|A(z^{k+1}) - A_k(z^{k+1})\| \le C\|z^{k+1} - z^k\|^2 = C\|h\|^2. \tag{2.8}$$
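Now using (2.8) and the definition of $A_{\max}$ for the inequality in the next estimate, we obtain
$$\begin{aligned}
\frac{1}{2}\|A_k(z^{k+1})\|^2 - \frac{1}{2}\|A(z^{k+1})\|^2
&= \frac{1}{2}\|A(z^{k+1}) - A_k(z^{k+1})\|^2 + \langle A_k(z^{k+1}) - A(z^{k+1}), A(z^{k+1})\rangle \\
&\ge \langle A_k(z^{k+1}) - A(z^{k+1}), A(z^{k+1})\rangle \ge -C A_{\max}\|h\|^2.
\end{aligned} \tag{2.9}$$
Furthermore, using (2.9),
$$\begin{aligned}
J(z^k) - J(z^{k+1}) &= J(z^k) - \frac{1}{2}\|A(z^{k+1})\|^2 - F(z^{k+1}) \\
&\ge J(z^k) - \frac{1}{2}\|A_k(z^{k+1})\|^2 - F(z^{k+1}) - C A_{\max}\|h\|^2 \\
&= J(z^k) - J_k(z^{k+1}) - C A_{\max}\|h\|^2.
\end{aligned}$$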
Using (2.5), (2.7), and (2.4), we continue
$$\begin{aligned}
J(z^k) - J(z^{k+1}) &\ge w\big(J(z^k) - J_k(x^k)\big) - C A_{\max}\|h\|^2 \\
&= w\big(J(z^k) - J_k(x^k)\big) - \frac{2 w^2 C A_{\max}\|x^k - z^k\|^2}{2} \\
&\ge \frac{w\big(\beta\|x^k - z^k\|^2 - 2 w C A_{\max}\|x^k - z^k\|^2\big)}{2}.
\end{aligned}$$
Since (2.3) implies $\beta \ge 2 w C A_{\max} + \varepsilon$ for some $\varepsilon > 0$, we deduce that
$$J(z^k) - J(z^{k+1}) \ge \frac{w\varepsilon}{2}\|z^k - x^k\|^2 > 0.$$
With this and $J(z^0) \ge J(z^k)$, we get $J(z^0) > J(z^{k+1})$. This completes the proof of the induction
step and consequently (2.6).
In the process, we obtained
$$J(z^k) - J(z^{k+1}) \ge \frac{w\varepsilon}{2}\|z^k - x^k\|^2 \quad\text{and}\quad J(z^k) > J(z^{k+1}) \quad (k \ge 0). \tag{2.10}$$
Since $\operatorname{lev}_{J(z^0)} J$ is bounded and $J$ is proper and lower semicontinuous, this verifies (i).
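To verify (ii), we observe that summing (2.10) over $\ell = 0, \ldots, k - 1$ and telescoping gives
$$J(z^0) \ge J(z^k) + \frac{w\varepsilon}{2}\sum_{\ell=0}^{k-1}\|z^\ell - x^\ell\|^2 \ge \inf F + \frac{w\varepsilon}{2}\sum_{\ell=0}^{k-1}\|z^\ell - x^\ell\|^2 \quad (k \ge 1).$$
This implies $z^k - x^k \to 0$. We have assumed that $e^k \in \partial \tilde J_k(x^k)$ for some $e^k \to 0$, where $\tilde J_k$ is the objective of (2.2). With $\partial \tilde J_k$
further expanded, using that
$$\nabla\Big(\frac{1}{2}\|A_k(x)\|^2\Big) = \nabla A_k(x) A_k(x) = \nabla A(z^k)[A(z^k) + \nabla A(z^k)^*(x - z^k)],$$
this is to say
$$e^k \in \nabla A(z^k)[A(z^k) + \nabla A(z^k)^*(x^k - z^k)] + \partial F(x^k) + \beta(x^k - z^k). \tag{2.11}$$
Since $\{z^k\}_{k \in \mathbb{N}} \subset \operatorname{lev}_{J(z^0)} J$, which by assumption is bounded, we can thus find a converging
subsequence $z^{k_i} \to x$ for some $x$. Necessarily $x \in \operatorname{dom} F$. Since $z^k - x^k \to 0$, also $x^{k_i} \to x$.
Recall that the subdifferential mapping $x \mapsto \partial F(x)$ is outer semicontinuous [20], that is,
if $q^{k_i} \in \partial F(x^{k_i})$ with $x^{k_i} \to x$ and also $q^{k_i} \to q$, then $q \in \partial F(x)$. As $A \in C^1(\operatorname{dom} F; \mathbb{R}^M)$, passing to the
subsequential limit in (2.11), using the outer semicontinuity and $e^k \to 0$, we obtain
$$0 \in \nabla A(x)A(x) + \partial F(x). \tag{2.12}$$
Of course, $\nabla A(x)A(x) = \nabla\big(\frac{1}{2}\|A(x)\|^2\big)$. By standard calculus rules for the Clarke subdifferential
[11], (2.12) is therefore to say $0 \in \partial_C J(x)$. This proves (ii).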
Finally, to prove (iii), let $x_1$ and $x_2$ be two different accumulation points of $\{z^k\}_{k \in \mathbb{N}}$. To reach
a contradiction, suppose they would lie in two disjoint subsets $U_1$ and $U_2$ of $V_L$. Without loss of
generality, we may assume that $V_L = U_1 \cup U_2$. Since $V_L$ is closed (by $J$ being lower semicontinuous
and $\partial_C J$ outer semicontinuous), so are $U_1$ and $U_2$. We can therefore find $\epsilon > 0$ such that $U_1^{2\epsilon}$ and
$U_2^{2\epsilon}$ remain disjoint, where $U_j^{\epsilon} := U_j + B(0, \epsilon)$, ($j = 1, 2$). Let $L' := \inf_{x \in V \setminus (U_1^{\epsilon} \cup U_2^{\epsilon})} J(x)$. Then
$L' > L$. By definition of $x_1$ and $x_2$ as accumulation points, there exist subsequences $U_1^{\epsilon} \ni z^{k_i^1} \to x_1$
and $U_2^{\epsilon} \ni z^{k_i^2} \to x_2$ that satisfy $J(z^{k_i^1}) \to J(x_1) = L < L'$ and $J(z^{k_i^2}) \to J(x_2) = L < L'$. By
passing to a subsequence, we may assume without loss of generality that $k_i^1 < k_i^2 < k_{i+1}^1$. Since
$U_1^{2\epsilon}$ and $U_2^{2\epsilon}$ are disjoint, and $\|z^{k+1} - z^k\| = w\|z^k - x^k\| \to 0$, this implies for $i$ large enough
the existence of $k_i^* \in \mathbb{N}$ such that $z^{k_i^*} \in V \setminus (U_1^{\epsilon} \cup U_2^{\epsilon})$ with $k_i^1 < k_i^* < k_i^2$. Then $J(z^{k_i^*}) \ge L' > L$.
However, since $\{J(z^k)\}_{k \in \mathbb{N}}$ is decreasing and $J(z^{k_i^1}) \to L$, we also have $\limsup_{i \to \infty} J(z^{k_i^*}) \le L$.
This contradiction establishes that $x_1$ and $x_2$ must lie in the same connected component of
$V_L$.
Remark 2.2 (More general data terms). Let $g : \mathbb{R}^n \to \mathbb{R}$ be subadditive and $L$-Lipschitz, for
example, $g = \|\cdot\|_p$, $p \in [1, \infty]$. How could we replace $\frac{1}{2}\|A(x)\|^2$ by $g(A(x))$ in (1.1)? The inequality
(2.9) is the crucial part of the proof that has to be adapted to such an alternative fitting function. Due to
subadditivity we have $g(A_k(z^{k+1})) - g(A(z^{k+1})) \ge -g(A(z^{k+1}) - A_k(z^{k+1}))$. If for some $C' > 0$ we
assume
$$g(A(z^{k+1}) - A_k(z^{k+1})) \le C'\|h\|^2, \tag{2.13}$$
then instead of (2.9) we obtain $g(A_k(z^{k+1})) - g(A(z^{k+1})) \ge -C'\|h\|^2$. The proof now goes through
if we replace the third bound on $w$ in (2.3) by $\frac{\beta - \varepsilon}{2C'}$. For $g = \|\cdot\|_1$ and $C' = C$, (2.13) is simply
Assumption 2.1, so no additional assumptions are needed for that choice.
Remark 2.3 (Unique accumulation point under second-order growth conditions). If one of the
accumulation points $x$ of $\{z^k\}_{k \in \mathbb{N}}$ is actually a unique local minimiser, for example, $J$ satisfies
a second-order growth condition around $x$, then $S = \{x\}$ forms a connected component of $V_L$.
Consequently, $x$ has to be the unique accumulation point of $\{z^k\}_{k \in \mathbb{N}}$. It follows that the whole
sequence converges to $x$.
Remark 2.4 (Convergence with a larger relaxation parameter). There are two obvious strategies
to replace the relaxed variable $z^{k+1}$ by $\tilde z^{k+1} := (1 - w_k)z^k + w_k x^k$ for some stepwise relaxation
parameter $w_k$ that violates the bounds (2.3):
a) Since $C A_{\max}$ in the third bound of (2.3) arises from (2.9), we can replace it by the exact
“fractional linearisation error”
$$\max\left\{0,\ \frac{\|A(\tilde z^{k+1})\|^2 - \|A_k(\tilde z^{k+1})\|^2}{2\|\tilde z^{k+1} - z^k\|^2}\right\}
= \max\left\{0,\ \frac{\|A(\tilde z^{k+1})\|^2 - \|A_k(\tilde z^{k+1})\|^2}{2 w_k^2\|z^k - x^k\|^2}\right\}.$$
This depends on $w_k$ through $\tilde z^{k+1}$. We therefore need to perform a line search to find (the
largest) $w_k$ satisfying this condition subject to the first two bounds of (2.3).
b) We can require that the inequality (2.10) holds for $\tilde z^{k+1}$ in place of $z^{k+1}$. We can again use a line search to find
a parameter $w_k \ge w$ satisfying this.
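Strategy b) can be sketched as a simple backtracking search. The following Python snippet is only an illustration under our own assumptions: the remark does not prescribe a particular search, so the geometric shrinking, the starting value $w_k = 1$, and the helper name `relaxation_line_search` are hypothetical choices. The sufficient-decrease test is the first inequality of (2.10) with $w_k$ in place of $w$.

```python
import numpy as np

def relaxation_line_search(J, z, x, w_min, eps, shrink=0.5, w_start=1.0):
    """Backtrack from w_start towards w_min until the decrease condition (2.10) holds."""
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    gap = float(np.dot(z - x, z - x))  # ||z^k - x^k||^2
    Jz = J(z)
    w = w_start
    while w > w_min:
        z_trial = (1.0 - w) * z + w * x
        # Sufficient decrease: J(z^k) - J(z_trial) >= (w eps / 2) ||z^k - x^k||^2.
        if Jz - J(z_trial) >= 0.5 * w * eps * gap:
            return w, z_trial
        w *= shrink
    # Fall back to the safe relaxation parameter w_min (assumed to satisfy (2.3)).
    return w_min, (1.0 - w_min) * z + w_min * x

# Hypothetical 1D example where the full step x overshoots the minimiser of J.
J_quad = lambda v: 0.5 * float(np.dot(v, v))
w_k, z_next = relaxation_line_search(J_quad, z=[1.0], x=[-3.0], w_min=0.05, eps=0.1)
```

Each trial costs one evaluation of $J$, which in PDE-constrained problems means one forward solve, so the number of backtracking steps should be kept small.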
3 solution of the inner problem and other implementation details
In this section, we discuss how to solve the subproblems (2.2) generated by Algorithm 2.1.
Furthermore, we present a framework of how to apply RIPGN to (nonsmooth and nonconvex)
regularized nonlinear least squares problems.
3.1 balanced primal dual proximal splitting for the linearised subproblem
To solve the nonsmooth but convex problems (2.2), we utilize a variant of the primal-dual
proximal splitting (PDPS) due to Chambolle and Pock [9]. The basic version of the method
applies to $\min_x G(x) + F_1(K_1 x)$ for some convex $G$ and $F_1$ and a linear operator $K_1$. The function $G$
and the Fenchel conjugate $F_1^*$ need to have easily calculable proximal maps
$$\operatorname{prox}_{tG}(z) := \operatorname*{arg\,min}_x\ G(x) + \frac{1}{2t}\|x - z\|^2,$$
where $t > 0$ is a step length parameter. However, our problem (2.2) with $J_k$ defined in (2.1) will
typically involve several operators; in case of total variation regularization of $x$,
$$\min_x\ \frac{1}{2}\|A_k(x)\|^2 + \alpha\|\nabla_h x\| + \frac{\beta}{2}\|x - z^k\|^2.$$
Proximal maps for functions composed with operators are generally not easily calculable.
Therefore, the linear part of $A_k$ and the discretised gradient $\nabla_h$ will both have to go into $K_1$;
it will consist of two different blocks with different scales, which moreover vary between the
subproblems due to the changing linearisations $A_k$. We will therefore adapt the algorithm to the
scales of these blocks following [46, 35].
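To make the notion of an "easily calculable" proximal map concrete, here is a short Python sketch of three standard cases: soft thresholding for $\alpha\|\cdot\|_1$, the closed form for a quadratic $\frac{\beta}{2}\|x - c\|^2$, and the Moreau identity $\operatorname{prox}_{sF^*}(y) = y - s\operatorname{prox}_{F/s}(y/s)$ for passing to the Fenchel conjugate. The function names are ours and the inputs are arbitrary illustrative values.

```python
import numpy as np

def prox_l1(z, t, alpha=1.0):
    """prox_{tG}(z) for G = alpha * ||.||_1: componentwise soft thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t * alpha, 0.0)

def prox_quadratic(z, t, beta, center):
    """prox_{tG}(z) for G(x) = (beta/2) * ||x - center||^2."""
    return (z + t * beta * center) / (1.0 + t * beta)

def prox_conjugate(prox_F, y, s):
    """prox_{s F^*}(y) via the Moreau identity, given prox_F(z, t) = prox_{tF}(z)."""
    return y - s * prox_F(y / s, 1.0 / s)

y = np.array([-3.0, -0.5, 0.2, 2.0])
# For F = ||.||_1 the conjugate is the indicator of the unit sup-norm ball,
# so its proximal map should be the projection (clipping) onto [-1, 1].
dual_prox = prox_conjugate(prox_l1, y, s=0.7)
```

The last line recovers `np.clip(y, -1, 1)`, which is the form used for the dual update of a total variation type term.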
3.2 spatially-adapted primal-dual proximal splitting
For convex, proper, lower semicontinuous $G : X \to \overline{\mathbb{R}}$, $F_1 : Y_1 \to \overline{\mathbb{R}}$, $F_2 : Y_2 \to \overline{\mathbb{R}}$ and linear
operators $K_1 \in L(X; Y_1)$, $K_2 \in L(X; Y_2)$, on (finite-dimensional) Hilbert spaces $X$, $Y_1$, and $Y_2$, we
consider
$$\min_{x \in X}\ G(x) + F_1(K_1 x) + F_2(K_2 x). \tag{3.1}$$
With $Kx := (K_1 x, K_2 x)$ and $y = (y_1, y_2) \in Y := Y_1 \times Y_2$, we can write the problem using the
convex conjugates of $F_1$ and $F_2$ as
$$\min_{x \in X}\max_{y \in Y}\ G(x) + \langle Kx, y\rangle - F_1^*(y_1) - F_2^*(y_2).$$
Due to potentially different scales of the “blocks” $y_1$ and $y_2$ of $y$, we use two different dual step
length parameters for numerical efficiency. This has been called “diagonal preconditioning” in
[35] and “spatial adaptation” in [46]. The latter also introduces ways to perform acceleration
when strong convexity is present in only some blocks. In either case, without acceleration, such
a block-adapted method requires specifying step lengths $t, s_1, s_2 > 0$ satisfying
$$\mathrm{Id} > t\Sigma^{1/2}KK^*\Sigma^{1/2} \quad\text{for}\quad \Sigma := \operatorname{diag}(s_1\,\mathrm{Id}, s_2\,\mathrm{Id}),$$
where we write $\mathrm{Id} : x \mapsto x$ for the identity operator. Since
$$KK^* = \begin{pmatrix} K_1K_1^* & K_1K_2^* \\ K_2K_1^* & K_2K_2^* \end{pmatrix},$$
by Young’s inequality, this condition holds if for some $\lambda > 0$ and estimates $L_1 \ge \|K_1\|$ and $L_2 \ge \|K_2\|$,
$$1 > (1 + \lambda)\,t s_1 L_1^2 \quad\text{and}\quad 1 > (1 + \lambda^{-1})\,t s_2 L_2^2. \tag{3.2}$$
Algorithm 3.1 specializes the spatially adapted or diagonally preconditioned PDPS to the
two-dual-block case and these step length conditions; for more general descriptions, stochastic
sampling, and acceleration, we refer to [46]. A simple choice to satisfy (3.2) is to take for $\lambda = 1$,
some $t > 0$, and small $\delta \in (0, 1)$,
$$s_1 = (1 - \delta)/[2tL_1^2] \quad\text{and}\quad s_2 = (1 - \delta)/[2tL_2^2]. \tag{3.3}$$
Notice how larger $\|K_j\|$ will cause a correspondingly smaller step length parameter $s_j$. This way
the method can balance between differing scales of the different blocks of the dual variable.
The method has an $O(1/N)$ convergence rate for an ergodic gap [46]. Since $F_2^*$ is strongly convex,
it would also be possible to update the parameters $t, s_1, s_2 > 0$ on each iteration to accelerate
the method to a mixed $O(1/N^2) + O(1/N)$ convergence rate for $y_2$ [46].
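The step length rule (3.3) and the operator condition it is meant to guarantee can be checked numerically on small matrices. The sketch below is an illustration with hypothetical matrices `K1` and `K2` of deliberately different scales; it takes $L_j = \|K_j\|$ as the exact spectral norm, whereas in large problems one would only have an upper estimate.

```python
import numpy as np

def dual_step_lengths(K1, K2, t, delta=0.1):
    """Dual step lengths s1, s2 from (3.3), using exact spectral norms as L1, L2."""
    L1 = np.linalg.norm(K1, 2)
    L2 = np.linalg.norm(K2, 2)
    return (1.0 - delta) / (2.0 * t * L1 ** 2), (1.0 - delta) / (2.0 * t * L2 ** 2)

def step_condition_margin(K1, K2, t, s1, s2):
    """Smallest eigenvalue of Id - t Sigma^{1/2} K K^* Sigma^{1/2}; positive means the condition holds."""
    K = np.vstack([K1, K2])
    sqrt_sigma = np.concatenate([
        np.full(K1.shape[0], np.sqrt(s1)),
        np.full(K2.shape[0], np.sqrt(s2)),
    ])
    SK = sqrt_sigma[:, None] * K  # Sigma^{1/2} K
    M = np.eye(K.shape[0]) - t * (SK @ SK.T)
    return float(np.linalg.eigvalsh(M).min())

# Hypothetical blocks with very different scales.
K1 = 100.0 * np.array([[1.0, 2.0], [3.0, 4.0], [0.0, 1.0]])
K2 = np.array([[1.0, -1.0], [0.0, 1.0]])
t = 0.5
s1, s2 = dual_step_lengths(K1, K2, t)
margin = step_condition_margin(K1, K2, t, s1, s2)
```

Because `K1` is about a hundred times larger than `K2`, the rule gives `s1` roughly four orders of magnitude smaller than `s2`, which is the balancing effect described above.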
3.3 relaxed inexact proximal gauss–newton
We now explain how we will use Algorithm 3.1 to solve the subproblems (2.2) for the RIPGN.
We now assume that $F$ has the structure $F(x) = F_2(K_2 x) + \delta_V(x)$, where $F_2$ is convex, proper
and lower semicontinuous, $K_2$ is linear, and $\delta_V$ is the $\{0, \infty\}$-valued indicator function of a
set $V \subset \mathbb{R}^N$. We will typically use $V$ to model positivity constraints. We now formulate (2.2),
namely
$$\min_x\ \frac{1}{2}\|A_k(x)\|^2 + F(x) + \frac{\beta}{2}\|x - z^k\|^2,$$
in the form (3.1) by taking $F_1^k(y) = \frac{1}{2}\|y + b^k\|^2$, $K_1^k = \nabla A(z^k)^*$, and $b^k = A(z^k) - \nabla A(z^k)^* z^k$,
so that $A_k(x) = K_1^k x + b^k$.
Furthermore, we place the proximal and the indicator term into $G^k(x) = \delta_V(x) + \frac{\beta}{2}\|x - z^k\|^2$. We
added superscript $k$ to $F_1$, $K_1$, and $G$ to highlight that these terms depend on the outer iteration.
Now the linearised problem (2.2) can be written
$$\operatorname*{arg\,min}_x\ G^k(x) + F_1^k(K_1^k x) + F_2(K_2 x). \tag{3.4}$$
This has the form (3.1) and can be solved with Algorithm 3.1 using step parameters (3.3).
Note that in Theorem 2.1 we may consider $\delta_V$ as a part of $F$. However, from a computational
standpoint, it is usually more efficient to include it into $G$.
The whole process of solving (1.1), the relaxed inexact proximal Gauss–Newton method, is
described in Algorithm 3.2. Here we would like to stress that $A(z)$ and $F(z)$ depend on the
application. In the next section, we discuss specific choices of these functions in the case of
electrical impedance tomography.
Algorithm 3.1 Primal-dual proximal splitting with distinct step lengths for two dual blocks
Require: Convex, proper, lower semicontinuous $G : X \to \overline{\mathbb{R}}$, $F_1 : Y_1 \to \overline{\mathbb{R}}$, $F_2 : Y_2 \to \overline{\mathbb{R}}$ and
linear operators $K_1 \in L(X; Y_1)$, $K_2 \in L(X; Y_2)$.
1: Choose step length parameters $t, s_1, s_2 > 0$ satisfying (3.2) for some upper bounds $L_1 \ge \|K_1\|$
and $L_2 \ge \|K_2\|$ and $\lambda > 0$.
2: Choose initial iterates $x^0 \in X$, $y_1^0 \in Y_1$, $y_2^0 \in Y_2$.
3: for all $i \ge 0$ until a stopping criterion is satisfied do
4: $x^{i+1} := \operatorname{prox}_{tG}(x^i - tK_1^* y_1^i - tK_2^* y_2^i)$
5: $\bar x^{i+1} := 2x^{i+1} - x^i$
6: $y_1^{i+1} := \operatorname{prox}_{s_1 F_1^*}(y_1^i + s_1 K_1 \bar x^{i+1})$
7: $y_2^{i+1} := \operatorname{prox}_{s_2 F_2^*}(y_2^i + s_2 K_2 \bar x^{i+1})$
8: end for
Algorithm 3.2 Relaxed inexact proximal Gauss–Newton for problem (1.1).
Require: Convex, proper, lower semicontinuous $F_2 : \mathbb{R}^n \to \overline{\mathbb{R}}$, linear and bounded $K_2 : \mathbb{R}^N \to \mathbb{R}^n$,
convex $V \subset \mathbb{R}^N$, and $A \in C^1(V; \mathbb{R}^M)$.
Require: $w > 0$, $\delta \in (0, 1)$, $t > 0$, and $\beta > 0$.
1: Choose initial iterate $z^0$.
2: $s_2 := (1 - \delta)/[2t\|K_2\|^2]$
3: for all $k \ge 0$ until a stopping criterion is satisfied do
4: $K_1^k := \nabla A(z^k)^*$
5: $b^k := A(z^k) - \nabla A(z^k)^* z^k$
6: $s_1 := (1 - \delta)/[2t\|K_1^k\|^2]$
7: Using Algorithm 3.1 with parameters $t, s_1, s_2$ and initial iterates $x^0 := z^k$, $y_1^0 := 0$, and
$y_2^0 := 0$, find an approximate solution $x^k = x^i$ (for large $i$) to (3.4)
8: $z^{k+1} := z^k + w(x^k - z^k)$
9: end for
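The iteration of Algorithm 3.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the proximal maps are passed in as callables with the hypothetical signature `prox(v, step)`, a fixed iteration count stands in for the stopping criterion, and the toy problem at the end (identity operators, a quadratic $F_1$ with target `c`, and $F_2 = \alpha\|\cdot\|_1$) is our own choice because its minimiser $(1, 0)$ can be computed by hand.

```python
import numpy as np

def pdps_two_block(prox_G, prox_F1_conj, prox_F2_conj, K1, K2, t, s1, s2, x0, iters=500):
    """Primal-dual proximal splitting with separate dual step lengths s1, s2."""
    x = np.asarray(x0, dtype=float)
    y1 = np.zeros(K1.shape[0])
    y2 = np.zeros(K2.shape[0])
    for _ in range(iters):
        x_new = prox_G(x - t * (K1.T @ y1) - t * (K2.T @ y2), t)  # primal step
        x_bar = 2.0 * x_new - x                                    # over-relaxation
        y1 = prox_F1_conj(y1 + s1 * (K1 @ x_bar), s1)              # first dual block
        y2 = prox_F2_conj(y2 + s2 * (K2 @ x_bar), s2)              # second dual block
        x = x_new
    return x

# Toy problem: min_x (beta/2)||x - z||^2 + (1/2)||x - c||^2 + alpha ||x||_1.
beta, alpha = 1.0, 1.0
z = np.zeros(2)
c = np.array([3.0, 0.5])
K1, K2 = np.eye(2), np.eye(2)
t, delta = 0.5, 0.1
s1 = (1.0 - delta) / (2.0 * t * np.linalg.norm(K1, 2) ** 2)  # rule (3.3)
s2 = (1.0 - delta) / (2.0 * t * np.linalg.norm(K2, 2) ** 2)

def prox_G(v, step):
    # G(x) = (beta/2)||x - z||^2
    return (v + step * beta * z) / (1.0 + step * beta)

def prox_F1_conj(v, step):
    # F1(y) = (1/2)||y - c||^2, so F1^*(p) = (1/2)||p||^2 + <p, c>
    return (v - step * c) / (1.0 + step)

def prox_F2_conj(v, step):
    # F2 = alpha ||.||_1, so F2^* is the indicator of the sup-norm ball of radius alpha
    return np.clip(v, -alpha, alpha)

x_sol = pdps_two_block(prox_G, prox_F1_conj, prox_F2_conj, K1, K2, t, s1, s2, x0=z)
```

In Algorithm 3.2 the same routine would be called once per outer iteration with $K_1 = K_1^k$ and the data term built from $b^k$, warm-started at $x^0 = z^k$.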
4 application to electrical impedance tomography
We give a brief review of the EIT forward model and its finite element (FE) approximation
in a case where measurements consist of electric currents corresponding to a set of potential
excitations. We treat the inverse conductivity problem of EIT as a regularized nonlinear least
squares problem for which we describe three different regularization schemes. In this section,
deviating from the previous sections, the unknown of interest is written σ instead of z or x to
be consistent with typical notation for electrical conductivity.
4.1 forward model of eit
Due to our measurement equipment, we derive the forward model of EIT in such a way that
it solves the current through each electrode, given the conductivity within the domain and
the potential at each electrode. More specifically, in each excitation, one of the electrodes on the object’s
surface is set to a known electric potential, and the rest of the electrodes are connected to
ground. Corresponding to each excitation, electric currents through all grounded electrodes are
measured.
As the result of the FE approximation, we obtain a nonlinear operator $I(\sigma)$, which together with the
measurement vector $I_m$ and an additional weight matrix $L_A$ forms the data fidelity term $A(\sigma)$
(see below). For details of the FE approximation, we refer to [49].
Given the electrical conductivity $\sigma$ within domain $\Omega$ and a potential $U_k^p$ at each electrode $e_k$
during excitation $p$, the forward problem of EIT is to solve the current $I_k^p$ through each electrode.
This requires solving also the spatially distributed electric potential $u^p$ inside the domain. The
most accurate physically realizable way to model this is the Complete Electrode Model (CEM)
[10]. For existence and uniqueness of CEM see [41]. With $\chi = (\chi_1, \chi_2, \chi_3)$ the spatial coordinates
within the domain $\Omega \subset \mathbb{R}^3$, CEM is described by the set of equations
$$\nabla\cdot(\sigma(\chi)\nabla u^p(\chi)) = 0 \quad (\chi \in \Omega), \qquad u^p(\chi) + \zeta_k\sigma\frac{\partial u^p(\chi)}{\partial n} = U_k^p \quad (\chi \in \partial\Omega_{e_k}), \tag{4.1a}$$
$$\int_{\partial\Omega_{e_k}} \sigma\frac{\partial u^p(\chi)}{\partial n}\,dS = -I_k^p, \quad\text{and}\quad \sigma\frac{\partial u^p(\chi)}{\partial n} = 0 \quad \Big(\chi \in \partial\Omega\setminus\bigcup_{k=1}^{L}\partial\Omega_{e_k}\Big), \tag{4.1b}$$
where $\partial\Omega_{e_k}$ is the part of $\partial\Omega$ covered by the $k$’th electrode, $\zeta_k$ is the contact impedance, $n$ is the
outward unit normal of $\Omega$, and $L$ is the number of electrodes. In addition, the currents $I_k^p$ are
required to satisfy Kirchhoff’s law $\sum_{k=1}^{L} I_k^p = 0$. From here on, we assume the contact impedances
to be known, $\zeta_k = 10^{-7}\ \Omega$, as the actual contact impedances in the measurement setups used in
this study are negligible.
In order to approximate the solution of the boundary value problem (4.1) numerically, we
utilize the Galerkin finite element method (FEM). Following the scheme described in the thesis [49], we
write a variational form of the system (4.1). Moreover, we use a finite dimensional approximation
of the electric potential $u$ as $u^p(\chi) = \sum_{j=1}^{N_u} u_j^p\phi_j(\chi)$ and write the vector of electrode currents
for excitation $p$ as $I^p = \sum_{j=1}^{L-1} I_j^p n_j$ to ensure that Kirchhoff’s current law is fulfilled. Here
$\phi_j$ is a basis function for representing the electric potential, and $n_j$, $j = 1, \ldots, L - 1$, are vectors
that form a basis for the electrode currents. As in a typical Galerkin scheme, $\phi_j$ and $n_j$ are also
used as test functions in the variational form. The FE approximation, i.e., the coefficient vector
$\theta^p = (u_1^p, \ldots, u_N^p, I_1^p, \ldots, I_{L-1}^p)$, is obtained as a solution of the linear system
$$D\theta^p = U^p, \quad\text{where}\quad D = \begin{pmatrix} D_1 & 0 \\ D_2 & D_3 \end{pmatrix} \in \mathbb{R}^{(N+L-1)\times(N+L-1)}, \tag{4.2}$$
and the elements of the blocks $D_1$, $D_2$ and $D_3$ are
$$[D_1]_{ij} = \int_{\Omega}\sigma(\chi)\nabla\phi_j(\chi)\cdot\nabla\phi_i(\chi)\,dV + \sum_{k=1}^{L}\frac{1}{\zeta_k}\int_{e_k}\phi_j(\chi)\phi_i(\chi)\,dS,$$
$$[D_2]_{kj} = -\sum_{m=1}^{L}\frac{1}{\zeta_m}\int_{e_m}\phi_j(\chi)(n_k)_m\,dS = -\left(\frac{1}{\zeta_1}\int_{e_1}\phi_j(\chi)\,dS - \frac{1}{\zeta_{k+1}}\int_{e_{k+1}}\phi_j(\chi)\,dS\right),$$
$$[D_3]_{kl} = \sum_{m=1}^{L}(n_l)_m(n_k)_m = \begin{cases} 1, & k \ne l, \\ 2, & k = l, \end{cases}$$
where $i, j = 1, \ldots, N$ and $k, l = 1, \ldots, L - 1$. The vector $U^p$ is computed from the
known electrode potentials as
$$[U^p]_i = \begin{cases} \sum_{k=1}^{L}\frac{U_k^p}{\zeta_k}\int_{e_k}\phi_i(\chi)\,dS, & i = 1, \ldots, N, \\[4pt] \frac{U_{i+1}^p}{\zeta_{i+1}}|e_{i+1}| - \frac{U_1^p}{\zeta_1}|e_1|, & i = N + 1, \ldots, N + L - 1. \end{cases} \tag{4.3}$$
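Note that the electrode currents $I^p$ are obtained from (4.2) by first solving the coefficient vector
$\theta^p = D(\sigma)^{-1}U^p$ and then multiplying $I^p = K\theta^p$, where $K \in \mathbb{R}^{L\times(N+L-1)}$, $K = [0, \ldots, 0, n_1, \ldots, n_{L-1}]$.
Now the operator $A$ can be written as
$$A(\sigma) = L_A\,(I(\sigma) - I_m),$$
where $L_A$ arises from the factorization of the inverse noise covariance matrix (precision matrix)
$W = L_A^* L_A$ [14], $I(\sigma) = (I^1(\sigma), \ldots, I^L(\sigma)) \in \mathbb{R}^{L^2}$ is a vector containing currents from all
excitations, and $I_m$ is the measurement vector corresponding to $I$. For the linearisation, specifically the
components used in (3.4), we have $K_1^k = L_A\nabla I(\sigma^k)^*$ and $b^k = L_A(I(\sigma^k) - I_m - \nabla_\sigma I(\sigma^k)^*\sigma^k)$.
Finally, we also discretise the conductivity, setting $\sigma = \sum_{i=1}^{N}\sigma_i\varphi_i$, where $\varphi_i$ are linear basis
functions. Note that $U^p$ is constant with respect to the factors $\sigma_i$, thus the partial derivatives $\frac{\partial I^p}{\partial\sigma_i}$
can be solved from
$$0 = \frac{\partial U^p}{\partial\sigma_i} = \frac{\partial (D\theta^p)}{\partial\sigma_i} = \frac{\partial D}{\partial\sigma_i}\theta^p + D\frac{\partial\theta^p}{\partial\sigma_i} \iff \frac{\partial I^p}{\partial\sigma_i} = \frac{\partial (K\theta^p)}{\partial\sigma_i} = -KD^{-1}\frac{\partial D}{\partial\sigma_i}\theta^p.$$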
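The derivative formula $\partial I^p/\partial\sigma_i = -KD^{-1}(\partial D/\partial\sigma_i)\theta^p$ can be checked on a toy system. The Python sketch below does not assemble any finite element matrices: it assumes a hypothetical affine dependence $D(\sigma) = D_0 + \sum_i \sigma_i E_i$ with small made-up matrices `D0`, `E`, `K` and right-hand side `U`, and compares the analytic Jacobian against central finite differences.

```python
import numpy as np

# Hypothetical small system standing in for the FE matrices.
D0 = 4.0 * np.eye(3)
E = [np.diag([1.0, 0.0, 0.0]),
     np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])]  # dD/dsigma_i
U = np.array([1.0, 2.0, 3.0])   # right-hand side, independent of sigma
K = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])  # extracts the "currents" from theta

def assemble_D(sigma):
    return D0 + sum(s * Ei for s, Ei in zip(sigma, E))

def currents(sigma):
    # I(sigma) = K D(sigma)^{-1} U
    return K @ np.linalg.solve(assemble_D(sigma), U)

def currents_jacobian(sigma):
    D = assemble_D(sigma)
    theta = np.linalg.solve(D, U)
    # Column i is -K D^{-1} (dD/dsigma_i) theta.
    return np.column_stack([-K @ np.linalg.solve(D, Ei @ theta) for Ei in E])

sigma = np.array([1.0, 2.0])
jac = currents_jacobian(sigma)

# Central finite differences for comparison.
h = 1e-6
jac_fd = np.column_stack([
    (currents(sigma + h * e) - currents(sigma - h * e)) / (2.0 * h)
    for e in np.eye(2)
])
```

In a real computation one would factorise $D$ once and reuse the factorisation for all right-hand sides instead of calling `np.linalg.solve` repeatedly.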
4.2 regularization and constraints
Next we introduce three different regularization schemes for EIT. We utilize these schemes
in Section 5. The first scheme consists of smoothness-promoting L2-regularization and a
barrier function to approximate the positivity constraint. We use this scheme to compare
RIPGN against Newton’s method. The other two schemes consist of total variation (TV)
with a positivity constraint, and smoothed TV with the barrier function. The latter is used to
compare RIPGN against Newton’s method in the TV-regularized setting, and the smooth models
against nonsmooth models. For a detailed description of how to compute the required proximal
mappings for Algorithm 3.1 see http://proximity-operator.net and [5]. Additional mappings are
listed in Appendix D.
4.2.1 smoothness-promoting regularization with a barrier
We take the first regulariser
$$F_\Gamma(\vec\sigma) := \|R_\Gamma(\vec\sigma - \vec\sigma_m)\|^2,$$
where $\vec\sigma_m$ is the expected value of $\vec\sigma$, and $\vec\sigma = (\sigma_1, \ldots, \sigma_N)$ is the vector of FE factors of $\sigma$. The
matrix $R_\Gamma$ is defined by the inverse factorization $(R_\Gamma^* R_\Gamma)^{-1} = \Gamma$ of a Gaussian kernel $\Gamma_{i,j} = a\,e^{-\|\chi_i - \chi_j\|^2/(2b)}$
[29]. Furthermore, we introduce a piecewise polynomial barrier function
$$B_{\min}(\sigma) := \frac{1}{2}\|L_{\min}(\sigma)(\vec\sigma - \sigma_{\min})\|^2, \quad\text{with}\quad [L_{\min}]_{ij}(\sigma) := \begin{cases} l_{\min}, & \text{where } i = j \text{ and } \sigma_i < \sigma_{\min}, \\ 0, & \text{otherwise,} \end{cases}$$
where $l_{\min}$ is a coefficient that determines the strength of the barrier function. Now the convex
component in (1.1) is $F(\sigma) = F_\Gamma(R_\Gamma\sigma) + B_{\min}(\sigma)$. As $B_{\min}$ is diagonal, in the subproblems, it
is computationally more efficient to include it into $G^k$. Thus, for formulating the two-block
PDPS for the subproblems as in Section 3.3, we take $F_2(y) = F_\Gamma(y)$, $K_2\sigma = R_\Gamma\vec\sigma$, and
$G^k(\sigma) = B_{\min}(\sigma) + \delta_V(\sigma) + \frac{\beta}{2}\|\sigma - \sigma^k\|^2$.
4.2.2 tv regularization and nonsmooth constraints
In the second scheme we apply nonsmooth total variation regularization with positivity con-
straints. Since σ is continuous by its finite element construction, its isotropic total variation
(TV) [38] can be written as
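\[ \mathrm{TV}(\sigma) = \int_\Omega |\nabla\sigma(\chi)| \, dV, \]
where $|x| = \sqrt{x_1^2 + x_2^2 + x_3^2}$ is the Euclidean spatial norm. In a linear basis, the spatial gradient of $\sigma$ is constant within an element, meaning $\frac{\partial\sigma(\chi)}{\partial\chi_1} = \left(\frac{\partial\sigma}{\partial\chi_1}\right)_i$ if $\chi$ belongs to element $i$, and the integration yields
\[ \mathrm{TV}(\sigma) = \sum_{i=1}^{N_E} V_i \sqrt{\left(\frac{\partial\sigma}{\partial\chi_1}\right)_i^2 + \left(\frac{\partial\sigma}{\partial\chi_2}\right)_i^2 + \left(\frac{\partial\sigma}{\partial\chi_3}\right)_i^2}, \]
where $V_i$ is the volume of the $i$'th element and $N_E$ is the number of elements in the FE basis. This can be expressed as
\[ \mathrm{TV}(\sigma) = \sum_{i=1}^{N_E} \sqrt{(R_1\vec{\sigma})_i^2 + (R_2\vec{\sigma})_i^2 + (R_3\vec{\sigma})_i^2} =: \|R_\nabla \sigma\|_{2,1}, \]
where $R_\nabla \sigma := [(R_1\vec{\sigma})^T \; (R_2\vec{\sigma})^T \; (R_3\vec{\sigma})^T]^T$ and the components $(i, j)$ of $R_l \in \mathbb{R}^{N_E \times N}$ for $l = 1, 2, 3$ are computed from the basis functions $\varphi_j$ as
\[ [R_l]_{ij} = \begin{cases} V_i \frac{\partial\varphi_j}{\partial\chi_l}, & \text{when } \varphi_j \text{ is non-zero in element } i, \\ 0, & \text{otherwise}. \end{cases} \]
For formulating the two-block PDPS for the subproblems as in Section 3.3, we now take $F_2(y) = \alpha\|y\|_{2,1}$, $K_2 = R_\nabla$, and $G_k(\sigma) = \delta_V(\sigma) + \frac{\beta}{2}\|\sigma - \sigma^k\|^2$.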
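To make the $\|\cdot\|_{2,1}$ expression concrete, here is a minimal sketch that evaluates the discrete TV from given difference matrices. It assumes the matrices $R_l$ (already weighted by the element volumes) are available as dense lists of rows; the helper names and the one-element toy matrices are hypothetical.

```python
import math


def matvec(R, x):
    """Dense matrix-vector product; R is a list of rows."""
    return [sum(r_ij * x_j for r_ij, x_j in zip(row, x)) for row in R]


def tv_21(R_list, sigma):
    """Discrete isotropic TV: sum over elements i of sqrt(sum_l (R_l sigma)_i^2)."""
    comps = [matvec(R, sigma) for R in R_list]
    n_elements = len(comps[0])
    return sum(math.sqrt(sum(c[i] ** 2 for c in comps)) for i in range(n_elements))


# Toy case with one element and two nodes (hypothetical numbers).
R1 = [[-3.0, 3.0]]
R2 = [[-4.0, 4.0]]
tv_value = tv_21([R1, R2], [0.0, 1.0])
```

In the two-dimensional cases the same function applies with only `R1` and `R2` passed in.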
In some examples of Section 5, we use TV regularization on two-dimensional domains. In
those cases, the volume Vi of the element i is replaced by the element surface area and the
spatial difference operators, R1 and R2, are computed from the two-dimensional basis functions.
Operator R3 is dropped.
4.2.3 smoothed tv regularization and barrier function
As the last regularization scheme, we introduce a smoothed version of TV and semismooth
barrier functions. The smoothed TV can be written as
\[ \widetilde{\mathrm{TV}}(\sigma) = \|f(\sigma)\|_1 \quad\text{with}\quad [f(\sigma)]_i = \sqrt{(R_1\vec{\sigma})_i^2 + (R_2\vec{\sigma})_i^2 + (R_3\vec{\sigma})_i^2 + \gamma}. \]
Here, $\gamma$ is a smoothing parameter that we set to $\gamma = 10^{-7}$. We also introduce a maximum barrier $B_{\max}(\sigma)$, by an obvious modification of the minimum barrier $B_{\min}(\sigma)$ described above. Now the component $F$ in (1.1) is $F(\sigma) = \alpha\widetilde{\mathrm{TV}}(\sigma) + B_{\min}(\sigma) + B_{\max}(\sigma)$, and for the subproblems we have $F_2(y) = \alpha\|y\|_1$, $K_2(\sigma) = f(\sigma)$, and $G_k(\sigma) = B_{\min}(\sigma) + B_{\max}(\sigma) + \delta_V(\sigma) + \frac{\beta}{2}\|\sigma - \sigma^k\|^2$. Note that with this notation, the operator $K$ in the subproblem (2.2) is nonlinear. Hence we solve it using a variant of Algorithm 3.1 for nonlinear $K$ from [44, 30].
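The effect of the smoothing parameter can be seen in a small sketch: with $\gamma > 0$ every element contributes at least $\sqrt{\gamma}$, so the square root is differentiable even where the gradient vanishes. The code below assumes dense difference matrices given as lists of rows; the function name and the toy matrix are illustrative only.

```python
import math


def smoothed_tv(R_list, sigma, gamma=1e-7):
    """Smoothed TV: sum over elements i of sqrt(sum_l (R_l sigma)_i^2 + gamma)."""
    comps = [[sum(a * b for a, b in zip(row, sigma)) for row in R] for R in R_list]
    n_elements = len(comps[0])
    return sum(math.sqrt(sum(c[i] ** 2 for c in comps) + gamma) for i in range(n_elements))


# Toy 1D difference matrix on three nodes and two elements (hypothetical).
R1 = [[-1.0, 1.0, 0.0], [0.0, -1.0, 1.0]]
flat = smoothed_tv([R1], [2.0, 2.0, 2.0])    # constant conductivity
jump = smoothed_tv([R1], [0.0, 0.0, 5.0])    # a single jump of height 5
```

For the constant vector the value is `2 * sqrt(gamma)` rather than zero, and for the jump it stays close to the unsmoothed value.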
5 numerical and experimental studies
We evaluate the proposed relaxed inexact proximal Gauss–Newton (RIPGN) method numerically
in EIT image reconstruction. In the first set of numerical studies, Cases 1–3 (Section 5.2), we
compare RIPGN against Newton’s method and NL-PDPS in a circular 2D geometry. In Cases 4–5,
Section 5.3, we evaluate the performance of RIPGN with experimental data from an EIT-based
surface sensing system, or sensing skin. Finally, in Case 6 (Section 5.4), we demonstrate viability
of RIPGN to three-dimensional EIT reconstruction with numerical simulations. We include
further experiments in the Supplementary Material.
5.1 computational aspects
In the numerical studies, we evaluate the convergence of RIPGN (Algorithm 3.2) with multiple relaxation parameters w and use static values for the parameters δ, t, and β. We set δ to an arbitrary small value δ = 0.01 to satisfy (3.2), choose $t = 10^{-6}$ by evaluating the convergence of the first subproblem of Case 3 with multiple step parameters (see Section 5.2.5), and set β to a small value $\beta = 10^{-10}$; in our experience, β has a similar impact on the convergence of Algorithm 3.2 as the relaxation parameter w. Every linearised subproblem is run for 6000 iterations.
Table 1: Parameters used in each test case. σ^ref is the value of the homogeneous estimate computed from reference measurements.
Case(s)  Setup          Data       Regularisation  LA(i, i)  α    l_min             l_max              σ_min  σ_max  V_min  V_max
1        2D Water tank  Synthetic  Smooth          5 · 10^4  –    10^2 √(2J(σ^1))   –                  10^-4  –      10^-8  10^12
2        2D Water tank  Synthetic  Smoothed TV     5 · 10^4  10   10^2 √(2J(σ^1))   10^2 √(2J(σ^1))    10^-4  10^10  10^-8  10^12
3        2D Water tank  Synthetic  TV              5 · 10^4  10   –                 –                  –      –      10^-4  10^12
4        Sensing skin   Measured   Smoothed TV     100       1/4  5 √(2J(σ^1))      10^-1 √(2J(σ^1))   10^-4  σ^ref  10^-8  10^12
5        Sensing skin   Measured   TV              100       1/4  –                 –                  –      –      10^-4  σ^ref
6        3D Water tank  Synthetic  TV              5 · 10^4  10   –                 –                  –      –      10^-4  10^12
We start the RIPGN, Newton, and NL-PDPS iterations from a homogeneous estimate $\sigma^1$. Furthermore, we introduce minimum and maximum constraints, $V_{\min}$ and $V_{\max}$, by defining the domain $V$ as a hypercube $V = \{\sigma \in \mathbb{R}^N : V_{\min} \le \sigma_i \le V_{\max},\ i = 1, 2, \ldots, N\}$. Table 1 shows the parameters that vary between the cases. Note that in this section, we denote the first index as k = 1 instead of k = 0.
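Because $V$ is a hypercube, the Euclidean projection onto it is a componentwise clip. The sketch below shows this standard fact only; it is not claimed to be how the paper's solver treats the indicator $\delta_V$, and the function name is hypothetical.

```python
def project_box(sigma, v_min, v_max):
    """Euclidean projection onto {sigma : v_min <= sigma_i <= v_max for all i}."""
    return [min(max(s, v_min), v_max) for s in sigma]


# Limits in the style of Table 1; the input vector is made up.
projected = project_box([-1.0, 0.5, 2e12], v_min=1e-8, v_max=1e12)
```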
In synthetic tests, Cases 1–3 and 6, we compute the relative error of the estimated conductivity $\sigma$ with respect to the true conductivity $\sigma_{\mathrm{true}}$ as $\mathrm{RE} = \|\sigma - \sigma_{\mathrm{true}}\| / \|\sigma_{\mathrm{true}}\| \cdot 100\%$. Note, however, that due to the simulated measurement noise and the modeling errors caused by the differing mesh sparsities, the true conductivity is often quite far from the actual minimum of the objective function. To highlight this, we compute the objective function at the true conductivity by evaluating the true conductivity at the nodes of the mesh we use in the forward solution. We also compute the relative error of this interpolation, to assess how well the original conductivity could be represented in the forward solution mesh.
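As a small illustration of the relative error formula, the sketch below assumes that the norm is the Euclidean norm of the nodal values and that both conductivities are given on the same mesh; the function name and the numbers are hypothetical.

```python
import math


def relative_error_percent(sigma, sigma_true):
    """RE = ||sigma - sigma_true|| / ||sigma_true|| * 100 %, using Euclidean norms."""
    num = math.sqrt(sum((s - t) ** 2 for s, t in zip(sigma, sigma_true)))
    den = math.sqrt(sum(t ** 2 for t in sigma_true))
    return 100.0 * num / den


# Made-up nodal values, for illustration only.
re_value = relative_error_percent([3.0, 4.5], [3.0, 4.0])
```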
We perform all computations in MATLAB 2017b with dual Intel Xeon E5649 @ 2.53/2.93 GHz CPUs and with 99 GB RAM (1333 MHz ECC DDR3). We implement crucial components of the construction of the matrix D and the Jacobian ∇A in C++. We compute the forward solution (4.2), the expression $KD^{-1}$ for the Jacobian, and the linear system for Newton’s method through LU decomposition using UMFPACK [13]. In Case 6, we compute the forward solution using BiCGSTAB.
To catch the stagnation of the RIPGN and Newton’s method, we initially stop the iteration if an iterate $\sigma^k$ decreases the value of the objective function by less than 0.5, i.e., if $J(\sigma^{k-1}) - J(\sigma^k) < 0.5$. However, in order to ensure that the iteration does not end prematurely, we compute two additional iterates to check if one of those decreases the objective function by at least 0.5. If one does, we continue the iteration normally, and if not, we discard these two iterates and take the initial stopping iterate as the estimated solution. We employ this stagnation check after the eighth iteration to ensure that at least 10 iterations are computed. For NL-PDPS, we extend these conditions to 700 and 300, respectively. We note that, as in many previous EIT studies [48], line search is used in Newton’s method, as the method did not converge within reasonable time with a constant step parameter.
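The stagnation check described above can be sketched as a function over a recorded sequence of objective values. This is one possible reading, not the authors' code: it assumes that each decrease is measured against the immediately preceding iterate, that list positions stand in for iteration numbers, and that the names `tol`, `first_check` and `lookahead` are hypothetical parameters.

```python
def stopping_index(J, tol=0.5, first_check=8, lookahead=2):
    """Return the position of the iterate accepted as the solution.

    J[k] is the objective value at iterate k. Starting at `first_check`, an
    iterate is a stopping candidate if it decreases J by less than `tol`.
    The candidate is accepted only if none of the next `lookahead` iterates
    decreases J by at least `tol`; otherwise the iteration continues.
    """
    k = first_check
    while k < len(J):
        if J[k - 1] - J[k] >= tol:
            k += 1  # sufficient decrease: keep iterating
            continue
        ahead = range(k + 1, min(k + 1 + lookahead, len(J)))
        if any(J[j - 1] - J[j] >= tol for j in ahead):
            k += 1  # a later iterate still makes progress: continue normally
            continue
        return k    # discard the look-ahead iterates and stop here
    return len(J) - 1  # the record ended before stagnation was detected


# Made-up objective histories for illustration.
stalls = [100, 90, 80, 70, 60, 50, 40, 30, 20, 19.9, 19.8, 19.7, 19.6]
recovers = [100, 90, 80, 70, 60, 50, 40, 30, 20, 19.9, 19.8, 10, 9.9, 9.8, 9.7]
k_stalls = stopping_index(stalls)
k_recovers = stopping_index(recovers)
```

In the second history the large decrease at position 11 lies within the look-ahead window of the first candidate, so the iteration continues and stops only after the later stall.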
5.2 numerical 2d eit studies
In Cases 1–3, the geometry of the domain Ω resembles a shallow water tank. The diameter of the tank is 24 cm and the height is 7 cm. Furthermore, the tank has sixteen evenly placed electrodes on the surface; the width and height of the electrodes are 2.5 cm and 7 cm, respectively.
The conductivity inside the tank is constant along the vertical axis, and hence, although the EIT forward model is three-dimensional, the conductivity is two-dimensionally distributed. In the forward model, we map the 2D conductivity to 3D by linear interpolation.
When simulating the measurement data, we represent the electrical conductivity in a piecewise linear basis using a tetrahedral mesh consisting of 84052 nodes and we approximate the electric potential in a second order polynomial basis consisting of 629513 nodes. In the reconstruction, we approximate the 2D conductivity in a piecewise linear basis with a triangular 2D mesh of 1117 nodes; for the forward solver, we map this 2D distribution to a piecewise linear 3D distribution (tetrahedral mesh consisting of 8189 nodes). Furthermore, we approximate the electric potential with second order polynomial basis functions in a mesh with 56986 nodes.
To simulate actual measurements more realistically, we add Gaussian distributed noise, with a standard deviation of $0.005\,|I_i|$, to each simulated measurement $I_i$.
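The noise model amounts to one independent zero-mean Gaussian draw per measurement, scaled by the magnitude of that measurement. A minimal sketch, assuming the measurements are real numbers in a list; the function name, the seed argument and the sample values are hypothetical.

```python
import random


def add_measurement_noise(currents, rel_std=0.005, seed=None):
    """Add zero-mean Gaussian noise with standard deviation rel_std * |I_i| to each I_i."""
    rng = random.Random(seed)
    return [I_i + rng.gauss(0.0, rel_std * abs(I_i)) for I_i in currents]


# Made-up noiseless measurements, for illustration only.
clean = [1.0, -2.0, 0.0]
noisy = add_measurement_noise(clean, seed=0)
```

A measurement that is exactly zero stays zero under this model, since its noise standard deviation is zero.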
5.2.1 case 1: smoothness-promoting regularization & newton’s method
We first evaluate the RIPGN against Newton’s method on a smooth optimization problem. We
use the smoothness promoting regularization (Scheme 1; Section 4.2.1). Furthermore, to match
the regularization, the true conductivity is also smooth (Figure 2, left): We generate the true
conductivity by drawing a sample from a multivariate Gaussian distribution expressing spatial
smoothness. This distribution is of the form described in Section 4.2.1, and its expectation as well as the parameters of the covariance matrix are chosen to be the same as in the model used in regularization. We note, however, that since the FE mesh used in inversion is sparser than that in the data simulation, the true conductivity is not a realization from a model that corresponds to the regularizing function.
Figure 1 shows the value of the objective function as a function of iteration number k and computational time t for the RIPGN method corresponding to five relaxation parameters w and for Newton’s method. Table 2 lists the number of iterations required for convergence, the value of the objective function at the last iterate, the computational time, and the relative error corresponding to each of these estimates. Figure 2 illustrates the reconstructed images.
Figure 1 and Table 2 show that in Case 1, Newton’s method and RIPGN with w ≤ 3/4 converge. The reconstructions have small relative errors, as shown by Table 2. Smaller relaxation parameters result in an increased number of iterations, which in turn increases the computational times, as expected. RIPGN with w = 3/4 converges in around 7 minutes, while Newton’s method converges in about the same number of iterations, but the computation of each iterate takes considerably longer, so that it takes around 37 minutes to converge. Hence, although subproblems are solved exactly in Newton’s method, we need about the same number of iterations for convergence as with RIPGN, which solves subproblems inexactly. The longer computational times with Newton’s method are mostly due to the line search.
Figure 2 shows that the reconstructions from the converging iterations are visually very close to
[Figure 1 plots: objective value J against iteration number k (left) and against computational time t in seconds (right), logarithmic axes; curves for RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1 and for Newton.]
Figure 1: Case 1. Value of the objective function J as function of iteration number k (left), and
computational time t (right) for the RIPGN and Newton’s method.
Panels: (a) True. (b) RIPGN w = 1/4. (c) RIPGN w = 3/4. (d) RIPGN w = 1. (e) Newton. Colorbar range: 0.01 to 0.04 S/m.
Figure 2: Case 1. True conductivity (a), RIPGN-based reconstructions with relaxation parameters w = 1/4 (b), w = 3/4 (c), and w = 1 (d), and the Newton-based reconstruction (e).
Table 2: Case 1. The number of iterations required for convergence, value of the objective function at the last iterate, computational time, and relative error of the estimate for the RIPGN and Newton’s method.
Algorithm        Iterations (K)  J(σ)           Time (s)  RE (%)
RIPGN w = 1/4    34              84.069         1197.4    2.1864
RIPGN w = 1/2    19              84.059         642.58    2.1895
RIPGN w = 3/4    12              84.512         398.23    2.1975
RIPGN w = 9/10   8               1.1179 · 10^8  254.1     73.431
RIPGN w = 1      8               4.5842 · 10^8  255.97    103.68
Newton           14              86.7           2236.2    2.1969
the true conductivity. With relaxation parameters w = 9/10 and w = 1, the RIPGN reconstructions diverge. Convergence, indeed, cannot be expected for relaxation parameters w ≈ 1 due to the bound (2.3) in Theorem 2.1.
As mentioned in Section 5.1, we also evaluate the objective function at the true conductivity. This gives $J(\sigma_{\mathrm{true}}) = 1.2057 \cdot 10^5$ and a 0.9294% relative error, meaning that although the true conductivity can be represented quite accurately in the forward solution mesh, the best representation is very likely far from the actual minimum of the objective function.
[Figure 3 plots: objective value J against iteration number k (left) and against computational time t in seconds (right), logarithmic axes; curves for RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1 and for Newton.]
Figure 3: Case 2. Value of the objective function J as function of iteration number k (left), and
computational time t (right) for the RIPGN and Newton’s method.
Panels: (a) True. (b) RIPGN w = 1/4. (c) RIPGN w = 3/4. (d) RIPGN w = 1. (e) Newton. Colorbar range: 0.005 to 0.025 S/m.
Figure 4: Case 2. True conductivity (a), RIPGN-based reconstructions with relaxation parameters w = 1/4 (b), w = 3/4 (c), and w = 1 (d), and the Newton-based reconstruction (e).
5.2.2 case 2: smoothed tv regularization & comparison with newton’s method
Because standard Newton’s method cannot be used on non-smooth problems (such as those induced by regularization Scheme 2, Section 4.2.2), in Case 2, we compare RIPGN to Newton’s method in Scheme 3 (Section 4.2.3), a smoothed version of Scheme 2. In Case 2, the true target contains a circular inclusion of low conductivity ($10^{-3}$ S/m) on a constant background with conductivity of 0.028 S/m.
Figure 3 and Table 3 show that in Case 2, Newton’s method takes around 44 minutes to converge while RIPGN with relaxation parameter w = 3/4 or w = 9/10 takes around 5–6 minutes. RIPGN diverges again with relaxation parameter w = 1. The relative errors in Case 2 are larger than in Case 1. This is expected, as the conductivity in Case 1 was a draw from a distribution with statistical properties that corresponded to the regularization that was used. These errors are further increased as the smooth shapes in Case 1 tend to be more accurately representable with linear interpolation than the sharp-edged inclusion in Case 2. The reconstructed images (Figure 4) are, however, fairly accurate. Evaluating the objective function at the true conductivity gives $J(\sigma_{\mathrm{true}}) = 8.6491 \cdot 10^4$ with 4.6023% relative error.
Table 3: Case 2. The number of iterations required for convergence, value of the objective function at the last iterate, computational time, and relative error of the estimate for the RIPGN and Newton’s method.
Algorithm        Iterations (K)  J(σ)          Time (s)  RE (%)
RIPGN w = 1/4    31              135.35        1033.7    6.5056
RIPGN w = 1/2    15              135.61        487.21    6.5088
RIPGN w = 3/4    10              135.51        315.76    6.5164
RIPGN w = 9/10   10              135.63        319.6     6.5162
RIPGN w = 1      8               5.386 · 10^8  275.66    151.06
Newton           13              135.77        2622.4    7.1371
[Figure 5 plots: objective value J against iteration number k (left) and against computational time t in seconds (right), logarithmic axes; curves for RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1 and for NL-PDPS.]
Figure 5: Case 3. Value of the objective function J as function of iteration number k (left), and
computational time t (right) for the RIPGN and NL-PDPS.
5.2.3 case 3: tv regularization & comparison with nl-pdps
In Case 3, we compare RIPGN with NL-PDPS [44]. We use the nonsmooth regularization (Scheme 2; Section 4.2.2). The target conductivity in Case 3 is the same as in Case 2.
Figure 6 shows no visual differences between the reconstructions computed with RIPGN (w < 1) and the reconstruction computed with NL-PDPS. However, Figure 5 and Table 4 show that NL-PDPS takes over a week and a half to solve the problem with the desired accuracy, while RIPGN (with w = 3/4 or w = 9/10) takes less than 6 minutes. It should be noted though that the total number of iterations, including the 6000 in each RIPGN linearisation, is considerably smaller with NL-PDPS. This is consistent with earlier studies [44, 12].
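The comparison above can be checked with a few lines of arithmetic on the values reported in Table 4 and the 6000 inner iterations per linearisation stated in Section 5.1. The sketch uses the w = 3/4 row as the RIPGN representative; the variable names are illustrative.

```python
INNER_ITERATIONS = 6000       # PDPS iterations per linearised subproblem (Section 5.1)
ripgn_outer_iterations = 11   # Table 4, RIPGN w = 3/4
ripgn_time_s = 313.4          # Table 4, RIPGN w = 3/4
nlpdps_iterations = 24221     # Table 4, NL-PDPS
nlpdps_time_s = 9.9253e5      # Table 4, NL-PDPS

ripgn_total_inner = ripgn_outer_iterations * INNER_ITERATIONS
iteration_ratio = ripgn_total_inner / nlpdps_iterations
nlpdps_days = nlpdps_time_s / 86400.0
speedup = nlpdps_time_s / ripgn_time_s

print(f"RIPGN inner iterations in total: {ripgn_total_inner}")
print(f"RIPGN / NL-PDPS iteration ratio: {iteration_ratio:.2f}")
print(f"NL-PDPS wall-clock time in days: {nlpdps_days:.1f}")
print(f"wall-clock speed-up of RIPGN:    {speedup:.0f}x")
```

So RIPGN performs more inner iterations in total, yet finishes in a small fraction of the NL-PDPS wall-clock time, presumably because each inner iteration on the linearised problem is far cheaper.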
Finally, Figure 6 and Table 4 show that the unsmoothed total variation slightly improves the
reconstruction quality and the relative error from Case 2 (cf. Figure 4 and Table 3).
Panels: (a) True. (b) RIPGN w = 1/4. (c) RIPGN w = 3/4. (d) RIPGN w = 1. (e) NL-PDPS. Colorbar range: 0.005 to 0.025 S/m.
Figure 6: Case 3. True conductivity (a), RIPGN-based reconstructions with relaxation parameters w = 1/4 (b), w = 3/4 (c), and w = 1 (d), and the NL-PDPS-based reconstruction (e).
Table 4: Case 3. The number of iterations required for convergence, value of the objective function at the last iterate, computational time, and relative error of the estimate for the RIPGN and the NL-PDPS.
Algorithm        Iterations (K)  J(σ)           Time (s)       RE (%)
RIPGN w = 1/4    31              129.48         935.31         5.8401
RIPGN w = 1/2    16              129.97         473.62         5.8436
RIPGN w = 3/4    11              129.9          313.4          5.8466
RIPGN w = 9/10   11              130.24         314.73         5.8533
RIPGN w = 1      8               7.1417 · 10^8  225.87         334.68
NL-PDPS          24221           130.56         9.9253 · 10^5  5.9157
5.2.4 effects of the smoothed tv
Next we compare the solutions of the smoothed TV scheme to those of the (nonsmooth) TV scheme. Although the differences between the reconstructions in Figure 4 and Figure 6 appear small, closer inspection reveals these to be fundamental. Figure 7 shows the true conductivity and three profiles of the true conductivity that are taken along the dashed line. Figure 7 also shows profiles from the solutions computed using Newton’s method, RIPGN with smoothed TV, and RIPGN with TV.
The profiles in Figure 7 illustrate that the solution corresponding to smoothed TV is spatially smoother than that corresponding to non-smoothed TV: the former fails to track the sharp edges in the conductivity. We recall that all solutions are actually piecewise linear due to the choice of basis functions.
5.2.5 subproblem parameter selection and balancing
In Cases 1–3, we used the step parameter $t = 10^{-6}$ in the linear solver. We chose this step parameter by evaluating the rate of convergence of the first subproblem in Case 3 with multiple step parameters t, and then selecting the one that converges fastest. Figure 8 (left) shows the value of the objective function at the approximate solution, $J_1(x^1)$, after 6000 iterations. Furthermore, to illustrate the differences between the balanced and the non-balanced method, the figure shows the value of $J_1(x^1)$ when the problem is solved without balancing, i.e., with $s_1 = s_2 = (tL^2)^{-1}$.
On the right in Figure 8, solid lines indicate the value of $J_k(x^k)$ when the problem is solved
[Figure 7 plots: three conductivity profiles with the vertical axis in S/m; legend: True, Newton, RGN (smoothed), RGN.]
Figure 7: The differences between the smoothed and unsmoothed total variation are distinguishable on closer inspection. The conductivity profile is highlighted with a dashed blue line. The same profile is also taken from the smoothed Newton and RIPGN reconstructions and from the nonsmooth RIPGN reconstruction.
Figure 8: Left: Value of the objective function in the first linearised problem at the minimum point estimate x as a function of step parameter t. The step parameters $t = 10^{-6}$ and $t = 10^{-7}$ are highlighted in red. Right: Value of the objective function of linearised problem k at $x^k$ with $t = 10^{-6}$ for the balanced algorithm and $t = 10^{-7}$ for the non-balanced algorithm. Areas around the curves highlight the minimum value with any t. The dashed line represents the operator norm of ∇A.
using both the balanced and the non-balanced methods with step parameters $t = 10^{-6}$ and $t = 10^{-7}$, respectively. Areas below the curves show the minimum with any of the tested parameters. For this experiment, the outer iteration is advanced with relaxation parameter w = 3/4 using solutions from the balanced method with $t = 10^{-6}$. For reference, the operator norm of ∇A is also shown in the figure.
Figure 8 shows that although both methods converge almost equally in the first subproblem,
the balanced method outperforms normal PDPS in the subsequent problems. Furthermore,
Figure 8 shows that unlike with the non-balanced PDPS, in the balanced PDPS, the optimal step
parameter remains almost unchanged at every linearisation.
5.3 experimental studies
The measurement device we use in the experimental studies is manufactured by Rocsole Ltd.
(www.rocsole.com). This device utilizes a typical ECT measurement principle in which each
electrode is sequentially set to a known sinusoidal potential, while the others remain grounded.
The currents induced by the potential differences are then sampled, in this case with 1 MHz
sampling frequency, and the amplitude of the induced current is computed from the samples using
discrete Fourier transform. Here the excitation frequency is set to 39 kHz and measurements
used in the reconstruction are time averages of the computed amplitudes over one minute time
period.
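The amplitude extraction described above can be sketched with a single DFT term evaluated at the excitation frequency. The sampling and excitation frequencies below come from the text; the record length, the synthetic signal and the function name are assumptions made for illustration, and the time averaging over one minute is not reproduced.

```python
import cmath
import math


def tone_amplitude(samples, f_exc, f_s):
    """Amplitude of the sinusoid at frequency f_exc in real samples taken at rate f_s.

    Evaluates one DFT term at f_exc. The result is exact up to rounding when
    the record spans a whole number of excitation periods.
    """
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * f_exc * i / f_s) for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n


F_S = 1.0e6     # sampling frequency from the text (1 MHz)
F_EXC = 39.0e3  # excitation frequency from the text (39 kHz)
N = 1000        # hypothetical record length: 1 ms, i.e. exactly 39 periods

# Synthetic current with a made-up amplitude and phase, for illustration only.
samples = [0.7 * math.sin(2.0 * math.pi * F_EXC * i / F_S + 0.3) for i in range(N)]
amplitude = tone_amplitude(samples, F_EXC, F_S)
```

Choosing a record that covers a whole number of periods avoids spectral leakage, which is why the sketch uses 1000 samples.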
5.3.1 cases 4-5: sensing skin & crack detection
In Case 4, we test RIPGN in a crack detection problem arising from EIT-based sensing skins (see
[18]). Computationally this crack detection problem differs from the inclusion detection in a
typical water tank geometry, because cracks cause sharp but spatially narrow inclusions of low
conductivity on the background conductivity of the paint layer. Furthermore, the conductive
paint is far from being homogeneous in thickness and consequently, the background conductivity
is inhomogeneous. To take into account this inhomogeneity we follow an approximative data
correction approach proposed in [18]. In addition, we exploit the fact that the cracks never
increase the conductivity, allowing us to constrain the conductivity from above.
The sensing skin used in the study is painted with Kontakt Chemie EMI 35 conductive graphite
paint onto a rectangular plexiglass. The side lengths of the plexiglass are 44 cm and 42 cm and
each side has seven 2.5 cm × 1.25 cm electrodes. Furthermore, four 2.5 cm × 2.5 cm electrodes
are placed in the middle of the sensing skin.
From the sensing skin measurements, we compute a smoothed TV solution with Newton’s
method and RIPGN (Case 4), and a nonsmooth TV solution with RIPGN (Case 5). The triangular
mesh used in the computations has 3147 nodes for the conductivity represented in a piecewise
linear basis and 12281 nodes for the electric potential in second order basis. Parameters used in
these cases are shown in Table 1.
Figure 10 (left) shows a photograph of the sensing skin in Case 4. The crack in the photograph
is highlighted in red as the crack is very narrow.
Figure 9 shows that for every relaxation parameter RIPGN converges considerably better than with w = 1 in Cases 1–3. However, the value of the objective function oscillates slightly over the last few iterations when w > 1/2. The better convergence with relaxation parameter w ≤ 1/2 is
also confirmed by Table 5. The objective function with Newton’s method converges to similar
values as RIPGN with the larger step parameters. Note that in this case, the iteration time with
Newton’s method is considerably shorter than in Cases 1–3 due to the two-dimensional forward
model. Furthermore, Figure 10 shows that the reconstructed images capture the shape and
length of the crack rather well. In this example, the effect of relaxation parameter to the quality
of RIPGN-based reconstruction is very small, and even the difference between the RIPGN-
[Figure 9 plots: objective value J against iteration number k (left) and against computational time t in seconds (right), logarithmic axes; curves for RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1 and for Newton.]
Figure 9: Case 4. Value of the objective function J as function of iteration number k (left), and
computational time t (right) for the RIPGN and Newton’s method.
Panels: (a) True. (b) RIPGN w = 1/4. (c) RIPGN w = 3/4. (d) RIPGN w = 1. (e) Newton. Colorbar range: 1 to 6 S.
Figure 10: Case 4. Photo of the sensing skin (crack highlighted) (a), RIPGN-based reconstructions with relaxation parameters w = 1/4 (b), w = 3/4 (c), and w = 1 (d), and the Newton-based reconstruction (e).
Table 5: Case 4. The number of iterations required for convergence, value of the objective function at the last iterate, and computational time for the RIPGN and Newton’s method.
Algorithm        Iterations (K)  J(σ)   Time (s)
RIPGN w = 1/4    43              29478  1791.2
RIPGN w = 1/2    37              29504  1538.9
RIPGN w = 3/4    23              29681  950.08
RIPGN w = 9/10   23              29858  954.99
RIPGN w = 1      17              29679  676.2
Newton           112             29949  1642.3
[Figure 11 plots: objective value J against iteration number k (left) and against computational time t in seconds (right), logarithmic axes; curves for RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1.]
Figure 11: Case 5. Value of the objective function J as function of iteration number k (left), and
computational time t (right) for the RIPGN.
Panels: (a) True. (b) RIPGN w = 1/4. (c) RIPGN w = 3/4. (d) RIPGN w = 1. Colorbar range: 1 to 6 S.
Figure 12: Case 5. Photo of the sensing skin (crack highlighted) (a), RIPGN-based reconstructions with relaxation parameters w = 1/4 (b), w = 3/4 (c), and w = 1 (d).
Table 6: Case 5. The number of iterations required for convergence, value of the objective function at the last iterate, and computational time for the RIPGN.
Algorithm        Iterations (K)  J(σ)   Time (s)
RIPGN w = 1/4    44              37601  1256.8
RIPGN w = 1/2    26              37629  750.21
RIPGN w = 3/4    19              37801  540.46
RIPGN w = 9/10   19              37812  531.91
RIPGN w = 1      11              38015  299.03
and Newton-based reconstructions is nearly negligible. We note, again, that the choices of the optimization method and relaxation parameter do have an effect on the convergence and computation speed, as shown by Table 5.
In Case 5, the sensing skin dataset used in Case 4 is used to reconstruct a TV-regularized solution (Scheme 2) with RIPGN. The results are shown in Figures 11–12 and Table 6. Comparing these results with the results in Case 4 shows that the contrast between the crack and the background conductivities is higher when the non-smooth model is used (Scheme 2). Again, the computational times are shorter than in the smoothed case (see Section 5.2.3). Apart from these differences, the results are fairly similar to those with smoothed TV.
[Figure 13 plots: objective value J against iteration number k (left) and against computational time t in seconds (right), logarithmic axes; curves for RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1.]
Figure 13: Case 6. Value of the objective function J as function of iteration number k (left), and
computational time t (right) for the RIPGN.
5.4 numerical 3d eit study
In Case 6, we evaluate the feasibility of RIPGN for three-dimensional EIT. The geometry resembles a cylinder that has a radius of 14 cm and a height of 26 cm. Furthermore, the cylinder has four horizontal layers of electrodes on the surface. Each layer contains 10 evenly placed square electrodes with side length of 3 cm. The gap between each electrode layer is 4 cm. The cylinder contains a resistive inclusion with conductivity of $10^{-3}$ S/m on a background conductivity of 0.028 S/m.
In the data simulation we represent the electrical conductivity in a piecewise linear basis with 210860 nodes, and the electric potential in a second order polynomial basis with 1632276 nodes. Furthermore, the inversion mesh has 18835 nodes for the conductivity and 135504 nodes for the potential. The reconstructions are computed with Scheme 2 (Section 4.2.2).
Figure 14 shows that in Case 6 the relaxation parameter has negligible impact on the reconstruction quality and the reconstructions look very similar to the true conductivity distribution. Figure 13 and Table 7 show that, even in terms of the final value of the objective function, RIPGN converges similarly with every relaxation parameter. Clearly, in this case we get no benefit from lowering the relaxation parameter, as lowering it only increases the number of iterations required to satisfy the convergence criteria; with relaxation parameter w = 1/4 it takes 47 iterations, while with w = 1 it takes only 9. This is also reflected in the computational times. Furthermore, these computational times are considerably longer compared to the previous cases as the numbers of nodes, elements, and electrodes in the model are greater. As in the previous synthetic cases, the true conductivity is known and evaluating the objective function at $\sigma_{\mathrm{true}}$ yields $J(\sigma_{\mathrm{true}}) = 7.0220 \cdot 10^4$. Furthermore, the relative error is RE = 1.250%.
Panels: (a) True. (b) RIPGN w = 1/4. (c) RIPGN w = 3/4. (d) RIPGN w = 1. Colorbar range: 0.005 to 0.025 S/m.
Figure 14: Case 6. True conductivity (a), RIPGN-based reconstructions with relaxation parameters w = 1/4 (b), w = 3/4 (c), and w = 1 (d). A tomographic slice of the distribution along the plane $p(x) = -x_1 - x_2 + x_3 = 14$ is shown in the figures.
Table 7: Case 6. The number of iterations required for convergence, value of the objective function at the last iterate, computational time, and relative error of the estimate for the RIPGN.
Algorithm        Iterations (K)  J(σ)      Time (s)       RE (%)
RIPGN w = 1/4    47              300.5643  1.6681 · 10^4  2.8742
RIPGN w = 1/2    24              300.6393  8.4881 · 10^3  2.8737
RIPGN w = 3/4    13              300.8226  4.6491 · 10^3  2.8768
RIPGN w = 9/10   12              301.1563  4.3054 · 10^3  2.8752
RIPGN w = 1      9               300.6904  3.2748 · 10^3  2.8747
6 conclusions
We proposed a novel relaxed inexact proximal Gauss–Newton (RIPGN) method, and studied
it both theoretically and numerically. We applied the method to image reconstruction from electrical impedance tomography (EIT) measurements, which is a large-scale non-linear inverse problem governed by a PDE model.
We showed that the RIPGN converges to a connected set of Clarke critical points under conditions that hold for typical inverse problems. Furthermore, we presented a framework for the application of RIPGN to such problems. We confirmed the efficacy of the RIPGN on synthetic and experimental EIT data. These studies showed that by adjusting the relaxation parameter w, the iterates generated by the RIPGN converge to solutions that are meaningful for EIT applications. Furthermore, when w was appropriately selected, the RIPGN estimates were significantly faster
to compute than more conventional estimates produced by Newton’s method in the smooth
case and the NL-PDPS in the nonsmooth case.
Overall, RIPGN combined with (NL-)PDPS offers a flexible framework to solve various non-
convex and nonsmooth problems. In EIT, the greatest advantage of the method was achieved
with nonsmooth TV regularization. Following the implementation of this work, RIPGN method
can be straightforwardly adopted also to a variety of other optimization problems—those asso-
ciated with other non-smooth regularization schemes as well as other imaging/reconstruction
applications yielding non-convex optimization problems. In the future, this may enable handling
such large-scale problems without need for smoothing and/or reducing the model complexity,
which both can lead to loss of contrast and appearance of imaging artefacts. Moreover, the
RIPGN might even enable—via computational speed-up—realizations of high-contrast real-time
imaging in some applications.
acknowledgments
This project has received funding from the European Union’s Horizon 2020 research and
innovation programme under grant agreement No 764810. The research was also funded by the
Academy of Finland (Centre of Excellence of Inverse Modelling and Imaging, 2018-2025, project
303801).
T. Valkonen has been supported by Academy of Finland grants 314701 and 320022 as well as
Escuela Politécnica Nacional internal grant PIJ-18-03.
appendix a geometric justification for zero proximal parameter
We now improve Theorem 2.1 by showing that we can take the proximal parameter $\beta = 0$ provided $e^k$ is small enough and a critical point satisfies certain geometric conditions. We will then also obtain local convergence to this specific critical point. The rough plan of work is to show that (2.4) holds under these conditions for some $\beta > 0$ despite the algorithm employing $\beta = 0$. Throughout, we take $J$ as in (1.1) and for brevity write
\[ G(x) := \frac{1}{2}\|A(x)\|^2 \quad\text{and}\quad G_k(x) := \frac{1}{2}\|A_k(x)\|^2 = \frac{1}{2}\|A(z^k) + \nabla A(z^k)^*(x - z^k)\|^2. \]
On line 3 of Algorithm 2.1, for some $\rho > 0$, we will
(a.1) solve (2.1) for $x^k$ to such accuracy that $\|e^k\| \le \rho\|x^k - z^k\|$ for some $e^k \in \partial J_k(x^k)$.
Lemma a.1. Suppose Assumption 2.1 holds. In Algorithm 2.1 use (a.1). If $q^k := e^k - \nabla G_k(x^k) \in \partial F(x^k)$ satisfies
\[ F(z^k) - F(x^k) \ge \langle q^k, z^k - x^k \rangle + \frac{1}{2}\|z^k - x^k\|^2_{\Gamma_k} \]
for some operator $\Gamma_k$ such that $\nabla A(z^k)\nabla A(z^k)^* + \Gamma_k \ge (2\rho + \beta) I$ for some $\beta > 0$, then (2.4) holds. If (2.3) holds for this $\beta$, then the conclusions of Theorem 2.1 hold.
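Proof. We have $q^k = e^k - \nabla A_k(x^k)A_k(x^k) = e^k - \nabla A(z^k)[A(z^k) + \nabla A(z^k)^*(x^k - z^k)]$. Since we take $\beta = 0$ in the algorithm, $e^k \in \partial J_k(x^k)$. Therefore
\[ \begin{aligned}
J(z^k) - J_k(x^k) &= \frac{1}{2}\|A(z^k)\|^2 - \frac{1}{2}\|A_k(x^k)\|^2 + F(z^k) - F(x^k) \\
&\ge \frac{1}{2}\|A(z^k)\|^2 - \frac{1}{2}\|A_k(x^k)\|^2 + \langle q^k, z^k - x^k \rangle + \frac{1}{2}\|z^k - x^k\|^2_{\Gamma_k} \\
&= \frac{1}{2}\|A(z^k)\|^2 - \frac{1}{2}\|A_k(x^k)\|^2 - \langle A_k(x^k), \nabla A_k(x^k)^*(z^k - x^k) \rangle \\
&\quad + \langle e^k, z^k - x^k \rangle + \frac{1}{2}\|z^k - x^k\|^2_{\Gamma_k}.
\end{aligned} \]
We expand and simplify
\[ \begin{aligned}
&\frac{1}{2}\|A(z^k)\|^2 - \frac{1}{2}\|A_k(x^k)\|^2 - \langle A_k(x^k), \nabla A_k(x^k)^*(z^k - x^k) \rangle \\
&\quad = \frac{1}{2}\|A(z^k)\|^2 - \frac{1}{2}\|A(z^k) + \nabla A(z^k)^*(x^k - z^k)\|^2 \\
&\qquad - \langle A(z^k) + \nabla A(z^k)^*(x^k - z^k), \nabla A(z^k)^*(z^k - x^k) \rangle \\
&\quad = \frac{1}{2}\|\nabla A(z^k)^*(x^k - z^k)\|^2.
\end{aligned} \]
Using the assumption $\|e^k\| \le \rho\|x^k - z^k\|$, we thus obtain
\[ J(z^k) - J_k(x^k) \ge \frac{1}{2}\|z^k - x^k\|^2_{\nabla A(z^k)\nabla A(z^k)^* + \Gamma_k} - \rho\|z^k - x^k\|^2. \]
This and the assumption $\nabla A(z^k)\nabla A(z^k)^* + \Gamma_k \ge (2\rho + \beta)\,\mathrm{Id}$ prove (2.4).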
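The two estimates used in the last step of the proof can be written out as follows. This is a sketch in which $\|v\|_M^2 := \langle Mv, v \rangle$, and in which we assume, since (2.4) is not restated in this appendix, that the resulting bound is of the type that (2.4) requires.

```latex
\begin{align*}
\langle e^k, z^k - x^k \rangle
  &\ge -\|e^k\| \, \|z^k - x^k\|
   \ge -\rho \|z^k - x^k\|^2
  && \text{(Cauchy--Schwarz and (a.1))}, \\
\tfrac{1}{2}\|z^k - x^k\|^2_{\nabla A(z^k)\nabla A(z^k)^* + \Gamma_k} - \rho\|z^k - x^k\|^2
  &\ge \tfrac{1}{2}(2\rho + \beta)\|z^k - x^k\|^2 - \rho\|z^k - x^k\|^2
   = \tfrac{\beta}{2}\|z^k - x^k\|^2 .
\end{align*}
```

The margin $2\rho$ in the operator bound is thus exactly what absorbs the inexactness of the subproblem solution, leaving a quadratic term with the factor $\beta/2$ even though the algorithm itself runs with zero proximal parameter.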
We now merely assume the conditions of the lemma in the limit:
Theorem a.2. Suppose $q := -\nabla G(x) \in \partial F(x)$ satisfies $F(z) - F(x) \ge \langle q, z - x \rangle + \frac{1}{2}\|z - x\|^2_\Gamma$ for all $z$ and some operator $\Gamma$ such that $\nabla A(x)\nabla A(x)^* + \Gamma \ge (2\rho + \theta)\,\mathrm{Id}$ for some $\theta, \rho > 0$. Take any $\beta \in (0, \theta)$ satisfying (2.3) and initialize $z^0$ close enough to $x$. In Algorithm 2.1 use (a.1). Then the conclusions of Theorem 2.1 hold.
Proof. Let $q^k := e^k - \nabla G_k(x^k) \in \partial F(x^k)$. By the outer semicontinuity of the convex subdifferential ∂F [20] and the continuity of ∇A and A, for all $\epsilon > 0$ there exists r′ > 0 such that $\|x^k - x\|, \|z^k - x\| \le r'$ ensures $\|q^k - q\| \le \epsilon$, $\nabla A(z^k)\nabla A(z^k)^* + \Gamma \ge (2\rho + \beta)\,\mathrm{Id}$, and $F(z^k) - F(x^k) \ge \langle q^k, z^k - x^k\rangle + \frac{1}{2}\|z^k - x^k\|^2_{\Gamma}$. Therefore, if we can ensure that $\{z^k\}_{k \in \mathbb{N}}, \{x^k\}_{k \in \mathbb{N}} \subset B(x, r')$ for some small enough r′ > 0, the claim follows from Lemma a.1.
Since $x^k = w^{-1}(z^{k+1} - z^k) + z^k$, it suffices to show for some small r > 0, for all k ∈ N, that $z^k \in B(x, r)$ and that $\|z^k - z^{k-1}\| \le r$. We moreover claim that $J(z^k) \le J(x) + \delta r^2\varepsilon/(2w)$ for some δ ∈ (0, 1]. We prove all of this by induction. The induction basis follows from initializing $z^0 = z^{-1}$ close enough to x, that is, with r > 0 small enough. For the induction step, assume the claim holds for k. We will prove that it holds for k + 1. Indeed, by Lemma a.1, (2.4) holds for k. Thus, by the proof of Theorem 2.1, (2.10) holds for k: $J(z^k) - J(z^{k+1}) > \frac{w\varepsilon}{2}\|z^k - x^k\|^2$. By the inductive assumption and $J(z^{k+1}) \ge J(x)$, thus
$$\frac{\varepsilon}{2w}\|z^{k+1} - z^k\|^2 \le J(z^k) - J(z^{k+1}) \le J(z^k) - J(x) \le \frac{\delta r^2\varepsilon}{2w}.$$
This shows $\|z^{k+1} - z^k\| \le r$. Since $J(z^{k+1}) \le J(z^k)$, also $J(z^{k+1}) \le J(x) + \delta r^2\varepsilon/(2w)$.
It remains to prove $z^{k+1} \in B(x, r)$. We have $q = -\nabla A(x)A(x)$ and, for $z \in B(x, r'')$ with r″ small enough, $A(x) = A(z) + \nabla A(x)(x - z) + O(\|z - x\|^2)$. Therefore, arguing similarly to Lemma a.1,
$$J(z) - J(x) \ge \frac{1}{2}\|z - x\|^2_{\nabla A(x)\nabla A(x)^* + \Gamma} - O(\|z - x\|^2) \ge c\|z - x\|^2$$
for any 0 < c < θ + 2ρ and z ∈ B(x, r″). Since $z^k \in B(x, r)$ and, as we have shown, $\|z^{k+1} - z^k\| \le r$, we have $z^{k+1} \in B(x, 2r)$. Therefore, taking r < r″/2, we have $z^{k+1} \in B(x, r'')$. Taking $z = z^{k+1}$, it now follows that
$$\frac{\delta r^2\varepsilon}{2w} \ge J(z^k) - J(x) \ge J(z^{k+1}) - J(x) \ge c\|z^{k+1} - x\|^2.$$
Therefore, if δ > 0 is small enough, $z^{k+1} \in B(x, r)$. This finishes the induction and the proof.
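The proof repeatedly converts between the relaxed iterate and the inner solution through the relation $x^k = w^{-1}(z^{k+1} - z^k) + z^k$. The sketch below illustrates that conversion and the resulting rescaling of the descent estimate on random vectors; the helper `relax` and the numerical values of `w` and `eps` are illustrative assumptions only.

```python
import numpy as np

def relax(z_k, x_k, w):
    """Relaxed update z^{k+1} = z^k + w (x^k - z^k), equivalent to the relation in the proof."""
    return z_k + w * (x_k - z_k)

rng = np.random.default_rng(1)
z_k, x_k = rng.standard_normal(4), rng.standard_normal(4)
w, eps = 0.75, 0.2                      # illustrative values, not taken from the paper

z_next = relax(z_k, x_k, w)
x_rec = (z_next - z_k) / w + z_k        # recovers x^k from two consecutive relaxed iterates

# (w*eps/2) ||z^k - x^k||^2 equals (eps/(2w)) ||z^{k+1} - z^k||^2.
lhs = 0.5 * w * eps * np.sum((z_k - x_k) ** 2)
rhs = eps / (2.0 * w) * np.sum((z_next - z_k) ** 2)
```

This is why the bound from (2.10), stated in terms of $\|z^k - x^k\|$, turns into a bound with factor ε/(2w) on the step length $\|z^{k+1} - z^k\|$.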
We now need to obtain some local strong convexity of F . We concentrate on total variation;
in the EIT problems that we consider in Section 4, more local strong convexity could be obtained
from the box constraints. Related geometric approaches in [45, 25, 26, 28, 16] show the local
linear convergence of convex optimization methods, and even globally to submanifolds. The
next lemma establishes the fundamental idea of the approach. The condition in it has been
related to the strong (metric) subregularity of the subdifferentials ∂F [1].
Lemma a.3. Let $F : \mathbb{R}^n \to \mathbb{R}$ be convex and q ∈ int ∂F(x) for some $x \in \mathbb{R}^n$. Then for any γ > 0,
for some ρ > 0, $F(z) - F(x) \ge \langle q, z - x\rangle + \frac{\gamma}{2}\|z - x\|^2$ for all z ∈ B(x, ρ).
Proof. By the definition of the convex subdifferential,
$$F(z) - F(x) \ge \sup_{q' \in \partial F(x)} \langle q', z - x\rangle = \langle q, z - x\rangle + \sup_{q' \in \partial F(x)} \langle q' - q, z - x\rangle.$$
Because q ∈ int ∂F(x), there exists ϵ > 0 such that B(q, ϵ) ⊂ ∂F(x). We can therefore take
$q' = q + \frac{\gamma}{2}(z - x)$ provided $\frac{\gamma}{2}\|z - x\| \le \epsilon$, that is, if z ∈ B(x, ρ) for ρ = 2ϵ/γ. This immediately
yields the claim.
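As a one-dimensional illustration of Lemma a.3, take F = |·| at x = 0, where ∂F(0) = [−1, 1]. The sketch below picks an interior subgradient and checks the quadratic growth inequality on the ball of radius ρ = 2ϵ/γ from the proof; the specific values of `q` and `gamma` are arbitrary choices made for the example.

```python
import numpy as np

def growth_gap(z, q, gamma):
    """F(z) - F(0) - q*z - (gamma/2) z^2 for F = |.| on the real line."""
    return np.abs(z) - q * z - 0.5 * gamma * z ** 2

q, gamma = 0.3, 5.0
eps = 1.0 - abs(q)        # B(q, eps) is contained in the subdifferential [-1, 1] of |.| at 0
rho = 2.0 * eps / gamma   # radius given in the proof of Lemma a.3

z = np.linspace(-rho, rho, 1001)
# Small tolerance because the inequality holds with equality at z = rho.
holds_inside = bool(np.all(growth_gap(z, q, gamma) >= -1e-12))
# The inequality is only local: it fails at twice the radius.
holds_outside = bool(growth_gap(2.0 * rho, q, gamma) >= 0.0)
```

The example also shows why the growth is only local: γ can be arbitrarily large, but the admissible radius shrinks proportionally to 1/γ.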
For the next lemma, we recall that $\|g\|_{p,1} := \sum_{i=1}^{n} \|g_{i\cdot}\|_p$, where $g \in \mathbb{R}^{n \times m}$ and we write
$g_{i\cdot} = (g_{i1}, \ldots, g_{im})$.
Lemma a.4. Let $F(x) := \|Wx\|_{p,1}$ for some $W \in \mathbb{R}^{(n \times m) \times n}$. Assume for all i = 1, …, n the existence
of $k_i \in \{1, \ldots, n\}$ such that $[Wx]_{k_i\cdot} = 0$ and $W_{k_i\cdot,i} \ne 0$. Then $\operatorname{int} \partial F(x) \ne \emptyset$.
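Proof. We have $\partial F(x) = W^*\partial\|\cdot\|_{p,1}(Wx)$, where $\partial\|\cdot\|_{p,1}(g) = \prod_{i=1}^{n} \partial\|\cdot\|_p(g_{i\cdot})$. From
our assumptions, for all i = 1, …, n we have $\partial\|\cdot\|_p([Wx]_{k_i\cdot}) = B_{p^*}$ for the dual unit ball
$B_{p^*} := \{q \in \mathbb{R}^m \mid \|q\|_{p^*} \le 1\}$ with 1/p + 1/p* = 1. Hence, for all i = 1, …, n, the projection of
∂F(x) to the i-th coordinate is
$$\begin{aligned}
[\partial F(x)]_i = [W^*\partial\|\cdot\|_{p,1}(Wx)]_i &= \sum_{k=1}^{n} \langle W_{k\cdot,i}, [\partial\|\cdot\|_{p,1}(Wx)]_{k\cdot}\rangle \\
&= \sum_{k \ne k_i} \langle W_{k\cdot,i}, [\partial\|\cdot\|_{p,1}(Wx)]_{k\cdot}\rangle + \langle W_{k_i\cdot,i}, B_{p^*}\rangle.
\end{aligned}$$
The last term has non-empty interior. Hence $\operatorname{int}[\partial F(x)]_i \ne \emptyset$ for all i = 1, …, n. Since $\operatorname{int} \partial F(x) \supset \prod_{i=1}^{n} \operatorname{int}[\partial F(x)]_i$, the claim follows.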
The next theorem shows that forward-differences discretised total variation is locally strongly
convex around a “strictly piecewise constant” x .
Theorem a.5. Let $F(x) = \|\nabla_h x\|_{p,1}$ for $\nabla_h \in \mathbb{R}^{(n_1 \times n_2 \times 2) \times (n_1 \times n_2)}$ the forward differences operator with
(discrete) Neumann boundary conditions and cell width h > 0. Assume that $x \in \mathbb{R}^{n_1 \times n_2}$ is strictly
piecewise constant in the sense that for each pixel coordinate $(i, j) \in \{1, \ldots, n_1\} \times \{1, \ldots, n_2\}$
there exists a neighboring pixel coordinate
$$(k_{ij}, \ell_{ij}) \in N_{i,j} := (\{1, \ldots, n_1\} \times \{1, \ldots, n_2\}) \cap \{(i, j), (i + 1, j), (i, j + 1), (i - 1, j), (i, j - 1)\}$$
with $[\nabla_h x]_{k_{ij}\ell_{ij}\cdot} = 0$. Then $\operatorname{int} \partial F(x) \ne \emptyset$. In particular, for any γ > 0 and q ∈ int ∂F(x), there exists
ρ > 0 such that $F(z) - F(x) \ge \langle q, z - x\rangle + \frac{\gamma}{2}\|z - x\|^2$ for all z ∈ B(x, ρ).
Proof. The strict piecewise constancy assumption verifies with $W = \nabla_h$, for all i = 1, …, n_1 and
j = 1, …, n_2, the existence of $(k, \ell) = (k_{ij}, \ell_{ij}) \in \{1, \ldots, n_1\} \times \{1, \ldots, n_2\}$ such that $[Wx]_{k\ell\cdot} = 0$ and $W_{k\ell\cdot,ij} \ne 0$. The non-empty interior of the subdifferential is now a consequence of
Lemma a.4. The strong convexity at x then follows from Lemma a.3.
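The neighbour condition of Theorem a.5 is straightforward to test on a discrete image. The sketch below is one possible check, assuming the image is a 2D NumPy array and that the Neumann boundary condition is realised by setting differences across the last row and column to zero; the function names and the tolerance parameter are hypothetical. It only tests the zero-gradient condition on the neighbourhood $N_{i,j}$ as stated in the theorem, not the additional non-vanishing condition on the entries of W used in Lemma a.4.

```python
import numpy as np

def forward_differences(x, h=1.0):
    """Forward differences with discrete Neumann boundary conditions.

    Returns an array of shape (n1, n2, 2); differences across the last
    row and the last column are set to zero.
    """
    g = np.zeros(x.shape + (2,))
    g[:-1, :, 0] = (x[1:, :] - x[:-1, :]) / h
    g[:, :-1, 1] = (x[:, 1:] - x[:, :-1]) / h
    return g

def is_strictly_piecewise_constant(x, h=1.0, tol=0.0):
    """True if every pixel has itself or a 4-neighbour with zero discrete gradient."""
    flat = np.all(np.abs(forward_differences(x, h)) <= tol, axis=2)
    ok = flat.copy()
    ok[:-1, :] |= flat[1:, :]    # neighbour (i+1, j)
    ok[1:, :] |= flat[:-1, :]    # neighbour (i-1, j)
    ok[:, :-1] |= flat[:, 1:]    # neighbour (i, j+1)
    ok[:, 1:] |= flat[:, :-1]    # neighbour (i, j-1)
    return bool(ok.all())
```

For instance, an image consisting of two constant blocks passes the check, whereas a checkerboard pattern or a linear ramp does not.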
If the solution is not strictly piecewise constant at some pixel, then the fitting term G has to
provide the corresponding second-order growth. This is reasonable to expect, as total variation,
whenever allowed by the fitting term, would produce piecewise constant solutions.
Corollary a.6. Let $F(x) = \|\nabla_h x\|_{p,1}$ for $\nabla_h \in \mathbb{R}^{(n_1 \times n_2 \times 2) \times (n_1 \times n_2)}$ the forward differences operator
with (discrete) Neumann boundary conditions. Let $x \in [\partial_C J]^{-1}(0)$ be a Clarke-critical point of J.
For all pixels $(i, j) \in \{1, \ldots, n_1\} \times \{1, \ldots, n_2\}$ such that $-[\nabla G(x)]_{ij} \notin \operatorname{int}[\partial F(x)]_{ij}$ (in particular, if
(i, j) fails the strict piecewise constancy assumption of Theorem a.5 in the sense that there exists no
$(k_{ij}, \ell_{ij}) \in N_{i,j}$ with $[\nabla_h x]_{k_{ij}\ell_{ij}\cdot} = 0$), assume that $[\nabla A(x)\nabla A(x)^*]_{ij,ij} \ge 2\rho + \theta$ for some θ > 0.
Take any β ∈ (0, θ) satisfying (2.3) and initialize $z^0$ close enough to x. In Algorithm 2.1 use (a.1).
Then the conclusions of Theorem 2.1 hold.
Proof. With q := −∇G(x), let S be the set of pixel coordinates (i, j) satisfying $q_{ij} \in \operatorname{int}[\partial F(x)]_{ij}$.
Then, if (i, j) ∉ S, we have $-[\nabla G(x)]_{ij} \in \operatorname{bd}[\partial F(x)]_{ij}$. We take γ = 2ρ + θ and Γ such that
$[\Gamma]_{ij,ij} = \gamma$ for pixels (i, j) ∈ S and zero in all other entries. Then, proceeding as in Lemma a.3,
we deduce the existence of ρ > 0 such that
$$F(z) - F(x) \ge \langle q, z - x\rangle + \frac{1}{2}\|z - x\|^2_{\Gamma} \quad (z \in B(x, \rho)).$$
By our assumptions we also have [∇A(x)∇A(x)∗]i j,i j ≥ Γ = γ Id = (2ρ + θ ) Id. The rest follows
from Theorem a.2.
appendix b additional cases (7–12)
Case 7 is complementary to Case 2; it uses the same geometry and the same regularization scheme
(Scheme 3), but the true conductivity is different. In this case, the target contains a square-shaped
inclusion with conductivity of $10^{-3}$ S/m and a conductive circular inclusion with conductivity
0.28 S/m. The conductivity of the constant background is 0.028 S/m. The results of Case 7 are
shown in Figures 15–16 and Table 8.
Figure 15 shows that RIPGN with relaxation parameters w = 1 and w = 9/10 does not converge.
Furthermore, the relative error is considerably higher as the total variation regularization tends
to round the shape of the resistive inclusion [17]. In addition, the range of the conductivity is
flattened. It is also notable that the fit in this case is better in terms of the objective function than
in Case 2. Interpolating the true conductivity into the inversion mesh gives $J(\sigma_{\mathrm{true}}) = 1.0918 \cdot 10^5$
and RE = 3.8874 %.

[Figure 15 plots: objective value J at iterate k on logarithmic axes, against k (left) and t in seconds (right). Curves: RIPGN w = 1/4, 1/2, 3/4, 9/10, 1, and Newton.]
Figure 15: Case 7. Value of the objective function J as a function of iteration number k (left), and
computational time t (right) for the RIPGN and Newton’s method.

[Figure 16 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1; (g) Newton. Color scale: 0.01–0.07 S/m.]
Figure 16: Case 7. True conductivity (a), RIPGN-based reconstructions corresponding to five
relaxation parameters w (b)–(f), and the Newton-based reconstruction (g).
Similarly to Case 7, Case 8 is complementary to Case 3. In this case, the comparison to NL-PDPS
is omitted due to excessively long computational times of NL-PDPS. The results of Case 8
are shown in Figures 17–18 and Table 9. Again, the computational times and the relative errors
are improved when compared to the smoothed TV solutions in Case 3 (cf. Table 8), similarly to
what happened between Cases 2 and 4. Also, the differences in computational times and relative
errors between Cases 4 and 5 are analogous to the differences between Cases 2 and 3.
Table 8: Case 7. The number of iterations required for convergence, value of the objective
function at the last iterate, computational time, and relative error of the estimate for
the RIPGN and Newton’s method.

Algorithm         Iterations (K)   J(σ)            Time (s)   RE (%)
RIPGN w = 1/4     34               73.942          1145.2     10.012
RIPGN w = 1/2     16               74.238          526.82     10.035
RIPGN w = 3/4     11               74.02           342.07     10.069
RIPGN w = 9/10    8                4.5458 · 10^8   235.21     166.67
RIPGN w = 1       8                6.3257 · 10^8   243.83     295.45
Newton            12               80.658          2387.1     11.161
[Figure 17 plots: objective value J at iterate k on logarithmic axes, against k (left) and t in seconds (right). Curves: RIPGN w = 1/4, 1/2, 3/4, 9/10, 1.]
Figure 17: Case 8. Value of the objective function J as a function of iteration number k (left), and
computational time t (right) for the RIPGN.

[Figure 18 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1. Color scale: 0.01–0.07 S/m.]
Figure 18: Case 8. True conductivity (a), RIPGN-based reconstructions corresponding to five
relaxation parameters w (b)–(f).
Table 9: Case 8. The number of iterations required for convergence, value of the objective
function at the last iterate, computational time, and relative error of the estimate for
the RIPGN.

Algorithm         Iterations (K)   J(σ)            Time (s)   RE (%)
RIPGN w = 1/4     34               69.493          1025.7     8.6152
RIPGN w = 1/2     16               70.155          498.23     8.6516
RIPGN w = 3/4     12               69.57           367.28     8.6591
RIPGN w = 9/10    8                5.2339 · 10^8   234.4      157.91
RIPGN w = 1       8                6.6964 · 10^8   239.45     346.57
appendix b.0.1 cases 9 & 10: water tank experiments
In Cases 9–10, we evaluate RIPGN with experimental data, using a water tank, the geometry of
which corresponds to Cases 1–3 (and 7–8). The same objective function (Scheme 3; Section 4.2.3)
and parameters chosen in Cases 3 and 8 are used in these reconstructions. All reconstructions
are computed with relaxation parameter w = 3/4.
Reconstructions in Cases 9–10 are shown in Figure 19. In both cases, the plastic inclusions
appear as areas of low conductivity, and in Case 10, the metal inclusion causes an area of
increased conductivity. These areas are able to capture the locations of the inclusions well and
are easily distinguished from the background as the conductivities of the background and these
areas are flat and sharp-edged. The background conductivity in both cases is between 0.02 S/m
and 0.03 S/m, which is in the range of typical drinking water at room temperature, and as
expected, the conductivity near the plastic inclusion is very low compared to the background.
However, there is some contrast loss in the conductivity around the metal inclusion in Case 10;
the conductivity in this region is only about twice as much as the background (see Section 5.2.4).
Furthermore, in both cases, the shapes of the inclusions are slightly distorted. This kind of
distortion can be caused by a small discrepancy between the geometry of the mesh and the actual
measurement setup and other modeling errors. The roundness of the objects could be reinforced by,
for example, increasing the value of the regularization parameter α , but the parameter selection
for the regularization is beyond the scope of this paper.
The results of the water tank experiments (Cases 9-10) confirm that the RIPGN method
proposed in this paper is applicable to EIT imaging also with real measurement data.
[Figure 19 panels: (a) Measurement 1; (b) Reconstruction 1; (c) Measurement 2; (d) Reconstruction 2. Color scale: 0.01–0.06 S/m.]
Figure 19: Case 9 (top row) and Case 10 (bottom row). Photos of the measurement setup (left
column) and the TV-based RIPGN reconstructions with w = 3/4 (right column).
[Figure 20 plots: objective value J at iterate k on logarithmic axes, against k (left) and t in seconds (right). Curves: RIPGN w = 1/4, 1/2, 3/4, 9/10, 1, and Newton.]
Figure 20: Case 11. Value of the objective function J as a function of iteration number k (left), and
computational time t (right) for the RIPGN and Newton’s method.
appendix b.0.2 cases 11 & 12: sensing skin experiments
Case 11 is complementary to Case 4; the measurements are done using the same sensing skin
setup and computations use the same scheme (Scheme 3). An additional crack was made on the
sensing skin for this measurement. Figure 21 (top left) shows a photograph of the sensing skin in Case
11. The results from this dataset are shown in Figures 20–21 and in Table 10. In this case, RIPGN
with relaxation parameter w = 1/4 converges better than with the other relaxation parameters,
including w = 1/2. Although the convergence is better with w = 1/4, Figure 21 shows that the impact
of the relaxation parameter on the reconstruction quality is still fairly negligible. In contrast,
Figure 20 and Table 10 show that, again, the relaxation parameter heavily affects the computation
times.
Case 12 is complementary to Case 5; it uses Scheme 3 and the same measurements as in Case
11. Results in Case 12 are shown in Figures 22–23 and Table 11. Differences between Case 11
and Case 12 are fairly similar to differences between Cases 4 and 5.

[Figure 21 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1; (g) Newton. Color scale: 1–6 S.]
Figure 21: Case 11. Photo of the sensing skin (crack highlighted) (a), RIPGN-based reconstructions
corresponding to five relaxation parameters w (b)–(f), and the Newton-based
reconstruction (g).

Table 10: Case 11. The number of iterations required for convergence, value of the objective
function at the last iterate, and computational time for the RIPGN and Newton’s
method.

Algorithm         Iterations (K)   J(σ)    Time (s)
RIPGN w = 1/4     65               23011   2747.8
RIPGN w = 1/2     19               23262   775.53
RIPGN w = 3/4     11               23365   436.94
RIPGN w = 9/10    29               23533   1206.2
RIPGN w = 1       8                24544   311.8
Newton            69               23072   1188.2

[Figure 22 plots: objective value J at iterate k on logarithmic axes, against k (left) and t in seconds (right). Curves: RIPGN w = 1/4, 1/2, 3/4, 9/10, 1.]
Figure 22: Case 12. Value of the objective function J as a function of iteration number k (left), and
computational time t (right) for the RIPGN.

[Figure 23 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1. Color scale: 1–6 S.]
Figure 23: Case 12. Photo of the sensing skin (crack highlighted) (a), RIPGN-based reconstructions
corresponding to five relaxation parameters w (b)–(f).

Table 11: Case 12. The number of iterations required for convergence, value of the objective
function at the last iterate, and computational time for the RIPGN and Newton’s
method.

Algorithm         Iterations (K)   J(σ)    Time (s)
RIPGN w = 1/4     43               28761   1235.8
RIPGN w = 1/2     33               28984   950.35
RIPGN w = 3/4     16               29042   460.37
RIPGN w = 9/10    14               29280   412.17
RIPGN w = 1       14               30021   394.34
appendix c additional reconstructions in cases 1–6
Figures 24–29 show all reconstruction images computed in Cases 1–6, respectively.

[Figure 24 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1; (g) Newton. Color scale: 0.01–0.04 S/m.]
Figure 24: Case 1. True conductivity (a), RIPGN-based reconstructions corresponding to five
relaxation parameters w (b)–(f), and the Newton-based reconstruction (g).

[Figure 25 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1; (g) Newton. Color scale: 0.005–0.025 S/m.]
Figure 25: Case 2. True conductivity (a), RIPGN-based reconstructions corresponding to five
relaxation parameters w (b)–(f), and the Newton-based reconstruction (g).

[Figure 26 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1; (g) NL-PDPS. Color scale: 0.005–0.025 S/m.]
Figure 26: Case 3. True conductivity (a), RIPGN-based reconstructions corresponding to five
relaxation parameters w (b)–(f), and the NL-PDPS-based reconstruction (g).

[Figure 27 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1; (g) Newton. Color scale: 1–6 S.]
Figure 27: Case 4. Photo of the sensing skin (crack highlighted) (a), RIPGN-based reconstructions
corresponding to five relaxation parameters w (b)–(f), and the Newton-based
reconstruction (g).

[Figure 28 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1. Color scale: 1–6 S.]
Figure 28: Case 5. Photo of the sensing skin (crack highlighted) (a), RIPGN-based reconstructions
corresponding to five relaxation parameters w (b)–(f).

[Figure 29 panels: (a) True; (b)–(f) RIPGN with w = 1/4, 1/2, 3/4, 9/10, 1. Color scale: 0.005–0.025 S/m.]
Figure 29: Case 6. True conductivity (a), RIPGN-based reconstructions corresponding to five
relaxation parameters w (b)–(f). A tomographic slice of the distribution along plane
p(x) = −x1 − x2 + x3 = 14 is shown in the figures.
appendix d complementary proximal mappings
Table 12 collects the proximal mappings required in the algorithm implementations.
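Table 12: Proximal mappings of G utilized in the algorithm implementations. For the hypercube
V, $\operatorname{proj}_{[V_{\min}, V_{\max}]}(x_i) = \max(\min(x_i, V_{\max}), V_{\min})$.

G(x)                                                              i’th component of $\operatorname{prox}_{tG}(x)$
$0$                                                               $x_i$
$\delta_V(x)$                                                     $\operatorname{proj}_V(x_i)$
$\delta_V(x) + \frac{\beta}{2}\|x - z^k\|^2$                      $\operatorname{proj}_V\!\left(\frac{\frac{1}{t}x_i + \beta z^k_i}{\frac{1}{t} + \beta}\right)$
$\delta_V(x) + \frac{\beta}{2}\|x - z^k\|^2 + B_{\min}(x) + B_{\max}(x)$    $\begin{cases} \operatorname{proj}_V\!\left(\frac{l_{\min}^2 z_{\min} + \frac{1}{t}x_i + \beta z^k_i}{l_{\min}^2 + \frac{1}{t} + \beta}\right), & x_i < z_{\min}, \\ \operatorname{proj}_V\!\left(\frac{\frac{1}{t}x_i + \beta z^k_i}{\frac{1}{t} + \beta}\right), & z_{\min} \le x_i \le z_{\max}, \\ \operatorname{proj}_V\!\left(\frac{l_{\max}^2 z_{\max} + \frac{1}{t}x_i + \beta z^k_i}{l_{\max}^2 + \frac{1}{t} + \beta}\right), & x_i > z_{\max}. \end{cases}$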
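The second and third rows of Table 12 translate directly into a few lines of array code. The following is a minimal sketch, assuming NumPy arrays, a box V = [v_min, v_max]^n with scalar bounds, and the usual definition prox_{tG}(x) = argmin_y G(y) + ‖y − x‖²/(2t); the function names are hypothetical, and the barrier terms of the last row are not covered.

```python
import numpy as np

def proj_box(x, v_min, v_max):
    """Componentwise projection onto the box [v_min, v_max]."""
    return np.maximum(np.minimum(x, v_max), v_min)

def prox_box_quadratic(x, z_k, t, beta, v_min, v_max):
    """prox_{tG}(x) for G = delta_V + (beta/2) * ||. - z_k||^2.

    The unconstrained minimiser is the weighted average
    (x/t + beta*z_k) / (1/t + beta); since the objective is a separable
    convex quadratic, clipping that average to the box gives the
    constrained minimiser.
    """
    return proj_box((x / t + beta * z_k) / (1.0 / t + beta), v_min, v_max)
```

Setting `beta = 0` recovers the plain projection of the second row, which gives a quick consistency check between the two formulas.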
references
[1] F. J. Aragón Artacho and M.H. Geoffroy, Characterization of metric regularity of subdifferentials, Journal of Convex Analysis 15 (2008), 365–380.
[2] K. Astala and L. Päivärinta, Calderón’s inverse conductivity problem in the plane, Annals of Mathematics (2006), 265–299.
[3] H. Attouch, J. Bolte, and B. Svaiter, Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods, Mathematical Programming 137 (2013), 91–129, doi:10.1007/s10107-011-0484-9.
[4] J.M. Bardsley, A. Seppänen, A. Solonen, H. Haario, and J. Kaipio, Randomize-then-optimize for sampling and uncertainty quantification in electrical impedance tomography, SIAM/ASA Journal on Uncertainty Quantification 3 (2015), 1136–1158.
[5] A. Beck, First-Order Methods in Optimization, SIAM, Philadelphia, PA, 2017, doi:10.1137/1.9781611974997.
[6] J. Bolte, S. Sabach, and M. Teboulle, Nonconvex Lagrangian-based optimization: monitoring schemes and global convergence, Mathematics of Operations Research 43 (2018), 1051–1404, doi:10.1287/moor.2017.0900.
[7] J. V. Burke and M. C. Ferris, A Gauss–Newton method for convex composite optimization, Mathematical Programming 71 (1995), 179–194, doi:10.1007/bf01585997.
[8] A. P. Calderón, On an inverse boundary value problem, Computational & Applied Mathematics 25 (2006), 133–138.
[9] A. Chambolle and T. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, Journal of Mathematical Imaging and Vision 40 (2011), 120–145, doi:10.1007/s10851-010-0251-1.
[10] K. S. Cheng, D. Isaacson, J. Newell, and D. G. Gisser, Electrode models for electric current computed tomography, IEEE Transactions on Biomedical Engineering 36 (1989), 918–924.
[11] F. Clarke, Optimization and Nonsmooth Analysis, Society for Industrial and Applied Mathematics, 1990, doi:10.1137/1.9781611971309.
[12] C. Clason, S. Mazurenko, and T. Valkonen, Acceleration and global convergence of a first-order primal–dual method for nonconvex problems, SIAM Journal on Optimization 29 (2019), 933–963, doi:10.1137/18m1170194, arXiv:1802.03347.
[13] T. A. Davis, Algorithm 832: UMFPACK V4.3, an Unsymmetric-pattern Multifrontal Method, ACM Trans. Math. Softw. 30 (2004), 196–199, doi:10.1145/992200.992206, http://doi.acm.org/10.1145/992200.992206.
[14] M.H. DeGroot, Optimal statistical decisions, volume 82, John Wiley & Sons, 2005.
[15] O. Ferreira, M. Gonçalves, and P. Oliveira, Convergence of the Gauss–Newton Method for Convex Composite Optimization under a Majorant Condition, SIAM Journal on Optimization 23 (2013), 1757–1783, doi:10.1137/110841606.
[16] G. Garrigos, L. Rosasco, and S. Villa, Convergence of the forward-backward algorithm: Beyond the worst case with the help of geometry (2017), arXiv:1703.09477.
[17] G. González, V. Kolehmainen, and A. Seppänen, Isotropic and anisotropic total variation regularization in electrical impedance tomography, Computers & Mathematics with Applications 74 (2017), 564–576.
[18] M. Hallaji, A. Seppänen, and M. Pour-Ghaz, Electrical impedance tomography-based sensing skin for quantitative imaging of damage in concrete, Smart Materials and Structures 23 (2014), 085001, doi:10.1088/0964-1726/23/8/085001, https://doi.org/10.1088%2F0964-1726%2F23%2F8%2F085001.
[19] M. Hanke, A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems, Inverse Problems 13 (1997), 79, doi:10.1088/0266-5611/13/1/007.
[20] J. B. Hiriart-Urruty and C. Lemaréchal, Fundamentals of Convex Analysis, Springer, 2001, doi:10.1007/978-3-642-56468-0.
[21] J. Kaipio and E. Somersalo, Statistical and computational inverse problems, volume 160, Springer Science & Business Media, 2006.
[22] B. Kaltenbacher, A. Neubauer, and O. Scherzer, Iterative Regularization Methods for Nonlinear Ill-Posed Problems, number 6 in Radon Series on Computational and Applied Mathematics, De Gruyter, 2008.
[23] R. Kohn and M. Vogelius, Determining conductivity by boundary measurements, Communications on Pure and Applied Mathematics 37 (1984), 289–298.
[24] R. V. Kohn and M. Vogelius, Determining conductivity by boundary measurements II. Interior results, Communications on Pure and Applied Mathematics 38 (1985), 643–667.
[25] A. S. Lewis, Active Sets, Nonsmoothness, and Sensitivity, SIAM Journal on Optimization 13 (2002), 702–725, doi:10.1137/s1052623401387623.
[26] A. S. Lewis and S. Zhang, Partial Smoothness, Tilt Stability, and Generalized Hessians, SIAM Journal on Optimization 23 (2013), 74–94, doi:10.1137/110852103.
[27] C. Li and X. Wang, On convergence of the Gauss-Newton method for convex composite optimization, Mathematical Programming 91 (2002), 349–356, doi:10.1007/s101070100249.
[28] J. Liang, J. Fadili, and G. Peyré, Local Linear Convergence of Forward–Backward under Partial Smoothness, Advances in Neural Information Processing Systems 27 (2014), 1970–1978, http://papers.nips.cc/paper/5260-local-linear-convergence-of-forward-backward-under-partial-smoothness.pdf.
[29] A. Lipponen, A. Seppanen, and J. P. Kaipio, Electrical impedance tomography imaging with reduced-order model based on proper orthogonal decomposition, Journal of Electronic Imaging 22 (2013), 023008.
[30] S. Mazurenko, J. Jauhiainen, and T. Valkonen, Primal-dual block-proximal splitting for a class of non-convex problems, 2019, arXiv:1911.06284. Submitted.
[31] R. Mifflin, Semismooth and semiconvex functions in constrained optimization, SIAM Journal on Control and Optimization 15 (1977), 959–972, doi:10.1137/0315061.
[32] J. L. Mueller and S. Siltanen, Linear and Nonlinear Inverse Problems with Practical Applications, SIAM, 2012, doi:10.1137/1.9781611972344.
[33] J. Nocedal and S. Wright, Numerical Optimization, Springer Series in Operations Research and Financial Engineering, Springer New York, 2006, doi:10.1007/978-0-387-40065-5.
[34] J. Pang and L. Qi, Nonsmooth Equations: Motivation and Algorithms, SIAM Journal on Optimization 3 (1993), 443–465, doi:10.1137/0803021.
[35] T. Pock and A. Chambolle, Diagonal preconditioning for first order primal-dual algorithms in convex optimization, in Computer Vision (ICCV), 2011 IEEE International Conference on, 2011, 1762–1769, doi:10.1109/iccv.2011.6126441.
[36] L. Q. Qi, Convergence analysis of some algorithms for solving nonsmooth equations, Math. Oper. Res. 18 (1993), 227–244, doi:10.1287/moor.18.1.227.
[37] L. Q. Qi and J. Sun, A nonsmooth version of Newton’s method, Mathematical Programming 58 (1993), 353–367, doi:10.1007/bf01581275.
[38] L. I. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D: Nonlinear Phenomena 60 (1992), 259–268.
[39] M. Salo, Calderón problem, Lecture Notes (2008).
[40] S. Salzo and S. Villa, Convergence analysis of a proximal Gauss-Newton method, Computational Optimization and Applications 53 (2012), 557–589, doi:10.1007/s10589-012-9476-9.
[41] E. Somersalo, M. Cheney, and D. Isaacson, Existence and uniqueness for electrode models for electric current computed tomography, SIAM Journal on Applied Mathematics 52 (1992), 1023–1040.
[42] J. Sylvester and G. Uhlmann, A global uniqueness theorem for an inverse boundary value problem, Annals of Mathematics (1987), 153–169.
[43] G. Uhlmann, Electrical impedance tomography and Calderón’s problem, Inverse Problems 25 (2009), 123011, doi:10.1088/0266-5611/25/12/123011.
[44] T. Valkonen, A primal-dual hybrid gradient method for non-linear operators with applications to MRI, Inverse Problems 30 (2014), 055012, doi:10.1088/0266-5611/30/5/055012, arXiv:1309.5032.
[45] T. Valkonen, Preconditioned proximal point methods and notions of partial subregularity, 2017, arXiv:1711.05123. Submitted.
[46] T. Valkonen, Block-proximal methods with spatially adapted acceleration, Electronic Transactions on Numerical Analysis 51 (2019), 15–49, doi:10.1553/etna_vol51s15, arXiv:1609.07373.
[47] T. Valkonen, First-order primal-dual methods for nonsmooth nonconvex optimisation, 2019, arXiv:1910.00115. Submitted.
[48] P. J. Vauhkonen, Image reconstruction in three-dimensional electrical impedance tomography, Kuopion yliopisto, 2004.
[49] A. Voss, Imaging moisture flows in cement-based materials using electrical capacitance tomography, PhD thesis, University of Eastern Finland, 2020.