Sparse least squares problems with dense rows and preconditioned iterative methods

Jennifer Scott, University of Reading and STFC Rutherford Appleton Laboratory
Miroslav Tůma, Faculty of Mathematics and Physics, Charles University, Prague

Padua, February 2020



Outline

1 Introduction

2 Possible research directions

3 Some historical comments

4 Arbitrary sparse-dense (ASD) approach

5 Sparse stretching

6 Schur complement approach

7 Null sparse approach

8 Conclusions


Introduction

The Linear Least Squares Problem (LS):

min_{x ∈ ℝⁿ} ∥Ax − b∥₂,

where A ∈ ℝ^{m×n} with m ≥ n is large and sparse, and b ∈ ℝᵐ.


Interested in solving the problem by preconditioned iterations


Many obstacles on the way to constructing efficient and general preconditioners:

▸ Enormous variability of LS problems (and we would like to consider the algebraic problem)

▸ The sparsity structure of AᵀA is always behind the scenes in the Cholesky/QR approaches.


QR is always behind the scenes even when the normal equations are not formed.


Undergraduate stuff: A = QR, so

AᵀA = RᵀQᵀQR = RᵀR.


The fill in R is exactly as predicted by the Cholesky factorization of AᵀA if A has the strong Hall property.
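As a quick numerical sketch of the QR/normal-equations connection (a synthetic dense example, not from the talk):

```python
import numpy as np

# Hypothetical random A; any full-column-rank matrix works.
rng = np.random.default_rng(0)
A = rng.standard_normal((9, 4))

Q, R = np.linalg.qr(A)  # thin QR: Q has orthonormal columns

# Q^T Q = I, so A^T A = R^T Q^T Q R = R^T R:
# R is the Cholesky factor of A^T A up to row signs.
print(np.allclose(A.T @ A, R.T @ R))
```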


What if A contains a small but problematic part?


A trivial example of such a problematic part is a chunk of dense rows:

[Figure: sparsity patterns of A and AᵀA with a chunk of dense rows.]


Dense rows may represent a standard global coupling on top of local sparse relations, and we need to treat them in the same way as the other constraints.
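A tiny scipy sketch (synthetic matrix, illustrative only) of why a chunk of dense rows is problematic: a single dense row already fills AᵀA completely:

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n = 100

# Sparse block: a tridiagonal matrix (at most 3 nonzeros per row)
As = sp.diags([np.ones(n - 1), 2.0 * np.ones(n), np.ones(n - 1)],
              offsets=[-1, 0, 1], format="csr")
# One fully dense row appended at the bottom
Ad = sp.csr_matrix(rng.standard_normal((1, n)))
A = sp.vstack([As, Ad], format="csr")

C = (A.T @ A).tocsr()   # normal equations matrix
# Ad^T Ad is a rank-1 fully dense n-by-n block, so C is completely full:
print(C.nnz == n * n)
```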


Sometimes, some constraints have to be satisfied with a higher accuracy than the rest: the linearly constrained problem (LEP).

min_{x ∈ ℝⁿ} ∥Ax − b∥₂ subject to Ex = f

These constraint rows (Ex = f) are often less sparse than the other rows.


Dense rows may appear due to transformation/decomposition. For example, after projecting the problem onto a subspace of given constraints and trying to express the transformation explicitly:

Ex = f → find Z such that EZ = 0 → ZᵀAZ

In some cases, Z is known to contain dense rows (Arioli, Maryška, Rozložník, T., 2004).
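A small illustration (with a hypothetical E, not the matrix from the cited paper) that a null-space basis Z of a sparse constraint matrix is typically dense:

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
p, n = 5, 12

# Sparse constraint matrix E: only two nonzeros per row
E = np.zeros((p, n))
for i in range(p):
    cols = rng.choice(n, size=2, replace=False)
    E[i, cols] = rng.standard_normal(2)

Z = null_space(E)        # columns of Z span {x : Ex = 0}
print(np.allclose(E @ Z, 0, atol=1e-10))
# Although E is sparse, the SVD-based Z is generically dense:
print(np.mean(np.abs(Z) > 1e-12))
```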


Schur complement systems look like weighted normal equations . . .


But dense rows are not the only problematic part in incomplete factorizations: the problem is more general:

▸ Rank-1 (rank-k) modifications of (approximate) factorizations from some rows of AᵀA (Tismenetsky (1991); Kaporin (1998); Scott, T. (2014)) may generate dense contributions to the Schur complement.

[Figure: Schur complement sparsity before the update.]


[Figure: Schur complement sparsity after the update.]


How can we treat AᵀA to minimize this effect? Split it.


Research directions inside our framework (1)

This talk mentions the following approaches to solving the problem:


1 Combining the sparse and dense parts
▸ Arbitrary sparse-dense (ASD) approach (Scott, T., 2017)
▸ Iterative approach based on CG (CGLS1)
▸ If rank(A) > rank(sparse_part_of_A): specific modifications needed.
▸ We rely on an implicit combination of the sparse and dense parts of the approximate inverse, coupled together inside CG to get z = M⁻¹r.

2 Transforming dense to sparse at the expense of making the problem larger.
▸ Sparsifying the dense part by matrix stretching (Scott, T., 2019)
▸ Hoping to get overall “uniform problem sparsity”
▸ Traps on the way: size increase / ill-conditioning


Research directions inside our framework (2)

The approaches (continued)


3 Schur complement approach → saddle-point formulation (Scott, T., 2018)
▸ The approach enables the use of a direct factorization or an incomplete factorization
▸ Acceleration is also needed in the case of regularization when running a direct solver.
▸ Can use an indefinite or positive definite factorization depending on the problem.

4 Null-space approach → saddle-point formulation (Scott, T., in preparation)
▸ Transformation of the saddle-point formulation
▸ Computation of null-space bases of wide dense matrices and many other interesting problems.

All the approaches have specific strengths and weaknesses, and potential to be developed further.


History

Some historical comments (a few of many; we apologize that only a sketch is shown)


▸ Many authors have studied direct methods for this or related problems since the 1980s (many more interesting papers than can be mentioned): George, Heath (1980) (sparse Givens rotations); Björck, Duff (1980) (updates, modifications of the Peters-Wilkinson LU-based method, rank deficiency, weights); Heath (1982) (LS extensions including linear constraints); Björck (1984) (general updating scheme with both sparse and dense constraints); Ng (1992) (an interesting scheme to handle rank deficiency); Sun (1995, 1997) (dense rows in sequential or parallel computing environments), and many more.


▸ An authoritative summary of this research is given in the monograph Björck (1996); see also Björck (2015).


Some historical comments (continued)

▸ A lot of work on methods using randomization (Rokhlin, Tygert, 2008; Avron, Maymounkov, Toledo, 2010; Drineas, Mahoney, Muthukrishnan, Sarlós, 2011; Meng, Saunders, Mahoney, 2014)


▸ A lot of work in applications such as numerical optimization; see, e.g., Wright (1992); Lustig et al. (1992); Goldfarb, Shanno (2004); Oliveira, Sorensen (2005); Gondzio (1991); Andersen et al. (1996).


▸ Much less work using preconditioned iterative methods: preconditioners are often built on top of the pioneering work on LSQR (Paige, Saunders, 1982) or LSMR (Fong, Saunders, 2011); Li, Saad (2006) (MIQR, a multilevel IQR); Avron, Ng, Toledo (2009) (low-rank perturbations of A); interesting ideas within the randomized approach (Avron, 2010; Drineas et al., 2011, above).


▸ New research continues to use ideas from classical papers such as Peters, Wilkinson (1970), Sautter (1978), and Woodbury (1949, 1950).


Notation

Notation for the mixed sparse-dense problem: a sparse problem with a few dense rows.

A = (As; Ad) (As stacked on top of Ad),

C = AᵀA = AsᵀAs + AdᵀAd ≡ Cs + Cd

▸ As ∈ ℝ^{ms×n} is sparse, Ad ∈ ℝ^{md×n} is dense (ms ≫ md).
▸ A has full column rank (not necessarily As).
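A minimal sketch of this partitioning by row density (the 50% threshold and the helper name are illustrative choices, not the criterion from the talk):

```python
import numpy as np
import scipy.sparse as sp

def split_sparse_dense(A, density_threshold=0.5):
    """Partition the rows of A into a sparse block As and a dense block Ad.

    A row is classed as dense when it fills more than density_threshold
    of the n columns. Returns (As, Ad) with ms + md = m rows.
    """
    A = sp.csr_matrix(A)
    nnz_per_row = np.diff(A.indptr)          # nonzeros in each row
    is_dense = nnz_per_row > density_threshold * A.shape[1]
    return A[~is_dense], A[is_dense]

# Toy example: 4 nearly-empty rows plus one completely full row
A = sp.vstack([sp.eye(4, 6, format="csr"),
               sp.csr_matrix(np.ones((1, 6)))])
As, Ad = split_sparse_dense(A)
print(As.shape, Ad.shape)   # (4, 6) (1, 6)
```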


ASD: Arbitrary sparse-dense preconditioning

Iterative approach based on CG (CGLS1) (Scott, T., 2017): the approach uses approximations of both

Cs = AsᵀAs and Cd = AdᵀAd

in one preconditioner M.


The parts are merged together and applied to the residual as

z = M⁻¹r


Woodbury formulas (1949, 1950), with an update by the dense part and an update by the sparse part, are behind this merge.


Examples of (hidden) Woodbury-like formulas follow.


Theorem

If Cs = LsLsᵀ and ξ₁ minimizes ∥As Ls⁻ᵀ z − bs∥₂ exactly, the exact least squares solution of our problem can be written as x = Ls⁻ᵀ(ξ₁ + Γ₁), where ρd = bd − Ad Ls⁻ᵀ ξ₁ and

Γ₁ = Ls⁻¹ Adᵀ (I_md + Ad Ls⁻ᵀ Ls⁻¹ Adᵀ)⁻¹ ρd.
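A dense numpy sketch (random synthetic data; a real solver would keep As sparse and use an incomplete factor for Ls) checking the formula of the theorem against a direct least squares solve:

```python
import numpy as np

rng = np.random.default_rng(0)
ms, md, n = 200, 3, 30
As = rng.standard_normal((ms, n))   # stands in for the sparse block
Ad = rng.standard_normal((md, n))   # the few dense rows
bs = rng.standard_normal(ms)
bd = rng.standard_normal(md)

# Cs = As^T As = Ls Ls^T (Ls lower triangular)
Ls = np.linalg.cholesky(As.T @ As)

# xi1 minimizes ||As Ls^{-T} z - bs||_2 exactly; because
# Ls^{-1} (As^T As) Ls^{-T} = I, this reduces to xi1 = Ls^{-1} As^T bs.
xi1 = np.linalg.solve(Ls, As.T @ bs)

B = np.linalg.solve(Ls, Ad.T).T     # B = Ad Ls^{-T} (since B^T = Ls^{-1} Ad^T)
rho_d = bd - B @ xi1
Gamma1 = B.T @ np.linalg.solve(np.eye(md) + B @ B.T, rho_d)

x = np.linalg.solve(Ls.T, xi1 + Gamma1)   # x = Ls^{-T} (xi1 + Gamma1)

x_ref, *_ = np.linalg.lstsq(np.vstack([As, Ad]),
                            np.concatenate([bs, bd]), rcond=None)
print(np.allclose(x, x_ref))   # the Woodbury-style update reproduces x
```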


Another example of a Woodbury-like formula:


Theorem

If Cs = LsLsᵀ and ξ₁ is an approximate solution to the problem min_z ∥As Ls⁻ᵀ z − bs∥₂, the exact least squares solution of the equivalent problem above can be written as z = ξ₁ + Γ₁, where ρs = bs − As Ls⁻ᵀ ξ₁, ρd = bd − Ad Ls⁻ᵀ ξ₁ and

Γ₁ = Ls⁻¹ Asᵀ ρs + Ls⁻¹ Adᵀ (I_md + Ad Ls⁻ᵀ Ls⁻¹ Adᵀ)⁻¹ (ρd − Ad Ls⁻ᵀ Ls⁻¹ Asᵀ ρs).
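The second formula can be checked in the same way; here ξ₁ is deliberately an arbitrary guess, since this theorem only requires an approximate ξ₁ (again random synthetic data):

```python
import numpy as np

rng = np.random.default_rng(1)
ms, md, n = 150, 2, 20
As = rng.standard_normal((ms, n))
Ad = rng.standard_normal((md, n))
bs = rng.standard_normal(ms)
bd = rng.standard_normal(md)

Ls = np.linalg.cholesky(As.T @ As)
Bs = np.linalg.solve(Ls, As.T).T    # As Ls^{-T}
Bd = np.linalg.solve(Ls, Ad.T).T    # Ad Ls^{-T}

xi1 = rng.standard_normal(n)        # an arbitrary (inexact) xi1
rho_s = bs - Bs @ xi1
rho_d = bd - Bd @ xi1

t = Bs.T @ rho_s                    # Ls^{-1} As^T rho_s
Gamma1 = t + Bd.T @ np.linalg.solve(np.eye(md) + Bd @ Bd.T,
                                    rho_d - Bd @ t)
z = xi1 + Gamma1

# z must solve min_z || [As; Ad] Ls^{-T} z - [bs; bd] ||_2 exactly
z_ref, *_ = np.linalg.lstsq(np.vstack([Bs, Bd]),
                            np.concatenate([bs, bd]), rcond=None)
print(np.allclose(z, z_ref))
```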


The following examples use only the first formula, which was always reliable.


SCSD8-2r_a (m=60,550; n=8,650): size of Cs

[Figure: size of the matrix of normal equations vs. number of dense rows.]


This is how md is determined ↑.


ASD: Moving rows one by one from As to Ad

SCSD8-2r_a: iteration counts + preconditioner size / size(AᵀA)


Figure: Problem Meszaros/scsd8− 2r. Iteration counts (left), and ratio of thepreconditioner size to the size of AT A (right) as the number of dense rows thatare removed from A is increased.

17 / 64

ASD: Moving rows one by one from As to Ad

SCSD8-2r_a: timings

Figure: Problem Meszaros/scsd8-2r. Time to compute the preconditioner (left) and time for CGLS (right) as the number of dense rows that are removed from A is increased.

ASD: Moving rows one by one from As to Ad

stormg2_1000 (m=1,377,306; n=528,185): size of Cs

Figure: Problem Mittelmann/stormg2_1000. Size of A_s^T A_s as the number of dense rows is increased.

ASD: Moving rows one by one from As to Ad

stormg2_1000: large problem: iteration counts + size_p/size(A^T A)

Figure: Problem Mittelmann/stormg2_1000. Iteration counts (left) and ratio of the preconditioner size to the size of A^T A (right) as the number of dense rows that are removed from A is increased.

Moving rows one by one from As to Ad

stormg2_1000: timings

Figure: Problem Mittelmann/stormg2_1000. Time to compute the preconditioner (left), time for the preconditioned iterations (right).

Experimental evaluation of ASD

                 Dense rows not exploited          Dense rows exploited
Identifier      size_p     T_p   Its   T_i    md    size_ps     T_p   Its   T_i
lp_fit2p        17,985    0.26    ‡     ‡     25       4,940    0.09    1   0.01
scsd8-2r        51,885    0.25   90   0.11    50      51,855    0.05    7   0.02
scagr7-2r      197,067    3.34  244   0.53     7     152,977    0.06    1   0.01
scfxm1-2r      227,835    0.59  187   0.51    58     227,823    0.14   33   0.23
neos1          789,471      †     †     †     74     789,471    5.27  132   3.71
neos2             †         †     †     †     90     795,323    5.46  157   4.84
stormg2-125    395,595    0.27    ‡     ‡    121   7,978,135    0.22   16   0.29
PDE1              †         †     †     †      1   1,623,531    12.7  696   1.28
neos              †         †     †     †     20   2,874,699    4.93  232   15.0
stormg2_1000  3,157,095   19.1    ‡     ‡    121   3,125,987    19.1   18   2.92
cont1_l           †         †     †     †      1  11,510,370    4.82    1   0.33

ASD: conclusions

More possibilities to merge formulas.

It is not clear why some of the formulas work very well and some of them can be "destroyed" by slight perturbations of the factorizations - a careful analysis is needed.

In the experiments we intentionally treated the dense block as dense, without any enhancements, to see the qualitative behaviour.

What if As has null columns?

A possible way is to solve the problem with more right-hand sides:

$A = (A_1\;A_2) \equiv \begin{pmatrix} A_{s1} & 0 \\ A_{d1} & A_{d2} \end{pmatrix}$,   (1)

▸ The solution can be expressed as a combination of partial solutions of $\min_z \|A_1z - b\|_2$ and $\min_W \|A_1W - A_2\|_F$.

Matrix stretching for the LS problems

Stretching: a specific sparsification by splitting dense rows into sparse pieces.

The problem is then also augmented by columns.

[Figure: the matrix A before and after stretching.]

Such a strategy, called stretching, was discussed (among others) by Grcar (1990), Vanderbei (1991), Gondzio (1991), Alvarado (1997), Adler (2000), Adlers, Björck (2000), Duff, Scott (2005).

Up to now it has not been an approach of choice.

Matrix stretching for the LS problems

Derivation: special case of one dense row
▸ The dense row is partitioned as
  $A_d = (e\;\;f)$   (2)
▸ Add one variable $s$ and a modified dense row, obtaining a problem that is equivalent in the sense of yielding the same $x$ as the original problem:

$$\min_{(x^T\,s)^T}\left\|\begin{pmatrix} A_{se} & A_{sf} & 0\\ e & f & 0\\ e & -f & \sqrt{2}\end{pmatrix}\begin{pmatrix} x_e\\ x_f\\ s\end{pmatrix}-\begin{pmatrix} b_s\\ b_d\\ 0\end{pmatrix}\right\|_2 \qquad (3)$$

▸ Orthogonally transforming the problem by applying the Givens rotation $G$, we get

$$\min_z \|\bar{A}z - \bar{b}\|_2$$

with

$$G=\frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix},\quad \bar{A}=\begin{pmatrix} A_{se} & A_{sf} & 0\\ \sqrt{2}\,e & 0 & 1\\ 0 & \sqrt{2}\,f & -1\end{pmatrix},\quad z=\begin{pmatrix}x_e\\ x_f\\ s\end{pmatrix},\quad \bar{b}=\begin{pmatrix} b_s\\ b_d/\sqrt{2}\\ b_d/\sqrt{2}\end{pmatrix}.$$
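The equivalence above can be sketched numerically. A minimal NumPy illustration, under simple assumptions (random dense data, one dense row, an arbitrary split of the columns into the two groups e and f):

```python
import numpy as np

rng = np.random.default_rng(1)
ms, n = 12, 6
As = rng.standard_normal((ms, n))
d = rng.standard_normal(n)            # the single dense row, split as (e | f)
bs = rng.standard_normal(ms)
bd = rng.standard_normal()

e, f = d[:3], d[3:]                   # arbitrary split into two column groups
s2 = np.sqrt(2.0)

# stretched matrix: sparse rows get a zero column; the dense row becomes
# two sparse rows coupled through the extra variable s
row_e = np.concatenate([s2 * e, np.zeros(3), [1.0]])
row_f = np.concatenate([np.zeros(3), s2 * f, [-1.0]])
A_str = np.vstack([np.hstack([As, np.zeros((ms, 1))]), row_e, row_f])
b_str = np.concatenate([bs, [bd / s2, bd / s2]])

z = np.linalg.lstsq(A_str, b_str, rcond=None)[0]
x_stretched = z[:n]                   # drop the extra stretching variable s

x_ref = np.linalg.lstsq(np.vstack([As, d]),
                        np.concatenate([bs, [bd]]), rcond=None)[0]
print(np.allclose(x_stretched, x_ref))   # True
```

Minimizing over s makes the two split rows contribute exactly the original dense-row residual, which is why the x-part of the stretched solution matches the original one.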

Matrix stretching for the LS problems

And this is the sparsified (stretched) matrix:

$$\bar{A}=\begin{pmatrix} A_{se} & A_{sf} & 0\\ \sqrt{2}\,e & 0 & 1\\ 0 & \sqrt{2}\,f & -1\end{pmatrix}$$

The transformation can be used for more rows and more parts.

But there are problems with stretching. The first of them: how many parts? Grcar (1990): "the main challenge ... lies in determining the appropriate choice of the number of rows ... to split into ..."

We will illustrate this problem by examples. They show that sparsity can very soon decrease again.

Matrix stretching for the LS problems: very trivial model

Figure: Structure of A (left, nz = 142) and C (right, nz = 755) for the matrix A with a single dense row and As diagonal, with n = 64 and the dense row stretched into 8 parts.

This seems to be very tempting.

Matrix stretching for the LS problems: very simple model

Figure: Number of entries in A and C when A has a single dense row that is split into an increasing number of parts and As is diagonal (n = 64).

No fill-in in the Cholesky factorization.

Seems to be very promising.

Matrix stretching for the LS problems: less trivial model

Figure: Matrices based on the discrete 2D Laplacian: unstretched (left, nz = 352), stretched (right, nz = 382).

Seems to be promising again.

Matrix stretching for the LS problems: generic example

Figure: A^T A (left, nz = 991) and its Cholesky factor (right, nz = 3077).

Ooops, this is not sparse at all ...

Matrix stretching for the LS problems: fill-in

Figure: Fill-in in A^T A and its Cholesky factor (triangular parts) as the number of stretched parts increases.

Reorderings do not help; straightforward stretching is not possible!

More general cases: problems with the fill-in and also with ill-conditioning (despite the bounds by Adlers and Björck, 2000).

The problem with fill-in is primarily not the need for additional memory, but its contribution to the loss of information in preconditioners!

Matrix stretching for the LS problems: fill-in

The normal matrix:

$$A^TA=\sum_{k=1}^{m} u_k^Tu_k,\qquad A^T=(u_1^T,\dots,u_m^T).$$

That is, $u_k$, $k = 1,\dots,m$ are the rows of A.

What if the pattern of a row $u_j$ is contained in the pattern of a row $u_i$ (dominated by $u_i$)?

⇒ The pattern of $u_j$ is not needed to get the pattern of $A^TA$.

[Schematic: a matrix A with 6 columns in which row $u_j$ has nonzeros only in a subset of the columns where row $u_i$ has nonzeros.]
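Both observations can be checked in a few lines of NumPy (a toy illustration; the matrices are made up):

```python
import numpy as np

# A^T A as a sum of outer products of the rows u_k of A
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
C = sum(np.outer(u, u) for u in A)
print(np.allclose(C, A.T @ A))            # True

# A row whose pattern is contained in another row's pattern
# adds no new structural entries to A^T A.
As = np.array([[1., 2., 0., 0.],
               [0., 3., 4., 0.],
               [0., 0., 5., 6.]])
uj = np.array([0., 7., 8., 0.])           # pattern contained in row 2 of As
C0 = As.T @ As
C1 = np.vstack([As, uj]).T @ np.vstack([As, uj])
print(np.array_equal(C0 != 0, C1 != 0))   # True
```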

Matrix stretching for the LS problems: fill-in

Let us see this: consider

$$\begin{pmatrix} A_s \\ u_i \end{pmatrix}.$$

The normal matrix

$$\begin{pmatrix} A_s \\ u_i \end{pmatrix}^T \begin{pmatrix} A_s \\ u_i \end{pmatrix} = A_s^TA_s + u_i^Tu_i$$

has the same sparsity pattern as $A_s^TA_s$.

$$u_i \to F,\qquad A \to \begin{pmatrix} A_s \\ F \end{pmatrix}$$

The idea: stretch $F^T$ into blocks dominated by rows in $A_s$!

$$\bar{A}^T\bar{A}=\begin{pmatrix} A^T & F^T\\ 0 & S^T\end{pmatrix}\begin{pmatrix} A & 0\\ F & S\end{pmatrix}=\begin{pmatrix} A^TA+F^TF & F^TS\\ S^TF & S^TS\end{pmatrix}$$

$$\mathrm{Struct}(A^TA + F^TF) \subseteq \mathrm{Struct}(A^TA)$$

Matrix stretching for the LS problems: sparse stretching

Finding the stretched and dominated pieces

[Figure: a matrix of order 15 × 12 with one dense row.]

Matrix stretching for the LS problems: sparse stretching

[Figure: graph model in which the entries of the dense row are the nodes and the rows of A are the edges covering them.]

Minimum set cover problem + postprocessing:

First step: minimum set cover problem.

Second step: find disjoint segments.
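The first step is a minimum set cover: cover the column indices of the dense row by the column patterns of a few sparse rows. The standard greedy heuristic for set cover can be sketched as follows; this is a generic illustration under made-up data, not the authors' exact procedure.

```python
def greedy_set_cover(universe, subsets):
    """Classic greedy heuristic for minimum set cover: repeatedly pick
    the subset covering the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & set(s)))
        gain = uncovered & set(best)
        if not gain:
            raise ValueError("universe is not coverable by the given subsets")
        chosen.append(best)
        uncovered -= gain
    return chosen

# Toy instance: the dense row's column indices must be covered by
# the column patterns of a few sparse rows.
dense_cols = {1, 2, 3, 4, 5, 6, 7, 8, 10, 12}
row_patterns = [[1, 2], [2, 3, 4], [4, 5, 6], [6, 7, 8], [8, 10, 12], [3, 5, 7]]
cover = greedy_set_cover(dense_cols, row_patterns)
print(set().union(*(set(s) for s in cover)) >= dense_cols)   # True
```

The greedy choice gives the usual logarithmic approximation guarantee; the second step would then trim the chosen patterns into disjoint segments.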

Matrix stretching for the LS problems

The matrix is transformed by the stretching fully automatically and parameter-free, using
▸ a minimum set cover heuristic
▸ sparsification to minimize the total fill-in

Based purely on combinatorial techniques.

We call this technique sparse stretching.

Much better than the standard (old) stretching in
▸ sparsity of the normal equations
▸ Cholesky factor size
▸ conditioning of the normal equations
▸ black-box construction of the algebraic transformation
▸ iteration counts

Matrix stretching for the LS problems: old and sparse stretching

Number of entries: A^T A and Cholesky factor versus number of parts. The red dot marks the result for sparse stretching.

Figure: Comparison of the entries in the stretched normal matrix (left) and its Cholesky factor (right) for problem WM1 with one dense row appended.

Matrix stretching for the LS problems: old and sparse stretching

Number of entries: A^T A and Cholesky factor versus number of parts. The red dot marks the result for sparse stretching.

Figure: Comparison of the entries in the stretched normal matrix (left) and its Cholesky factor (right) for problem LP_AGG with one dense row appended.

Matrix stretching for the LS problems: old and sparse stretching

Figure: For problem LP_AGG with one dense row appended, the sparsity pattern of the stretched normal matrix for standard stretching (left, nz = 29268) and sparse stretching (right, nz = 24878).

Matrix stretching for the LS problems: old and sparse stretching

Figure: For problem LP_AGG with one dense row appended, the sparsity pattern of L + L^T of the Cholesky factor of the stretched normal matrix for standard stretching (left) and sparse stretching (right).

Matrix stretching for the LS problems: old and sparse stretching

The sizes really translate into the iteration counts.

Figure: Comparison of the iteration counts (left) and preconditioner size (right) for the matrix LP_AGG. The curve corresponds to the number of entries varying with the number of parts into which the dense row is stretched. CGLS preconditioned by HSL_MI35.
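CGLS itself, the iterative solver used in these experiments, is a short algorithm: CG applied implicitly to the normal equations, touching A only through products with A and A^T. A minimal unpreconditioned NumPy sketch on random test data (the HSL_MI35 incomplete-factorization preconditioning is not reproduced here):

```python
import numpy as np

def cgls(A, b, iters=100, tol=1e-10):
    """Minimal (unpreconditioned) CGLS for min ||Ax - b||_2.
    Mathematically equivalent to CG on A^T A x = A^T b, but applies
    A and A^T instead of forming the normal matrix."""
    x = np.zeros(A.shape[1])
    r = b - A @ x
    s = A.T @ r
    p = s.copy()
    gamma = s @ s
    for _ in range(iters):
        q = A @ p
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        if np.sqrt(gamma_new) < tol:
            break
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 10))
b = rng.standard_normal(40)
x = cgls(A, b, iters=200)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_ref, atol=1e-6))   # True
```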

Matrix stretching: old and sparse stretching

Identifier   strategy   mds   nnz(C)       nnz(L)       nflops
aircraft     none         0   1.421×10^6   7.048×10^6   1.764×10^10
             standard    17   5.474×10^4   4.911×10^5   1.759×10^7
             sparse      17   5.474×10^4   4.911×10^5   1.759×10^7
sc205-2r     none         0   6.510×10^6   8.002×10^7   1.995×10^11
             standard     8   1.408×10^5   1.841×10^6   8.043×10^7
             sparse       8   1.408×10^5   1.852×10^6   8.150×10^7
scagr7-2r    none         0   2.215×10^7   9.502×10^7   4.369×10^11
             standard     7   1.979×10^6   9.269×10^6   1.423×10^10
             sparse       7   1.841×10^5   1.564×10^6   6.538×10^7
scrs8-2r     none         0   6.217×10^6   1.718×10^7   3.215×10^10
             standard    22   8.013×10^5   2.675×10^6   1.244×10^9
             sparse      22   8.968×10^4   1.258×10^6   7.303×10^7
scsd8-2r     none         0   1.957×10^6   1.196×10^7   1.960×10^10
             standard    50   6.287×10^5   6.924×10^6   5.467×10^9
             sparse      50   1.794×10^5   4.357×10^6   2.414×10^9
south31      none         0   1.536×10^8   1.581×10^8   1.850×10^12
             standard   381   3.224×10^5   4.366×10^6   2.546×10^8
             sparse     381   3.224×10^5   4.260×10^6   2.319×10^8

HSL_MI35 (Scott, T., 2016)

Page 111: Sparse least squares problems with dense rows and ... · Outline 1 Introduction 2 Possible research directions 3 Some historical comments 4 Arbitrary sparse-dense (ASD) approach 5

Matrix stretching for the LS problems: old and sparse stretching

What about conditioning?

0 50 100 150 200 2500

2

4

6

8

10

12x 10

12

number of stretched parts

cond

ition

num

ber

(con

dest

)

0 50 100 150 200 250 300 350 400 450 500

number of stretched parts

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

cond

ition

num

ber

(con

dest

)

1013

Figure: The condition number of the stretched normal matrix for problems WM1

(left) and LP_AGG (right) with one dense row appended.

46 / 64

Page 112: Sparse least squares problems with dense rows and ... · Outline 1 Introduction 2 Possible research directions 3 Some historical comments 4 Arbitrary sparse-dense (ASD) approach 5

Matrix stretching for the LS problems: old and sparse stretching

What about conditioning?

[Two plots: condition number (condest) of the stretched normal matrix against the number of stretched parts.]

Figure: The condition number of the stretched normal matrix for problems WM1 (left) and LP_AGG (right) with one dense row appended.

The condition number goes steadily up in practice.

46 / 64

Page 114

Matrix stretching for the LS problems: old and sparse stretching

Ill-conditioning in practice is in agreement with theoretical bounds.

Adlers-Björck theory (see Adlers, Björck, 2000; Scott, T., 2019)

Theorem

Let Ã denote the matrix obtained from A by stretching p rows, each split into k parts, with γ = ∥Ad∥₂ / (2√(pk)). Then an upper bound for the condition number of Ã is

    κ₂(Ã) ≤ κ₂(A) k ( 1 + 2pk ∥Ad∥₂² / ∥A∥₂² ) ( k + 1 + σn(A)² / ∥Ad∥₂² ).

And this is really not optimistic ...

47 / 64

Page 116

Matrix stretching for the LS problems: old and sparse stretching

Condition number increase when stretching more rows

[Two plots against the number of stretched rows: condition number estimate on a 10^6 to 10^16 scale, and iteration count on a 10^0 to 10^3 scale.]

Figure: Condition number estimate (right) and iteration count (left) for problem ctap1-2b as the number of dense rows increases.

Do we really need to stretch everything?

48 / 64

Page 118

Matrix stretching for the LS problems: old and sparse stretching

Remedy: partial stretching
▸ Stretching only when needed, for example when we have null columns in As.
▸ The rest treated by the ASD approach.

Identifier   m       n       md    mds   nnz(C)       nnz(L)       nflops
aircraft      7517    3754    17    4    1.575×10^4   9.984×10^4   2.552×10^6
sc205-2r     62423   35213     8    1    9.602×10^4   9.084×10^5   2.875×10^7
scagr7-2r    46679   32847     7    1    1.443×10^5   8.531×10^5   2.572×10^7
scrs8-2r     27691   14364    22    7    6.698×10^4   6.716×10^5   2.797×10^7
scsd8-2r     60550    8650    50    5    5.895×10^4   †            †
south31      36321   18425   381    5    1.884×10^4   2.306×10^4   1.565×10^5

Much better in many cases, but not always.

49 / 64

Page 120

Matrix stretching for the LS problems: old and sparse stretching

What about a more sophisticated graph preprocessing: a weighted variant of the set cover?

A = ( As ; Ad ) =

    [ 1     1     1     1     0    ]
    [ 0     0     0     0     1    ]
    [ 1     1     0     0     0    ]
    [ 5     6     0     0     0    ]
    [ 0     0     7     8     9    ]
    [ 3/√2  3/√2  3/√2  3/√2  3/√2 ]    (4)

( As ; F^T ) =

    [ 1  1  1  1  0 ]         [ 1  1  1  1  0 ]
    [ 0  0  0  0  1 ]         [ 0  0  0  0  1 ]
    [ 1  1  0  0  0 ]         [ 1  1  0  0  0 ]
    [ 5  6  0  0  0 ]   or    [ 5  6  0  0  0 ]
    [ 0  0  7  8  9 ]         [ 0  0  7  8  9 ]
    [ 3  3  3  3  0 ]         [ 3  3  0  0  0 ]
    [ 0  0  0  0  3 ]         [ 0  0  3  3  3 ]    (5)

50 / 64

Page 122

Matrix stretching for the LS problems: old and sparse stretching

What about a more sophisticated graph preprocessing: weighted set cover

A = ( As ; Ad ) =

    [ 1     1     1     1     0    ]
    [ 0     0     0     0     1    ]
    [ 1     1     0     0     0    ]
    [ 5     6     0     0     0    ]
    [ 0     0     7     8     9    ]
    [ 3/√2  3/√2  3/√2  3/√2  3/√2 ]    (6)

As^T As + F F^T =

    [ 36  41  10  10   0 ]         [ 36  41   1   1   0 ]
    [ 41  47  10  10   0 ]         [ 41  47   1   1   0 ]
    [ 10  10  59  66  63 ]   or    [  1   1  59  66  72 ]
    [ 10  10  66  74  72 ]         [  1   1  66  74  81 ]
    [  0   0  63  72  91 ]         [  0   0  72  81  91 ]
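The effect of the two splittings on the stretched normal matrix can be checked numerically. A minimal numpy sketch, using the example matrices above (variable names are mine):

```python
import numpy as np

# Sparse part A_s of the example matrix; the dense row (3,3,3,3,3)/sqrt(2)
# is replaced by the stretched rows F below.
As = np.array([[1., 1., 1., 1., 0.],
               [0., 0., 0., 0., 1.],
               [1., 1., 0., 0., 0.],
               [5., 6., 0., 0., 0.],
               [0., 0., 7., 8., 9.]])

# Two ways of splitting the dense row into two stretched rows.
F1 = np.array([[3., 3., 3., 3., 0.],
               [0., 0., 0., 0., 3.]])   # non-weighted cover
F2 = np.array([[3., 3., 0., 0., 0.],
               [0., 0., 3., 3., 3.]])   # weighted cover

# Normal matrices of the two stretched sparse parts.
C1 = As.T @ As + F1.T @ F1
C2 = As.T @ As + F2.T @ F2
print(C1.astype(int))
print(C2.astype(int))
```

The coupling entries between columns {1, 2} and {3, 4} drop from 10 in the first choice to 1 in the second, which is the quantitative reason the weighted cover is preferred.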

Apparently, the second choice is intuitively better.

51 / 64

Page 123

Matrix stretching for the LS problems: old and sparse stretching

[Two plots: iteration count against the number of appended rows, comparing the non-weighted and weighted set cover.]

Figure: Iteration counts for problem lp_agg as the number of appended dense rows increases. Results are for sparse stretching using the weighted and non-weighted vertex set cover with lsize = 50 (left) and lsize = 60 (right).

52 / 64

Page 124

Matrix stretching for the LS problems: old and sparse stretching

[Two plots: iteration count against the parameter lsize, comparing the non-weighted and weighted set cover.]

Figure: Iteration counts for problem pltexpa with one appended dense row as the parameter lsize increases. Results are for sparse stretching using the weighted and non-weighted vertex set cover for density ρ = 0.1 (left) and ρ = 0.5 (right).

53 / 64

Page 125

Matrix stretching for the LS problems: old and sparse stretching

What if the ill-conditioning can be overcome by treating the problem as having the saddle-point structure?

    Ã^T Ã = [ A^T  F^T ; 0  S^T ] [ A  0 ; F  S ] = [ A^T A + F^T F   F^T S ; S^T F   S^T S ]

Note that (in the unscaled case):

    S^T S =

        [  2  −1              ]
        [ −1   2  −1          ]
        [     −1   2  −1      ]
        [        ⋱   ⋱   ⋱    ]
        [         −1   2  −1  ]
        [             −1   2  ]

54 / 64
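The slides do not spell S out, but one plausible concrete choice reproducing the tridiagonal S^T S above is the bidiagonal "chain" matrix linking consecutive parts of a stretched row. A small sketch (the function name and this specific S are my assumptions):

```python
import numpy as np

def chain_matrix(k):
    """One plausible unscaled stretching block S of size k x (k-1):
    column i links part i to part i+1 of a stretched row."""
    S = np.zeros((k, k - 1))
    for i in range(k - 1):
        S[i, i] = 1.0
        S[i + 1, i] = -1.0
    return S

S = chain_matrix(6)
StS = S.T @ S          # tridiagonal: 2 on the diagonal, -1 off it
print(StS.astype(int))
```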

Page 127

Matrix stretching for the LS problems: old and sparse stretching

Very preliminary: preconditioners based on

    P0 = [ Cs + F^T F   F^T S ; S^T F   S^T S ],    P1 = [ diag(Cs)   F^T S ; S^T F   S^T S ].

[Two plots against the number of appended rows: iteration count (left) and preconditioner size (right) for the P0 and P1 preconditioners.]

Figure: Iteration counts (left) and preconditioner size (right) for P0 and P1 applied to problem lp_agg as the number of appended dense rows increases. Here lsize = 5.

Many more possibilities to get modified constraint preconditioners.

55 / 64

Page 129

Schur complement approach

Fully embedded in the Schur complement approach, which combines a direct solver, modifications and regularization to get a preconditioner:

System matrix with varying α:

    K(α) = [ Cs(α)   Ad^T ; Ad   −I_{md} ].

Once the dense rows are clearly detected, the preconditioned iterative method can be extremely successful in solving some hard problems.
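As an illustration only, a toy numpy sketch of solving a system with K(α) through the Schur complement of the Cs(α) block; here Cs(α) = As^T As + αI is an assumed regularization, and all matrices are small and dense for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
ms, md, n, alpha = 20, 2, 6, 1e-2

As = rng.standard_normal((ms, n))     # sparse part (dense here for brevity)
Ad = rng.standard_normal((md, n))     # the few "dense" rows
c = rng.standard_normal(n)

Cs = As.T @ As + alpha * np.eye(n)    # Cs(alpha): assumed regularization

# K(alpha) = [[Cs, Ad^T], [Ad, -I]]; eliminating the (1,1) block gives
# the md x md Schur complement  S = -I - Ad Cs^{-1} Ad^T.
CsInvAdT = np.linalg.solve(Cs, Ad.T)
S = -np.eye(md) - Ad @ CsInvAdT

# Solve K [x; y] = [c; 0] via the Schur complement.
CsInvc = np.linalg.solve(Cs, c)
y = np.linalg.solve(S, -Ad @ CsInvc)
x = CsInvc - CsInvAdT @ y

# Compare with a direct solve of the full augmented system.
K = np.block([[Cs, Ad.T], [Ad, -np.eye(md)]])
xy = np.linalg.solve(K, np.concatenate([c, np.zeros(md)]))
print(np.allclose(np.concatenate([x, y]), xy))  # True
```

Because md is small, the Schur complement is a tiny dense matrix, which is what makes this attractive when only a few dense rows are present.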

56 / 64

Page 132

Standard null space approach

Optimization motivation of the null-space approach:

    minimize f(u)   subject to   B u = g,    (7)

The saddle-point problem for the direction vector u, where H is the local Hessian:

    [ H   B^T ; B   0 ] [ u ; v ] = [ f − Hu ; g ],    (8)

Algorithm

Dual variable method for solving the saddle-point problem

1. Find Z with columns forming a basis for N(B).
2. Find u such that Bu = g.
3. Solve Z^T H Z z = Z^T (f − Hu).
4. Set x = u + Zz.
5. Find y such that B^T y = f − Hu.
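The five steps can be transcribed directly with numpy/scipy for a small equality-constrained quadratic, minimizing ½uᵀHu − fᵀu subject to Bu = g (an assumed concrete instance of (7); in step 5 the residual is taken at the computed x, which makes that system consistent):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
n, k = 8, 3
B = rng.standard_normal((k, n))       # full row rank constraint matrix
g = rng.standard_normal(k)
M = rng.standard_normal((n, n))
H = M @ M.T + np.eye(n)               # SPD Hessian, hence PD on N(B)
f = rng.standard_normal(n)

# 1. Z: columns span the null space of B.
Z = null_space(B)
# 2. Any particular solution of B u = g.
u = np.linalg.lstsq(B, g, rcond=None)[0]
# 3. Reduced (projected) system of order n - k.
z = np.linalg.solve(Z.T @ H @ Z, Z.T @ (f - H @ u))
# 4. Primal solution.
x = u + Z @ z
# 5. Multipliers from B^T y = f - H x (consistent by construction).
y = np.linalg.lstsq(B.T, f - H @ x, rcond=None)[0]

print(np.allclose(B @ x, g))               # constraint satisfied
print(np.allclose(H @ x + B.T @ y, f))     # stationarity
```

Step 3 is the only solve of reduced size; for the LS saddle-point formulation, H plays the role of Cs and B the role of Ad.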

57 / 64

Page 135

Standard null space approach

Dual variable method to solve it heavily relies on the fact that the bottom right block of the saddle-point matrix is zero.

Saddle-point from the LS problem
The LS problem can be written as solving the following system

    [ Cs   Ad^T ; Ad   −I ] [ x ; Ad x ] = [ c ; 0 ],    (9)

and the problem of the nonzero bottom right block must be overcome.

Denote the problem, using more general notation, as

    A [ u ; v ] = [ H   B^T ; B   −C ] [ u ; v ] = [ f ; g ],    (10)

58 / 64

Page 136

The null space approach to solve sparse/dense LS problems

Theorem

Consider the saddle-point problem above with rank(B) = r ≤ k and C SPD. Assume that E = (Z  Y) ∈ R^{n×n} is nonsingular, where Z ∈ R^{n×(n−r)} is such that BE = (0_{k,n−r}   Br), Br ∈ R^{k×r}, Br ≠ 0. Set Ē to be

    Ē = [ E   0_{n,k} ; 0_{k,n}   I_k ] ∈ R^{(n+k)×(n+k)}.

If H is positive definite on the null space of B, then Ā = Ē^T A Ē is symmetric and invertible, with SPD leading principal submatrix Z^T H Z of order n − r. The transformed saddle-point system is

    Ā [ ū ; v̄ ] = Ē^T [ f ; g ],    [ u ; v ] = Ē [ ū ; v̄ ].

59 / 64

Page 140

The null space approach to solve sparse/dense LS problems

Recall the LS problem in the saddle-point formulation:

    [ Cs   Ad^T ; Ad   −I ] [ x ; Ad x ] = [ c ; 0 ].    (11)

Lemma

Consider A = ( As ; Ad ), As ∈ R^{ms×n}, Ad ∈ R^{md×n}. If A is of full rank, then Cs = As^T As is positive definite on the null space of Ad.

It does not matter whether As itself has full column rank (a bunch of specific treatments is needed for this in other solution approaches).

Using a sparse basis for the range space is enough to have a well-defined solver.
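The Lemma is easy to probe numerically even when As itself is column rank deficient; a toy sketch (the rank-deficient As is deliberately constructed, all data random):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(2)
ms, md, n = 10, 2, 6

As = rng.standard_normal((ms, n))
As[:, -1] = As[:, 0]                # make As column rank deficient on purpose
Ad = rng.standard_normal((md, n))

A = np.vstack([As, Ad])
assert np.linalg.matrix_rank(A) == n   # the stacked A still has full rank

Cs = As.T @ As
Z = null_space(Ad)                  # basis of N(Ad)
eigs = np.linalg.eigvalsh(Z.T @ Cs @ Z)
print(eigs.min() > 0)               # Cs is positive definite on N(Ad)
```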

60 / 64

Page 143

The null space approach to solve sparse/dense LS problems

The solution components of the null-space approach:

1 Null-space basis computation for B ≡ Ad: the sparse null-space basis of a wide dense matrix.

2 Preconditioner construction for the transformed system: the Hessian is transformed by a matrix E = (Z  Y) ∈ R^{n×n} having both a null-space basis and a range-space basis:

    [ Z^T H Z     Z^T H Y     0_{n−r,k} ]
    [ Y^T H Z     Y^T H Y     Br^T      ]
    [ 0_{k,n−r}   Br          −C        ]

61 / 64

Page 144

The null space approach to solve sparse/dense LS problems

Threshold thresh = 0.73

[Sparsity pattern plot, nz = 5258.]

LP_AGG (615 × 488, UFL Sparse Matrix Collection) + 10 dense rows.

62 / 64

Conclusions

We will comment mainly on the space for subsequent research

ASD: still a lot of theoretical challenges.

Stretching implies new concepts to find dense rows, based on the ways to split and embed their row structures.

Combination with the structure of the factorization?

Stretching implies a new approach to solve the problem of null columns of As: stretching makes a bridge between As and Ad and keeps As of full column rank.

Another stretching problem: the stretched rows may be processed differently from the rest (?)

Null-space approach: a viable strategy as well as a way to get over the singularity of As (Scott, T., 2019, in preparation).

Many more questions than expected at the beginning.

63 / 64

Page 153

Last but not least

Thank you for your attention!

64 / 64