Weighted estimates of Calderón-Zygmund
operators on vector-valued function spaces
by
Amalia Culiuc
B. A., Mount Holyoke College; South Hadley, MA 2011
Sc. M., Brown University; Providence, RI, 2013
A dissertation submitted in partial fulfillment of the
requirements for the degree of Doctor of Philosophy
in the Department of Mathematics at Brown University
Providence, Rhode Island
May 2016
© Copyright 2016 by Amalia Culiuc
This dissertation by Amalia Culiuc is accepted in its present form by
the Department of Mathematics as satisfying the dissertation requirement
for the degree of Doctor of Philosophy.
Date ________  Sergei Treil, Ph. D., Advisor
Recommended to the Graduate Council
Date ________  Jill Pipher, Ph. D., Reader
Date ________  Francesco Di Plinio, Ph. D., Reader
Approved by the Graduate Council
Date ________  Peter Weber
Dean of the Graduate School
Vita
Amalia Culiuc was born in Bucharest, Romania and attended Mount Holyoke College,
where she received a Bachelor’s degree in Mathematics and Economics, magna
cum laude, in 2011. She attended graduate school at Brown University, where she
received a Master of Science degree in mathematics in May 2013.
Acknowledgements
I thank my advisor, Sergei Treil, for all his advice, help, and support, as well as for
suggesting these problems and being there for me at every step along the way. Many
thanks also to my committee members, Jill Pipher and Francesco Di Plinio for all
their feedback and their support, which extends far beyond this thesis.
To my academic brother and collaborator, Brett Wick, for his invaluable help, to
Kelly Bickel, for her patience, encouragement, and an excellent collaboration, and to
Fedja Nazarov, for some of the ideas that went into this thesis.
To Michael Lacey for just about everything. I couldn’t ask for a better mentor
during the next stage of my career.
To my academic sister, Hyun Kwon, for a conversation that changed my academic
path.
To my harmonic analysis family at Brown: Yumeng Ou, Tess Anderson, Jingguo
Lai, Armen Vagharshakyan, and, of course, Francesco Di Plinio. Thank you for
being my friends, my role models, and sources of inspiration. In particular, thank
you, Francesco and Yumeng, for being my closest supporters during this past year.
I will miss you more than words can say.
To my friends in the math department at Brown: Jackie Anderson, Kenny Ascher,
Shamil Asgarli, Alex Barron, Dori Bejleri, Paul Carter, Matt Cole, Sam Connolly,
Elizabeth Crites, Diana Davis, Brian Friedin, Victoria Gras Andreu, Alicia Harper,
Vivian Healey, Wade Hindes, Younghun Hong, Tom Hulse, Peihong Jiang, Semin
Kim, Seoyoung Kim, Chan Kuan, Mehmet Kiral, Nhat Le, Jonah Leshin, Li-Mei Lim,
David Lowry-Duda, Numann Malik, Igor Minevich, Sam Molcho, Dinakar Muthiah,
Peter McGrath, Edward Newkirk, Isaac Solomon, Ian Sprung, Minh-Hoang Tran,
Martin Ulirsch, Alex Walker, Dale Winter, Laura Walton, Ashley Weber, Elliot
Wells, Miles Wheeler, Wei Pin Wong, Sunny Xiao, and Ren Yi. Extra thanks to my
office mates, Laura and Dori, for making the office feel like a second home to me.
To Audrey Aguiar, Lori Nascimento, Doreen Pappas, Carol Oliveira, and Larry
Larrivee, the best staff a department could have.
To my students, who, just like me, often doubt themselves: Eren Alkan, Yokabed
Ashenafi, Katie Barry, Nik Baya, Eli Berkowitz, Chantel Brown, Ryan Burke, Kiara
Butrosoglu, Emma Byrne, Sally Cai, Valentina Cano, Chien Teng Chia, Crystal
Chen, Shirin Chen, Juan Colin, Matt Cooper, Emma Currier, Neville Dadina, Brandon
Dale, Victor Dang, Joshua Daniel, Petros Dawit, Rachaell Diaz, Alex Djorno, Albert
Dong, Irene Du, Ercole Durini di Monza, Gloria Essien, Marimar Fletcher, Grant
Fong, Andrew Friedman, Meghan Friedmann, Johanna Garfinkel, Maddie Gaw, Leah
Goldman, Leonard Gleyzer, Aaron Gokaslan, Sam Greenberg, Evan Gross, Jack
Haworth, Phebe Hinman, Nicola Ho, Isiah Iniguez, Lucy Jia, Bailey Jones, Min
Jeong Kang, Emily Kasbohm, Nikki Kaufman, Anand Lalwani, Kaiwen Li, Rebecca
Li, Francesca Lim, Amy Lipman, Susan Liu, Jacinta Lomba, Marco Lorenzo Luy,
Ryan Ma, Molly Magid, Megs Malpani, Amy Miao, Mili Mitra, Jasper Miura, Ricardo
Mullings, Dan Murphy, Mia Murphy, Sakura Nakada, Kenta Nakagawa, Zach
Neronha, Valerie Nguon, Kemi Odusanya, Angel Ortiz, Clare Peabody, Shaughn
Pender, Brian Pfaff, Marina Renton, Zach Ricca, Sachin Sastri, Sam SaVaun, Isabel
Scherl, Jon Schlafer, Ned Schweikert, Penelope Shao, Drew Solomon, Zach Spector,
Ellen Sukharevsky, Yashil Sukurdeep, Hans Sun, Heather Sweeney, Brittani Taylor,
Valeria Tiourina, Charlotte Tisch, Brian Tung, Carolina Velasco, Fifi Walker,
Joanna Walsh, Ben Winston, Jordan White, Zach Woessner, Mingyi Wu, Jonathan
Yakubov, Amanda Yan, Yuval Yossefy, Mario Zaharioudakis, Wennie Zhang, Favi
Zuniga. You are all capable of so much more than you think and I am so fortunate
to have met all of you.
To my undergraduate institution, Mount Holyoke College, particularly to my
undergraduate advisor, Jessica Sidman, and my role models, Margaret Robinson,
Giuliana Davidoff, and Harriet Pollatsek.
To the person who inspired my first love for math: my high school teacher,
Iolanda Podeanu.
To my parents, who always had more faith in me than I did.
Contents
1 Introduction
2 Preliminaries
2.1 Atomic filtered spaces
2.2 Calderón-Zygmund operators
2.3 Scalar weights
2.4 Matrix weights
2.5 The matrix $A_2$ class
3 Scalar weighted estimates
3.1 Two weight bounds for $M_a^q$
3.2 Observations
3.3 Proof of the two weight estimate
4 The Carleson Embedding Theorem
4.1 The Carleson embedding theorem
4.2 Trivial reductions
4.3 Invertibility of W
4.4 The Bellman functions
4.5 From Bellman functions to the estimate
4.6 Final step: estimating $\sum_Q R_Q(0)$
4.7 Verifying the properties of $B_s$
5 Matrix weighted two weight estimates for well-localized operators
5.1 Expectations and martingale differences
5.2 Generalized band operators
5.3 Examples
5.4 Weighted martingale differences
5.5 Density of simple functions
5.6 Well-Localized operators
5.7 From band operators to well-localized operators
5.8 Estimates of well-localized operators
5.9 Applications to the estimates of Haar shifts
5.10 The $A_2$ theorem and linear dependence on complexity
5.11 Weighted paraproducts
5.12 Estimates of the paraproducts
5.13 Estimates of well-localized operators
5.14 Estimate of the main part
5.15 Estimates of parts involving constant functions
5.16 Estimates of the Haar shifts
5.17 Comparison of different truncations
5.18 Proof of Lemma 5.9.1
Bibliography
CHAPTER 1
Introduction
The theory of weighted estimates has been an area of active research in harmonic
analysis since the 1960s, when the seminal work of Helson and Szegő [HS60] provided
necessary and sufficient conditions for the boundedness of the Hilbert transform
in weighted Lebesgue spaces. In the 1970s, Hunt, Muckenhoupt, and Wheeden
[HMW73] formulated new conditions, giving a completely different description of
the weights for which the Hilbert transform is bounded in the weighted Lp space.
Presently known as the Muckenhoupt Ap conditions, these statements have held, and
continue to hold, a prominent place in the weighted theory literature.
In 1974, Coifman and Fefferman [CF74] showed that the necessity and sufficiency
of the Ap conditions extends to the boundedness of a wider family of singular integral
operators, which includes, but is not limited to, the Hilbert transform. Namely, it
was shown that a Calderón-Zygmund operator is bounded on the weighted $L^p(w)$
space if and only if the weight $w$ belongs to the so-called $A_p$ class, or, equivalently,
if a quantity known as the $A_p$ characteristic of the weight (denoted $[w]_{A_p}$) is finite.
A natural question that arises from these estimates concerns a more precise characterization
of $L^p$ boundedness. If $w$ is an $A_p$ weight, one may ask how the
norm of a Calderón-Zygmund operator on $L^p(w)$ depends on the $A_p$ characteristic of
$w$. Such norm estimates have applications in the study of elliptic partial differential
equations and quasiconformal mappings (see, for instance, [FKP91], [FG13]).
Of particular importance is the case when the weight $w$ is in $A_2$. Rubio de Francia’s
groundbreaking work from 1984 [RdF84] introduced the extrapolation theorem,
which essentially reduces problems in $L^p$ to the weighted $L^2$ case. As summarized
informally by Rubio de Francia’s colleague Antonio Córdoba, “there are no Banach
function spaces, just weighted $L^2$”. As a result, any estimate of the dependence of
the $L^2(w)$ norm of an operator on $[w]_{A_2}$ can be translated into a statement about $L^p$
norms.
In the context of extrapolation, a significant amount of research during the 1990s
and 2000s went into the so-called $A_2$ conjecture, which claims that not only is any
Calderón-Zygmund operator $T$ bounded on the weighted space $L^2(w)$ if and only if
$w$ is an $A_2$ weight, but the sharp norm dependence is in fact linear, i.e.
\[ \|T\|_{L^2(w) \to L^2(w)} \lesssim [w]_{A_2}. \]
A series of important contributions was made during this period,
starting with S. Buckley’s work from 1993 [Buc93], which proved that the weighted
norm of the Hardy-Littlewood maximal operator depends linearly on the $A_2$ characteristic
and that the estimate is sharp. Over the following two decades,
sharp linear dependence was established for various other operators. In 2000, J.
Wittwer proved the statement for Haar multipliers [Wit00], and then, in 2002, S.
Petermichl and A. Volberg proved it for the Ahlfors-Beurling transform [PV02]. In
2007 [Pet07] and 2008 [Pet08], respectively, Petermichl showed that the linear bound
also holds for the Hilbert and Riesz transforms. Further advances were made by O.
Beznosova ([Bez08] for the dyadic paraproduct), M. Lacey, S. Petermichl, and M.
Reguera ([LPR10] for the Haar shifts), D. Cruz-Uribe, J. Martell, and C. Pérez (a
simplified proof for the Haar shifts [CUMP10], [CUMP12]), and, finally, C. Pérez, S.
Treil, and A. Volberg (a proof for general Calderón-Zygmund operators, but with
the bound $[w]_{A_2} \log(1 + [w]_{A_2})$ instead of the optimal $[w]_{A_2}$ [PTV10]).
The conjecture was finally proved in full generality in 2010 by Hytönen [Hyt12b]
(see also [Hyt12a]) through an argument that reduces the study of Calderón-Zygmund
operators to the uniform boundedness of simpler objects called dyadic shifts, which
exhibit the desired linear dependence. Various different and substantially simpler approaches
have been provided since. In 2012, A. Lerner [Ler13] showed that Calderón-Zygmund
norms could be estimated by a special class of operators called sparse operators,
and, as such, the $A_2$ Theorem is a simple consequence of the “local mean
oscillation decomposition” introduced in [Ler12]. Yet another approach was given
by M. Lacey in [Lac15], who gave a straightforward way of establishing pointwise
control of a Calderón–Zygmund operator by a sparse operator, further simplifying
Lerner’s argument. Developments in the area of pointwise estimates for multilinear
singular integrals by sparse form are also presented in [CDPO16].
While the A2 conjecture has been settled in the scalar case, the question remains
open in the setting of vector-valued function spaces with matrix weights. Other than
its intrinsic interest, such a setting is important for its applications to geometric
function theory, Toeplitz operators, multivariate prediction theory, and even the
study of finitely generated shift invariant subspaces of unweighted $L^p(\mathbb{R}^d)$ spaces (see
[IM01], [NT96], [Nie10], [Vol97]). In the matrix weighted setting, the $A_2$ conjecture
claims that if $W$ is a weight in the appropriately defined matrix $A_2$ class and $T$ is a
Calderón-Zygmund operator, then $T$ is bounded on the weighted space $L^2(W)$, and
the constant appearing in the $L^2(W)$ norm estimate for $T$ depends linearly on the
$A_2$ characteristic.
In spite of a great amount of recent work, progress in the area of matrix weighted
inequalities has been slow. The setup imposes a variety of difficulties, from issues
of commutativity and preserving homogeneity, to challenges regarding the very definitions
of objects that appear in the scalar setting. While the boundedness of $T$ on
$L^2(W)$ was settled almost two decades ago through the work of Nazarov and Treil, and
of Volberg ([NT96] and [Vol97], respectively), the explicit dependence on the $A_2$ characteristic
is far from being fully understood. The problem of proving linear bounds in
$[W]_{A_2}$ for singular integral operators is currently the object of ongoing research.
Some recent estimates have been obtained for specific operators by adapting
arguments from the scalar setting. For the Hilbert transform $H$, it was shown by
Bickel, Petermichl, and Wick [BPW14] that
\[ \|H\|_{L^2(W) \to L^2(W)} \lesssim [W]_{A_2}^{3/2} \log [W]_{A_2}. \]
For the Riesz transform $R$ and the Ahlfors-Beurling transform $B$, the author,
together with B. Wick [CW15], proved similar bounds:
\[ \|R\|_{L^2(W) \to L^2(W)} \lesssim [W]_{A_2}^{3/2} \log [W]_{A_2}, \qquad \|B\|_{L^2(W) \to L^2(W)} \lesssim [W]_{A_2}^{3/2} \log [W]_{A_2}. \]
Clearly, these estimates are suboptimal and, in fact, they can be further improved,
as we will show in the final chapter of this thesis. Their proof relies on the fact that
the operators above can all be represented as averages of dyadic shifts. For the Hilbert
transform, this was shown by S. Petermichl in [Pet00]. For the Riesz transform, it
is a result of S. Petermichl, S. Treil, and A. Volberg [PTV02], and for the Ahlfors-
Beurling transform it is a result of O. Dragičević and A. Volberg [DV03]. Providing
uniform bounds for the dyadic shifts, which in this situation are significantly simpler
objects, yields the results for H, R, and B. However, linear dependence does not
follow as it would in the scalar case.
Another family of operators which have been studied in the matrix setting are
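the so-called sparse operators. An operator $S : L^2(\mathbb{R}, \mathbb{C}^d) \to L^2(\mathbb{R}, \mathbb{C}^d)$ is said to be
sparse if
\[ Sf(x) = \sum_{Q \in \mathcal{S}} \langle f \rangle_Q 1_Q(x), \]
where $\mathcal{S}$ is a collection of dyadic cubes satisfying
\[ \sum_{R \in \mathrm{Ch}_{\mathcal{S}}(Q)} |R| \le \frac{1}{2} |Q| \]
for all $Q \in \mathcal{S}$, and $\mathrm{Ch}_{\mathcal{S}}(Q)$ stands for the children of $Q$ in $\mathcal{S}$, i.e. the maximal
dyadic subcubes of $Q$ in $\mathcal{S}$. In [BW14], Bickel and Wick showed that if $S$ is a sparse
operator, then
\[ \|S\|_{L^2(W) \to L^2(W)} \lesssim [W]_{A_2}^{3/2}. \]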
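The sparseness condition on the collection can be checked mechanically for a finite family of intervals. The following is a minimal Python sketch, assuming dimension one, Lebesgue measure, and half-open intervals $(a, b]$ stored as pairs of fractions; the names `s_children` and `is_sparse` are hypothetical and are not taken from [BW14].

```python
from fractions import Fraction as F

def s_children(q, coll):
    """Maximal elements of coll strictly contained in q (pairs (a, b) stand for (a, b])."""
    inside = [r for r in coll if r != q and q[0] <= r[0] and r[1] <= q[1]]
    return [r for r in inside
            if not any(s != r and s[0] <= r[0] and r[1] <= s[1] for s in inside)]

def is_sparse(coll):
    """Check that the children of every Q in coll cover at most half of Q."""
    return all(sum(r[1] - r[0] for r in s_children(q, coll)) <= (q[1] - q[0]) / 2
               for q in coll)

# A nested chain (0,1] > (0,1/2] > (0,1/4] is sparse;
# the family {(0,1], (0,1/2], (1/2,1]} is not, since the children cover all of (0,1].
chain = [(F(0), F(1)), (F(0), F(1, 2)), (F(0), F(1, 4))]
full_split = [(F(0), F(1)), (F(0), F(1, 2)), (F(1, 2), F(1))]
print(is_sparse(chain), is_sparse(full_split))
```

The quadratic scan over the collection is chosen for readability only; it is not meant as an efficient way to test large families.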
The importance of estimating sparse operators is clear by analogy with the scalar
setup. Recent proofs of the scalar A2 conjecture (such as those provided by Lerner
and Lacey) have taken the approach of controlling Calderón-Zygmund operators by
sparse operators. Therefore, if one can prove linear bounds for sparse operators,
one will potentially have all the required ingredients for a proof of the matrix A2
conjecture.
In the final chapter of this thesis, which overlaps with the text of [BCTW16], we
will prove the bound $[W]_{A_2}^{3/2}$ for another important class: the well-localized operators.
A discussion of the difficulties that prevent us from obtaining linear bounds, and thus
proving the matrix $A_2$ conjecture, will also be included. We will, however, extend the
setup to include not only dyadic lattices, but also general atomic filtrations. This
chapter generalizes the result presented in [NTV08] to the context of matrix weighted
spaces.
The rest of the thesis is organized as follows. In Chapter 2 we introduce the no-
tion of atomic filtered spaces and give examples of atomic filtrations, including, but
not limited to, the dyadic case. We also define the objects to be studied throughout:
scalar A2 weights, matrix weighted spaces, and the corresponding matrix A2 class.
The following three chapters constitute two distinct parts. The first, consisting of
Chapter 3, whose content overlaps with [Cul15], discusses a scalar problem: the two
weight boundedness of a class of operators given by $\ell^q$ norms in the space of sequences
indexed by atoms. The second part, consisting of the final two chapters,
provides new results and various improvements to the known arguments in the matrix
weighted literature. We begin by introducing an essential tool for matrix weighted
estimates, which was up until recently an open problem: the matrix Carleson embedding
theorem. This theorem was obtained in [CT15] by the author and S. Treil,
following a suggestion of F. Nazarov. Unlike previous embedding results, the one
presented in this thesis (and in [CT15]) gives a complete analogue of the scalar case,
with no additional assumptions on the weight or the structure of the space. We
employ this tool in the last chapter to study two weight bounds for well-localized operators on
vector-valued function spaces. Our results parallel those in [NTV08], but provide an
improvement of the arguments in [NTV08] even when restricted to the scalar setting.
CHAPTER 2
Preliminaries
In this chapter we introduce some of the definitions and notation to be used throughout
the thesis. Further definitions, as well as any potential changes in notation, will
be provided as they arise, usually at the beginning of each chapter. However, the
general atomic filtration setting is to be assumed in all that follows, whether or not
it is explicitly mentioned.
2.1 Atomic filtered spaces
Let $(\mathcal{X}, \mathcal{F}, \sigma)$ be a sigma-finite measure space with a filtration $\{\mathcal{F}_n\}$, that is, a
sequence of increasing sigma-algebras $\mathcal{F}_n \subset \mathcal{F}$. Here $\mathcal{F}$ is taken to be the smallest
sigma-algebra containing $\bigcup_n \mathcal{F}_n$. We make the assumption that $\mathcal{F}_n$ is atomic, meaning
that there exists a countable collection $\mathcal{D}_n$ of disjoint sets $Q$ of finite measure (which
we will call atoms or cubes) with the property that every set in $\mathcal{F}_n$ can be written
as a union of atoms (cubes) $Q \in \mathcal{D}_n$.
Denote by $\mathcal{D}$ the collection of all atoms, $\mathcal{D} = \bigcup_{n \in \mathbb{Z}} \mathcal{D}_n$. A set $Q$ could belong
to multiple generations $\mathcal{D}_n$, so atoms $Q \in \mathcal{D}_n$ should formally be represented as
pairs $(Q, n)$. However, to simplify notation, we will suppress the dependence on $n$
and write $Q$ instead of $(Q, n)$; if the “time” $n$ is needed, it will be represented by
$\operatorname{rk} Q$, i.e. if $Q$ stands for the atom $(Q, n)$ then we will say $n = \operatorname{rk} Q$. The inclusion
$R \subset Q$ for atoms should be understood as set inclusion, together with the inequality
$\operatorname{rk} R \ge \operatorname{rk} Q$. In particular, for any $r \in \mathbb{Z}$, $\mathrm{Ch}^r Q$ will stand for the children of order
$r$ of $Q$, the collection of atoms $R \subset Q$ with $\operatorname{rk} R = r + \operatorname{rk} Q$. For $r = 1$, we write
$\mathrm{Ch}\, Q$ and avoid the superscript.
For a measurable set E, and the underlying measure σ on the space X , we will
often use the notation |E| to represent σ(E) and dx for dσ(x).
Example 2.1.1. A standard example of an atomic filtration is the standard dyadic
filtration in Rd with the Lebesgue measure.
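For $n \in \mathbb{Z}$, let
\[ \mathcal{D}_n := \{ 2^{-n}((0, 1]^d + k) : k \in \mathbb{Z}^d \} \]
be the collection of dyadic cubes of size $2^{-n}$. Then each $\mathcal{F}_n$ is the $\sigma$-algebra generated
by $\mathcal{D}_n$, and $\mathcal{F}$ is the Borel $\sigma$-algebra.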
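To make the indexing concrete, here is a small Python sketch of the standard dyadic filtration in dimension $d = 1$. It is an illustration under that assumption only; the function names `dyadic_atom` and `children` are hypothetical, and exact rational arithmetic is used so that endpoints are not affected by rounding.

```python
import math
from fractions import Fraction

def dyadic_atom(x, n):
    """Atom (a, b] of D_n = {2^{-n}((0, 1] + k) : k in Z} containing x (case d = 1)."""
    size = Fraction(2) ** (-n)                 # side length 2^{-n}
    k = math.ceil(Fraction(x) / size) - 1      # x lies in (k * size, (k + 1) * size]
    return (k * size, (k + 1) * size)

def children(atom):
    """The two atoms of the next rank contained in the given atom."""
    a, b = atom
    mid = (a + b) / 2
    return [(a, mid), (mid, b)]

# The atom of rank 2 containing 3/8 is (1/4, 1/2].
print(dyadic_atom(Fraction(3, 8), 2))
```

Because the intervals are closed on the right, the point $x = 1$ belongs to $(0, 1]$ at rank 0, which matches the convention $(0, 1]^d$ used in the definition.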
Note that in this example we do not have atoms of different ranks coinciding as
sets.
The standard dyadic filtration also leads to more interesting examples, such as
the one below:
Example 2.1.2. Consider a measurable set $\mathcal{X} \subset \mathbb{R}^d$, again endowed with the Lebesgue
measure. For each $n \in \mathbb{Z}$, define the collection of atoms $\mathcal{D}_n$ as the collection of all
non-empty intersections $Q \cap \mathcal{X}$, where $Q$ runs over all dyadic cubes of size $2^{-n}$ from
the previous example. If, for example, $\mathcal{X} = Q_0 = (0, 1]^d$, then $Q_0 \in \mathcal{D}_n$ for all $n \le 0$,
so we have cubes of different ranks, coinciding as sets.
By taking more complicated sets X , we can have more complicated structures of
atoms and their ranks. Furthermore, we can expand upon these examples by letting
the underlying measure σ be any arbitrary Radon measure.
2.2 Calderón-Zygmund operators
Let $\sigma$ be a measure on $\mathbb{R}^d$. A singular integral operator is an operator $T$ on the
function space $L^p(\sigma)$ given formally by the expression
\[ Tf(x) = \int K(x, y) f(y) \, d\sigma(y), \]
where the kernel $K(x, y)$ exhibits a singularity near $x = y$ (i.e. $K(\cdot, y)$ and $K(x, \cdot)$
are not locally integrable as functions of $x$ and $y$ respectively). A classical example
of such an operator is the Hilbert transform on the real line:
\[ Hf(x) = \int_{\mathbb{R}} \frac{1}{x - y} f(y) \, dy. \]
In particular, a singular integral operator is said to belong to the Calderón-
Zygmund class if it is bounded on L2 and its kernel K satisfies the following growth
and cancellation conditions:
• $|K(x, y)| \le C |x - y|^{-d}$, $x \ne y$
• $|K(x, y) - K(x', y)|, \ |K(y, x) - K(y, x')| \le C \dfrac{|x - x'|^{\delta}}{|x - y|^{d + \delta}}$, for $|x - x'| \le \dfrac{|x - y|}{2}$,
for some constants $C > 0$ and $\delta > 0$.
One can easily see that the Hilbert transform is a Calderón-Zygmund operator.
Other examples of such operators are the Riesz transforms
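\[ R_j f(x) = \int_{\mathbb{R}^d} \frac{x_j - y_j}{|x - y|^{d+1}} f(y) \, d\sigma(y), \qquad 1 \le j \le d, \]
and the Ahlfors-Beurling transform
\[ Bf(z) = \int_{\mathbb{C}} \frac{f(w)}{(z - w)^2} \, d\sigma(w). \]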
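As a quick check of the claim that the Hilbert kernel satisfies the two kernel conditions, here is a short computation with $d = 1$ and $K(x, y) = 1/(x - y)$. The constants $C = 2$ and $\delta = 1$ below are one admissible choice, not necessarily the ones intended in the text, and the $L^2$ boundedness of $H$ is a separate classical fact that is not verified here.

```latex
% Size condition (holds with C = 1, hence also with C = 2):
\[ |K(x, y)| = |x - y|^{-1}, \qquad x \ne y. \]
% Smoothness: if |x - x'| \le |x - y| / 2, then
% |x' - y| \ge |x - y| - |x - x'| \ge |x - y| / 2, and therefore
\[ |K(x, y) - K(x', y)| = \left| \frac{1}{x - y} - \frac{1}{x' - y} \right|
   = \frac{|x - x'|}{|x - y| \, |x' - y|} \le \frac{2 |x - x'|}{|x - y|^{2}}. \]
% The same bound holds for |K(y, x) - K(y, x')|, since
\[ |K(y, x) - K(y, x')| = \frac{|x - x'|}{|y - x| \, |y - x'|}. \]
```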
2.3 Scalar weights
A (scalar) weight w is a nonnegative, locally integrable function. For a weight w,
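one can define the space $L^p(w)$ as the normed function space with norm given by
\[ \|f\|_{L^p(w)} := \left( \int |f(x)|^p w(x) \, dx \right)^{1/p} < \infty. \]
One can also define the $A_p$ characteristic of $w$, denoted $[w]_{A_p}$:
\[ [w]_{A_p} := \sup_Q \left( \frac{1}{|Q|} \int_Q w(x) \, dx \right) \left( \frac{1}{|Q|} \int_Q w^{-p'/p}(x) \, dx \right)^{p/p'}, \]
where the supremum is taken over all atoms $Q$ and $p'$ is the Hölder conjugate exponent
to $p$, $\frac{1}{p} + \frac{1}{p'} = 1$. We will say that $w$ belongs to the Muckenhoupt $A_p$ class if the
quantity $[w]_{A_p}$ is finite. In particular, $w$ is said to be an $A_2$ weight if
\[ [w]_{A_2} := \sup_Q \left( \frac{1}{|Q|} \int_Q w(x) \, dx \right) \left( \frac{1}{|Q|} \int_Q w^{-1}(x) \, dx \right) < \infty. \]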
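For intuition, the $A_2$ characteristic can be computed exactly in a toy discrete setting. The Python sketch below assumes a weight that is constant on each of $2^m$ equal cells of $[0, 1)$, Lebesgue measure, and atoms given by the dyadic subintervals down to single cells; the function name `a2_characteristic` is hypothetical.

```python
def a2_characteristic(w):
    """Sup over dyadic blocks Q of (average of w on Q) * (average of 1/w on Q).

    w is a list of positive cell values whose length is a power of two.
    """
    n = len(w)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    best = 0.0
    length = n
    while length >= 1:
        for start in range(0, n, length):
            block = w[start:start + length]
            avg_w = sum(block) / length
            avg_inv = sum(1.0 / v for v in block) / length
            best = max(best, avg_w * avg_inv)
        length //= 2
    return best

# A constant weight has characteristic 1; an oscillating weight has a larger one.
print(a2_characteristic([2.0] * 8), a2_characteristic([1.0, 4.0]))
```

By the Cauchy-Schwarz inequality each product of averages is at least 1, so the returned value is never below 1 in this setting.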
2.4 Matrix weights
We can extend the weighted theory to the setup of vector-valued function spaces.
Let $\mathcal{F}_0$ be the collection of sets $E \cap F$ where $E \in \mathcal{F}$ and $F$ is a finite union of
atoms. A $d \times d$ matrix-valued measure $\mathbf{W}$ on $\mathcal{X}$ is a countably additive function on
$\mathcal{F}_0$ with values in the set of non-negative operators on $\mathbb{F}^d$, where $\mathbb{F}$ is either $\mathbb{C}$ or
$\mathbb{R}$. Equivalently, $\mathbf{W} = (w_{j,k})_{j,k=1}^d$ is a $d \times d$ matrix whose entries $w_{j,k}$ are (possibly
signed or even complex-valued) measures, finite on atoms, and such that for any
$E \in \mathcal{F}_0$ the matrix $(w_{j,k}(E))_{j,k=1}^d$ is positive semidefinite. Note that the measure $\mathbf{W}$
is always finite on atoms.
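Given such a measure $\mathbf{W}$ and measurable functions $f = (f_1, f_2, \dots, f_d)^T$ and
$g = (g_1, g_2, \dots, g_d)^T$ with values in $\mathbb{F}^d$, we can define the integrals
\[ \int_{\mathcal{X}} \langle d\mathbf{W} f, g \rangle_{\mathbb{F}^d} := \sum_{j,k=1}^d \int_{\mathcal{X}} f_k \overline{g_j} \, dw_{j,k}, \qquad \int_{\mathcal{X}} d\mathbf{W} f, \]
where the second integral is a vector whose $j$th coordinate is given by
\[ \sum_{k=1}^d \int_{\mathcal{X}} f_k \, dw_{j,k}. \]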
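Remark 2.4.1. Readers not comfortable with matrix-valued measures can always,
without loss of generality, restrict themselves to working with absolutely continuous
measures and matrix-valued functions. Namely, it is an easy corollary of the
non-negativity of the matrix measure $\mathbf{W}$ that all the measures $w_{j,k}$ are absolutely
continuous with respect to the measure
\[ w := \operatorname{tr} \mathbf{W} := \sum_{k=1}^d w_{k,k}. \]
Therefore, we can write $d\mathbf{W} = W \, dw$, where $W$ is a $w$-a.e. positive semi-definite
matrix-valued function (the density of the measure $\mathbf{W}$) and
\[ \int_{\mathcal{X}} d\mathbf{W} f = \int_{\mathcal{X}} W f \, dw, \qquad \int_{\mathcal{X}} \langle d\mathbf{W} f, g \rangle_{\mathbb{F}^d} = \int_{\mathcal{X}} \langle W f, g \rangle_{\mathbb{F}^d} \, dw. \]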
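For a measure $\mathbf{W}$, the weighted space $L^2(\mathbf{W})$ is defined as the set of all measurable
$\mathbb{F}^d$-valued functions (where $\mathbb{F}$ is $\mathbb{R}$ or $\mathbb{C}$) such that
\[ \|f\|_{L^2(\mathbf{W})}^2 := \int_{\mathcal{X}} \langle d\mathbf{W} f, f \rangle_{\mathbb{F}^d} < \infty. \]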
As usual, we will consider the quotient space over the set of functions of norm 0.
2.5 The matrix A2 class
As in the scalar case, if $\mathbf{W}$ is a matrix measure, one can define a version of the $A_2$
characteristic:
\[ [\mathbf{W}]_{A_2} := \sup_Q |Q|^{-2} \left\| \mathbf{W}(Q)^{1/2} \, \mathbf{W}^{-1}(Q)^{1/2} \right\|^2 < \infty, \]
where the supremum is taken over all atoms $Q$.
Remark 2.5.1. Although the matrix $A_2$ condition looks like a natural extension of its
scalar counterpart, the matrix $A_p$ condition has a more complicated form for $p \ne 2$
and requires the introduction of so-called $A_p$ metrics. See [Vol97] for a more in-depth
discussion.
The matrix $A_2$ conjecture claims that if $T$ is Calderón-Zygmund and $\mathbf{W}$ is a
weight in the $A_2$ class, then
\[ \|T\|_{L^2(\mathbf{W}) \to L^2(\mathbf{W})} \lesssim [\mathbf{W}]_{A_2}. \]
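The matrix $A_2$ characteristic can also be evaluated in a toy setting. The sketch below makes several assumptions that are ours rather than the text's: the weight is a real symmetric positive definite $2 \times 2$ matrix function, constant on each of $2^m$ equal cells of $[0, 1)$; the atoms are the dyadic blocks of cells; and $W(Q)$, $W^{-1}(Q)$ are read as the integrals over $Q$ of the density and of its inverse. It uses the linear algebra identity $\|A^{1/2} B^{1/2}\|^2 = \lambda_{\max}(AB)$ for positive semidefinite $A$, $B$, which avoids computing matrix square roots. All function names are hypothetical.

```python
import math

def mean2(ms):
    """Entrywise average of a list of 2x2 matrices."""
    n = len(ms)
    return [[sum(m[i][j] for m in ms) / n for j in range(2)] for i in range(2)]

def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def mul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def top_eigenvalue2(m):
    """Largest eigenvalue of a 2x2 matrix with real eigenvalues."""
    t = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return (t + math.sqrt(max(t * t - 4 * det, 0.0))) / 2

def matrix_a2(w):
    """Sup over dyadic blocks Q of lambda_max(<W>_Q <W^{-1}>_Q)."""
    n = len(w)
    w_inv = [inv2(m) for m in w]
    best = 0.0
    length = n
    while length >= 1:
        for start in range(0, n, length):
            prod = mul2(mean2(w[start:start + length]),
                        mean2(w_inv[start:start + length]))
            best = max(best, top_eigenvalue2(prod))
        length //= 2
    return best

identity = [[1.0, 0.0], [0.0, 1.0]]
print(matrix_a2([identity] * 4))
```

For a diagonal weight the computation decouples into scalar $A_2$ characteristics of the diagonal entries, which gives a convenient sanity check.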
CHAPTER 3
Scalar weighted estimates
The content of this chapter overlaps with that of [Cul15] by the author.
Let (X ,F , σ) be a σ-finite measure space with an atomic filtration Fn as described
in the previous chapter and let D be the collection of atoms. Since a classical example
of such a filtration is that given by a dyadic lattice on Rd, we may often refer to D
as a lattice and to its elements as cubes. In spite of this language, we will not be
making any further assumptions on the underlying structure of the space, including
for example any assumptions about the homogeneity of X with the measure σ. Let
µ and ν be measures, finite on all Q ∈ D.
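For a sequence of functions $a = \{a_Q\}_{Q \in \mathcal{D}}$, $a_Q : \mathcal{X} \to [0, \infty)$, indexed by the
sequence of cubes, define the operator $M_a^q$, given by
\[ M_a^q f\mu(x) = \left( \sum_{Q \in \mathcal{D},\, x \in Q} \left| \left( \int_Q f \, d\mu \right) a_Q(x) 1_Q(x) \right|^q \right)^{1/q} \quad \text{for } 1 < q < \infty \]
and
\[ M_a^\infty f\mu(x) = \sup \left\{ \left| \left( \int_Q f \, d\mu \right) a_Q(x) 1_Q(x) \right| : Q \in \mathcal{D} \text{ such that } x \in Q \right\}. \]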
In this chapter we will show that under a so-called Sawyer type testing condition,
the operator $f \mapsto M_a^q f\mu$ is bounded $L^p(\mu) \to L^p(\nu)$ for $p \le q$. Testing conditions
of this type were named after E. Sawyer, who introduced them in [Saw82] for the
purpose of studying the two weight estimates for the classical maximal operator M .
The testing condition presented in [Saw82] essentially amounts to a uniform estimate
on characteristic functions of dyadic cubes. Later, in [Saw88], Sawyer proved that
for operators such as fractional integrals, Poisson kernels, and other nonnegative
kernels, the two weight estimate still holds if one assumes the testing condition not
only on the operator itself, but also on its formal adjoint. For the positive martingale
operators such results were obtained in [NTV99] (p = 2) and later in [LS09] (general
p) (see also [Tre12] for an easier argument).
3.1 Two weight bounds for $M_a^q$
The main result presented in this chapter is a simple proof of the theorem below. The
novelty of this theorem, compared to Sawyer’s result in [Saw82], is that it allows the $a_Q$
to be nonnegative functions and also considers the case $q < \infty$ (as opposed to just
$q = \infty$). Furthermore, our argument is not restricted to the dyadic case, and will
also hold in a nonhomogeneous setting.
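For $Q \in \mathcal{D}$, define the truncated operator $M_{a,Q}^q$ by
\[ M_{a,Q}^q f\mu(x) := \left( \sum_{R \in \mathcal{D},\, R \subset Q,\, x \in R} \left| \left( \int_R f \, d\mu \right) a_R(x) 1_R(x) \right|^q \right)^{1/q}, \]
with the obvious modification for $q = \infty$.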
We have the following theorem:
Theorem 3.1.1. Let $1 < p \le q \le \infty$. The operator $M_a^q$ satisfies
\[ \|M_a^q f\mu\|_{L^p(\nu)} \le A \|f\|_{L^p(\mu)} \quad \forall f \in L^p(\mu) \tag{3.1} \]
if and only if the following testing condition holds for the truncation $M_{a,Q}^q$:
\[ \|M_{a,Q}^q(1_Q \mu)\|_{L^p(\nu)} \le B \mu(Q)^{1/p} \quad \text{for any } Q \in \mathcal{D}. \tag{3.2} \]
Moreover, for the best constants $A$ and $B$, we have $B \le A \le C(p) B$, where
\[ C(p) = \left( (1 + 1/p)^{p+1} \, p \right)^{1/p} p', \]
and $p'$ is the Hölder conjugate of $p$, $1/p + 1/p' = 1$.
3.2 Observations
Before proving the theorem, we make a few observations. First, it is easy to see that
for $p = q = \infty$, the result is trivial with $A = B$. Note that $\lim_{p \to \infty} C(p) = 1$.
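The behaviour of the constant can be checked numerically. The Python sketch below reads the constant as $C(p) = ((1 + 1/p)^{p+1} p)^{1/p} p'$ and assumes, as our own interpretation of the stopping-time argument in Section 3.3, that it arises by minimizing $r \, (r/(r-1))^{1/p} \, p'$ over the stopping parameter $r > 1$. The function names are hypothetical.

```python
def holder_conjugate(p):
    """Hölder conjugate exponent p' with 1/p + 1/p' = 1."""
    return p / (p - 1)

def C(p):
    """The constant ((1 + 1/p)^(p+1) * p)^(1/p) * p'."""
    return ((1 + 1 / p) ** (p + 1) * p) ** (1 / p) * holder_conjugate(p)

def bound(p, r):
    """Assumed form of the constant for a given stopping parameter r > 1."""
    return r * (r / (r - 1)) ** (1 / p) * holder_conjugate(p)

# The choice r = 1 + 1/p reproduces C(p), and C(p) approaches 1 for large p.
for p in (2.0, 4.0, 100.0, 1e6):
    print(p, C(p), bound(p, 1 + 1 / p))
```

Under this reading, the value $r = 1 + 1/p$ is the critical point of $r^{1 + 1/p} (r - 1)^{-1/p}$, which is consistent with the closed form above.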
A second observation is that the classical dyadic (martingale) maximal operator
$M$ is a particular case of the operator $M_a^q$ where $q = \infty$, $a_Q \equiv \sigma(Q)^{-1} 1_Q$, and $\mathcal{D}$ is
a dyadic lattice in $\mathbb{R}^n$. Therefore, one can view $M_a^q$ as a generalization of the
classical martingale maximal function.
In [Saw82], E. Sawyer considered slightly more general maximal operators $M =
M_\alpha$, which are a particular case of our $M_a^q$ with $q = \infty$ and $a_Q \equiv \sigma(Q)^{-\alpha} 1_Q$, $0 < \alpha \le 1$.
He characterized the measures $\mu$, $\nu$ and $\sigma$ for which the inequality
\[ \|M f\sigma\|_{L^p(\nu)} \le A \|f\|_{L^p(\mu)} \quad \forall f \in L^p(\mu) \tag{3.3} \]
holds.
Note that without loss of generality one can assume that $\mu$ is absolutely continuous
with respect to $\sigma$, $d\mu = w \, d\sigma$ (adding a singular part to $\mu$ does not change (3.3)).
So, making the standard change of weight $f \mapsto w^{p'/p} f$ and denoting $\mu := w^{-p'/p} \sigma$, we
transform the above estimate (3.3) to (3.1). Then in this notation the necessary and
sufficient condition obtained by E. Sawyer is exactly the testing condition (3.2).
For the classical dyadic maximal operator $M = M_\alpha$, the truncation $M_{a,Q}^\infty 1_Q \mu$
defined above is equivalent to $1_Q M 1_Q \mu$, so up to a change of measure, our setup is
identical to [Saw82].
Finally, we remark that the reduction to the two measure setup, eliminating
the underlying measure σ, is now considered standard for weighted estimates. In
the three measures setup for the classical maximal operator as in [Saw82] all the
information about $\sigma$ is captured by the (constant in this case) functions $a_Q$.
To obtain Sawyer’s estimate for the non-martingale maximal function, one can use
the two weighted estimate for the dyadic case and proceed by an averaging argument.
This reasoning is fairly standard and will not be discussed in this chapter.
Our proof simplifies the argument in [Saw82] and gives a stronger result: in
particular, the coefficients $a_Q$ do not need to be constant. It also has the additional
benefit of placing the Hardy-Littlewood maximal function in the context of a wide
range of similar operators.
The proof we present relies on the stopping time construction presented in [Tre12]
and the Martingale Carleson Embedding Theorem stated below.
Denote
\[ \fint_Q f \, d\mu = \frac{1}{\mu(Q)} \int_Q f \, d\mu. \]
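Theorem 3.2.1. (Martingale Carleson Embedding Theorem) Let $\mu$ be a measure on
$\mathcal{X}$ and let $\{w_Q\}_{Q \in \mathcal{D}}$, $w_Q \ge 0$, be a sequence satisfying the following condition:
\[ \sum_{Q \in \mathcal{D},\, Q \subset R} w_Q \le A \mu(R) \quad \text{for any cube } R \in \mathcal{D} \text{ and some constant } A. \]
Then for any measurable function $f \ge 0$ and for any $p \in (1, \infty)$,
\[ \sum_{Q \in \mathcal{D}} \left( \fint_Q f \, d\mu \right)^p w_Q \le (p')^p A \|f\|_{L^p(\mu)}^p. \]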
The Carleson Embedding Theorem with the constant (p′)p can be proved as a
straightforward consequence of the one weight Lp boundedness of the classical Hardy-
Littlewood maximal function (see [Tre12]). Other arguments that include the sharp
constant have been given by Nazarov, Treil, and Volberg [NTV01] for $p = 2$ and Lai
[Lai15] for $p \ne 2$ using Bellman function techniques. The exact Bellman function for
$p > 1$ was originally computed by Melas in [Mel05], but the sharp constant was not
explicitly stated.
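The embedding inequality of Theorem 3.2.1 can be tested on a finite example. The following Python sketch assumes a toy space of $2^m$ cells, each of $\mu$-mass 1, with atoms given by the dyadic blocks of cells; the helper names are hypothetical, and the check is a numerical illustration rather than a proof.

```python
def dyadic_intervals(n):
    """All dyadic blocks (start, length) of range(n), where n is a power of two."""
    length = n
    while length >= 1:
        for start in range(0, n, length):
            yield (start, length)
        length //= 2

def carleson_constant(w, n):
    """Smallest A with sum_{Q subset R} w_Q <= A * mu(R) for every dyadic block R."""
    best = 0.0
    for r_start, r_len in dyadic_intervals(n):
        inside = sum(wq for (s, l), wq in w.items()
                     if r_start <= s and s + l <= r_start + r_len)
        best = max(best, inside / r_len)
    return best

def embedding_sides(f, w, p):
    """Return (left side, right side) of the embedding inequality for f >= 0."""
    n = len(f)
    p_conj = p / (p - 1)
    lhs = sum((sum(f[s:s + l]) / l) ** p * wq for (s, l), wq in w.items())
    rhs = p_conj ** p * carleson_constant(w, n) * sum(v ** p for v in f)
    return lhs, rhs

f = [1, 1, 1, 1, 1, 1, 1, 9]
w = {q: q[1] for q in dyadic_intervals(len(f))}   # w_Q = mu(Q) for every block
lhs, rhs = embedding_sides(f, w, 2)
print(lhs <= rhs)
```

With $w_Q = \mu(Q)$ on all blocks, the Carleson constant equals the number of generations, so the example also shows how the constant $A$ grows when the sequence is not concentrated on a sparse family.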
3.3 Proof of the two weight estimate
We aim to prove Theorem 3.1.1. Again, we note that while we may refer to the
elements Q of D as cubes, they may not be dyadic cubes, and D is not necessarily
assumed to be the dyadic lattice on Rd.
Proof. First notice that the necessity of the testing condition and the estimate $B \le A$
are trivial: if $M_a^q$ is bounded on $L^p(\mu)$ functions, it is, in particular, bounded on
characteristic functions. Thus, by testing $M_a^q$ on the functions $1_Q$ and using the pointwise
bound $M_{a,Q}^q(1_Q \mu) \le M_a^q(1_Q \mu)$, we obtain condition
(3.2).
To prove sufficiency, we begin by constructing a collection of stopping cubes
$\mathcal{G} \subset \mathcal{D}$, following the definitions and notation in [Tre12]. For any cube $Q \in \mathcal{D}$, define
$\mathcal{D}(Q)$ to be the collection of subcubes of $Q$ in $\mathcal{D}$. For a fixed $r > 1$, let $\mathcal{G}^*(Q)$ be the
set of stopping cubes of $Q$, that is,
\[ \mathcal{G}^*(Q) = \left\{ R \in \mathcal{D}(Q),\ R \text{ maximal} : \frac{1}{\mu(R)} \int_R f \, d\mu \ge r \, \frac{1}{\mu(Q)} \int_Q f \, d\mu \right\}, \]
where maximality is considered with respect to the partial ordering given by inclusion.
Denote by E(Q) the collection of descendants of Q that are not stopping cubes
or descendants of the stopping cubes:
\[ \mathcal{E}(Q) = \mathcal{D}(Q) \setminus \bigcup_{P\in\mathcal{G}^*(Q)} \mathcal{D}(P). \]
Note that, by definition, for any R ∈ E(Q),
\[ \fint_R f\,d\mu < r \fint_Q f\,d\mu. \tag{3.4} \]
Also note that
\[ \sum_{R\in\mathcal{G}^*(Q)} \mu(R) = \mu\Big( \bigcup_{R\in\mathcal{G}^*(Q)} R \Big) \le \frac{\mu(Q)}{r}. \tag{3.5} \]
To construct the collection G of stopping cubes, let N be a fixed large positive
integer and define the first generation G1 as
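\[ \mathcal{G}_1 = \mathcal{D}_{-N}. \]
Then, to obtain the subsequent generations, apply the inductive formula
\[ \mathcal{G}_{n+1} = \bigcup_{Q\in\mathcal{G}_n} \mathcal{G}^*(Q). \]
Define the collection of stopping cubes G to be the union
\[ \mathcal{G} = \bigcup_{n=1}^{\infty} \mathcal{G}_n. \]
Equation (3.5) implies that
\[ \sum_{R\in\mathcal{G},\, R\subset Q} \mu(R) \le \frac{r}{r-1}\,\mu(Q), \qquad \forall Q\in\mathcal{D}. \tag{3.6} \]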
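The stopping construction and the estimates (3.5) and (3.6) can be tested numerically. The sketch below is an illustration under simplifying assumptions that are not part of the text: the lattice is the dyadic lattice of [0,1) truncated at a finite depth, µ is Lebesgue measure, f is strictly positive and constant on the finest cells, and the first generation is the single interval [0,1). All function names are hypothetical.

```python
import random


def build_averages(values):
    """Averages of f over dyadic intervals (level, index) of [0, 1)."""
    n = len(values).bit_length() - 1
    avg = {(n, i): v for i, v in enumerate(values)}
    for level in range(n - 1, -1, -1):
        for i in range(2 ** level):
            avg[(level, i)] = (avg[(level + 1, 2 * i)]
                               + avg[(level + 1, 2 * i + 1)]) / 2
    return avg, n


def stopping_children(avg, depth, Q, r):
    """Maximal proper dyadic subintervals R of Q with avg(R) >= r * avg(Q)."""
    threshold = r * avg[Q]
    found = []
    stack = [(Q[0] + 1, 2 * Q[1]), (Q[0] + 1, 2 * Q[1] + 1)] if Q[0] < depth else []
    while stack:
        R = stack.pop()
        if avg[R] >= threshold:
            found.append(R)  # searching top-down, so the first hit is maximal
        elif R[0] < depth:
            stack += [(R[0] + 1, 2 * R[1]), (R[0] + 1, 2 * R[1] + 1)]
    return found


def stopping_collection(avg, depth, r):
    """Union of all generations, starting from the root interval."""
    generation, collection = [(0, 0)], []
    while generation:
        collection += generation
        generation = [R for Q in generation
                      for R in stopping_children(avg, depth, Q, r)]
    return collection


def measure(Q):
    return 2.0 ** -Q[0]


def contains(Q, R):
    """True if the dyadic interval R is a subset of Q."""
    return R[0] >= Q[0] and (R[1] >> (R[0] - Q[0])) == Q[1]


def check_stopping_estimates(values, r, tol=1e-12):
    """Return (ok_35, ok_36): whether (3.5) and (3.6) hold for this data."""
    avg, depth = build_averages(values)
    G = stopping_collection(avg, depth, r)
    ok_35 = all(
        sum(measure(R) for R in stopping_children(avg, depth, Q, r))
        <= measure(Q) / r + tol
        for Q in G)
    ok_36 = all(
        sum(measure(R) for R in G if contains(Q, R))
        <= r / (r - 1) * measure(Q) + tol
        for Q in avg)
    return ok_35, ok_36


if __name__ == "__main__":
    random.seed(1)
    f_values = [0.1 + random.random() ** 4 * 10 for _ in range(2 ** 7)]
    print(check_stopping_estimates(f_values, r=2.0))
```

Strict positivity of f is assumed so that an interval is never its own stopping interval; the finite depth replaces the limiting argument in N.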
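To prove the theorem it is sufficient to prove uniform bounds in N for the operator $M^{q,N}_a$,
\[ M^{q,N}_a f := \Big( \sum_{n\ge -N} \sum_{Q\in\mathcal{D}_n} \Big( \Big( \fint_Q f\,d\mu \Big) \mu(Q)\, a_Q(x)\, \mathbf{1}_Q(x) \Big)^q \Big)^{1/q} \]
(with the obvious change for q = ∞) and then let N → ∞.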
Given the construction of stopping moments, it is easy to see that
\[ \bigcup_{n=-N}^{\infty} \mathcal{D}_n = \bigcup_{Q\in\mathcal{G}} \mathcal{E}(Q), \]
and that the sets E(Q) are disjoint.
In the proof below we use notation for 1 < q < ∞. The proof for q = ∞ is
absolutely the same (up to obvious changes in the notation).
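Denoting
\[ F_Q(x) = \Big( \sum_{R\in\mathcal{E}(Q)} \Big( \Big( \fint_R f\,d\mu \Big) \mu(R)\, a_R(x)\, \mathbf{1}_R(x) \Big)^q \Big)^{1/q}, \]
we can write $M^{q,N}_a f = \big( \sum_{Q\in\mathcal{G}} F_Q^q \big)^{1/q}$, so the proof amounts to bounding
\[ \Big\| \Big( \sum_{Q\in\mathcal{G}} F_Q^q \Big)^{1/q} \Big\|_{L^p(\nu)}. \]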
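Since $\|x\|_{\ell^q} \le \|x\|_{\ell^p}$ for q ≥ p, we can estimate
\begin{align}
\Big\| \Big( \sum_{Q\in\mathcal{G}} F_Q^q \Big)^{1/q} \Big\|_{L^p(\nu)}
&\le \Big\| \Big( \sum_{Q\in\mathcal{G}} F_Q^p \Big)^{1/p} \Big\|_{L^p(\nu)} \tag{3.7} \\
&= \Big( \int \sum_{Q\in\mathcal{G}} F_Q^p \,d\nu \Big)^{1/p}
 = \Big( \sum_{Q\in\mathcal{G}} \int F_Q^p \,d\nu \Big)^{1/p}
 = \Big( \sum_{Q\in\mathcal{G}} \|F_Q\|^p_{L^p(\nu)} \Big)^{1/p}. \notag
\end{align}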
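By definition,
\begin{align}
\|F_Q\|^p_{L^p(\nu)}
&= \Big\| \Big( \sum_{R\in\mathcal{E}(Q)} \Big( \Big( \fint_R f\,d\mu \Big) \mu(R)\, a_R \mathbf{1}_R \Big)^q \Big)^{1/q} \Big\|^p_{L^p(\nu)} \tag{3.8} \\
&\le \Big\| \Big( \sum_{R\in\mathcal{E}(Q)} \Big( \Big( r \fint_Q f\,d\mu \Big) \mu(R)\, a_R \mathbf{1}_R \Big)^q \Big)^{1/q} \Big\|^p_{L^p(\nu)} \quad \text{by (3.4)} \notag \\
&= r^p \Big( \fint_Q f\,d\mu \Big)^p \Big\| \Big( \sum_{R\in\mathcal{E}(Q)} \Big( \Big( \int_R \mathbf{1}_Q\,d\mu \Big) a_R \mathbf{1}_R \Big)^q \Big)^{1/q} \Big\|^p_{L^p(\nu)} \notag \\
&\le r^p B^p \Big( \fint_Q f\,d\mu \Big)^p \mu(Q) \quad \text{by (3.2)}. \notag
\end{align}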
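Therefore, from inequalities (3.7) and (3.8) we obtain
\[ \Big\| \Big( \sum_{Q\in\mathcal{G}} F_Q^q \Big)^{1/q} \Big\|_{L^p(\nu)}
\le \Big( \sum_{Q\in\mathcal{G}} \|F_Q\|^p_{L^p(\nu)} \Big)^{1/p}
\le rB \Big( \sum_{Q\in\mathcal{G}} \Big( \fint_Q f\,d\mu \Big)^p \mu(Q) \Big)^{1/p}. \]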
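The final step is to apply Theorem 3.2.1, taking
\[ w_Q = \begin{cases} \mu(Q), & Q\in\mathcal{G}, \\ 0, & Q\notin\mathcal{G}. \end{cases} \]
Equation (3.6) shows that the sequence $w_Q$ satisfies the Carleson measure condition with constant r/(r − 1). Hence
\[ \sum_{Q\in\mathcal{G}} \Big( \fint_Q f\,d\mu \Big)^p \mu(Q) \le \frac{r}{r-1}\,(p')^p\, \|f\|^p_{L^p(\mu)}. \]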
Consequently,
\[ \big\| M^{q,N}_a f\mu \big\|^p_{L^p(\nu)} \le \frac{r^{p+1}}{r-1}\,(p')^p B^p\, \|f\|^p_{L^p(\mu)}. \]
In particular, since no assumption was made on r other than r > 1, one can consider
the minimal value of the constant on the right hand side, which is attained when
$r = \frac{p+1}{p}$. Then
\[ \big\| M^{q,N}_a f\mu \big\|^p_{L^p(\nu)} \le \Big( 1 + \frac{1}{p} \Big)^{p+1} p\,(p')^p B^p\, \|f\|^p_{L^p(\mu)}. \]
Observe that the right hand side above does not depend on the choice of N .
Taking the limit as N approaches ∞ completes the proof for $M^q_a$.
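The choice r = (p + 1)/p can be double-checked numerically. The sketch below compares the factor $r^{p+1}/(r-1)$ on a grid of values r > 1 with its value at r = (p + 1)/p and with the closed form $(1 + 1/p)^{p+1} p$. The function names and the grid are assumptions made for this illustration.

```python
def stopping_constant(r, p):
    """The r-dependent factor r**(p+1) / (r - 1) in the final estimate."""
    return r ** (p + 1) / (r - 1)


def optimal_r(p):
    """Critical point of stopping_constant in r, obtained from
    (p + 1)/r = 1/(r - 1)."""
    return (p + 1) / p


def closed_form_minimum(p):
    """Value of the factor at the optimal r: (1 + 1/p)**(p+1) * p."""
    return (1 + 1 / p) ** (p + 1) * p


def grid_minimum(p, steps=5000, step_size=1e-3):
    """Smallest value of the factor over r = 1 + k * step_size, k = 1..steps."""
    return min(stopping_constant(1 + k * step_size, p)
               for k in range(1, steps + 1))


if __name__ == "__main__":
    for p in (1.5, 2.0, 3.0):
        print(p, optimal_r(p), closed_form_minimum(p), grid_minimum(p))
```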
CHAPTER 4
The Carleson Embedding Theorem
The content of this chapter overlaps with that of [CT15], by the author and her
advisor, S. Treil.
As evident from the previous chapter, the Carleson Embedding Theorem is an
invaluable tool in the proof of weighted estimates in scalar-valued function spaces.
As one may expect, a similar embedding theorem is essential for proving analogous
matrix weighted bounds. In what follows we introduce this result, obtained in [CT15]
by the author and S. Treil. It is important to mention that while earlier versions of
the matrix weighted Carleson Embedding Theorem were known through the work of
F. Nazarov, S. Treil, and A. Volberg in [TV97] and, more recently, Isralowitz, Kwon,
and Pott in [IKP14], and Bickel and Wick in [BW15], all these results required strong
additional assumptions, such as the weight belonging to the A2 class. It quickly
becomes clear that the [W ]A2 dependence is a particular concern for the problem of
proving sharp bounds.
The weighted embedding theorem presented below does not assume any prop-
erties for the matrix weight except local boundedness, and produces an embedding
constant that depends polynomially on the dimension of the space. As in the scalar
case, our embedding theorem states that the Carleson measure condition, which is
just a simple testing condition, implies the embedding.
For matrix weights, the Carleson measure condition (condition (ii) in Theorem
4.1.1 or condition (iii) in Theorem 4.1.2) is an inequality between positive semidefi-
nite matrices. For scalar weights in the domain, the right hand side of the inequality
is a multiple of the identity matrix I: in this situation, sacrificing constants, one can
replace matrices by their norms, and the matrix embedding theorem trivially follows
from the scalar one. Of course, the constants obtained by such trivial reduction are
far from optimal: constants of optimal order were obtained using more complicated
reasoning in [NPTV02]. For our setup, both sides of the Carleson measure condition
are general positive semidefinite matrices, so the simple strategy of replacing matri-
ces by norms or traces will not work. A more complicated idea, in the spirit of the
argument in [NPTV02], is used to obtain the result. We will introduce a family of
Bellman functions depending on a nonnegative parameter s. The convexity proper-
ties of these functions, together with an observation on their behavior as functions
of s, will give the desired estimates.
4.1 The Carleson embedding theorem
The main result of this chapter is the following theorem:
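Theorem 4.1.1. Let $\mathbf{W}$ be a d × d matrix-valued measure and let $A_Q$, $Q\in\mathcal{D}$, be positive
semidefinite d × d matrices. The following statements are equivalent:
(i) \[ \sum_{Q\in\mathcal{D}} \Big\| A_Q^{1/2} \int_Q \mathbf{W}(dx) f(x) \Big\|^2 \le A \|f\|^2_{L^2(\mathbf{W})} \] for all $f \in L^2(\mathbf{W})$;
(ii) \[ \sum_{Q\in\mathcal{D},\, Q\subset Q_0} \mathbf{W}(Q) A_Q \mathbf{W}(Q) \le B\, \mathbf{W}(Q_0) \] for all $Q_0 \in \mathcal{D}$.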
Moreover, for the best constants A and B, we have B ≤ A ≤ CB, where C = C(d)
is a constant depending only on the dimension d.
Note that the underlying measure σ is absent from the statement of the theorem:
we do not need σ in the setup, we only need the filtration Fn. Alternatively, we can
pick σ to make the setup more convenient. For example, if we define
\[ \sigma := \operatorname{tr} \mathbf{W} := \sum_{k=1}^{d} w_{k,k}, \]
then the measures $w_{j,k}$ are absolutely continuous with respect to σ. Thus, we can
always assume that our matrix-valued measure W is an absolutely continuous mea-
sure Wdσ, where W is a matrix weight, i.e. a locally integrable (meaning integrable
on all atoms Q) matrix-valued function with values in the set of positive semidefinite
matrices.
As before, for a measurable function f, we will denote by $\langle f\rangle_Q$ its average on Q,
\[ \langle f\rangle_Q := \sigma(Q)^{-1} \int_Q f\,d\sigma, \]
and if σ(Q) = 0, we will say that $\langle f\rangle_Q = 0$. The same definition is used for both
vector and matrix-valued functions.
The theorem below is the restatement of Theorem 4.1.1 in this setup, obtained
by setting AQ
= |Q|−1AQ. More precisely, Theorem 4.1.1 is just the equivalence
(ii)⇐⇒ (iii) in Theorem 4.1.2. The equivalence (i)⇐⇒ (ii) will be explained below.
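Theorem 4.1.2. Let W be a d × d matrix-valued weight and let $A_Q$, $Q\in\mathcal{D}$, be a
sequence of positive semidefinite d × d matrices. Then the following are equivalent:
(i) \[ \sum_{Q\in\mathcal{D}} \big\| A_Q^{1/2} \langle W^{1/2} f\rangle_Q \big\|^2 |Q| \le A \|f\|^2_{L^2}. \]
(ii) \[ \sum_{Q\in\mathcal{D}} \big\| A_Q^{1/2} \langle W f\rangle_Q \big\|^2 |Q| \le A \|f\|^2_{L^2(W)}. \]
(iii) \[ \frac{1}{|Q_0|} \sum_{Q\in\mathcal{D},\, Q\subset Q_0} \langle W\rangle_Q A_Q \langle W\rangle_Q |Q| \le B \langle W\rangle_{Q_0} \quad \text{for all } Q_0\in\mathcal{D}. \]
Moreover, B ≤ A ≤ CB, where $C = C(d) = e\cdot d^3(d+1)^2$.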
4.2 Trivial reductions
The equivalence of (i) and (ii) is trivial. In (i), perform the change of variables
f := W 1/2f to obtain (ii) and similarly, in (ii) set f := W−1/2f to obtain (i). Note
that here we do not need to assume that the weight W is invertible a.e.: we just
interpret W−1/2 as the Moore–Penrose inverse of W 1/2.
The implication (i) =⇒ (iii) and the estimate A ≥ B become obvious if one sets
$f = W^{1/2}\mathbf{1}_Q e$, $e \in \mathbb{F}^d$, in (i) (recall that F stands for either R or C). Equivalently,
to show that (ii) =⇒ (iii) it suffices to apply (ii) to the test functions $f = \mathbf{1}_Q e$.
Therefore, it remains to prove that (iii) =⇒ (i), or equivalently, that (iii) =⇒ (ii).
4.3 Invertibility of W
Let us notice that without loss of generality we can assume that the weight W is
invertible a.e., and even more, that the weight W−1 is uniformly bounded. To show
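this, define, for α > 0, the weight $W_\alpha$ by $W_\alpha(s) := W(s) + \alpha I$, and let
\[ A^\alpha_Q := \langle W_\alpha\rangle_Q^{-1} \langle W\rangle_Q A_Q \langle W\rangle_Q \langle W_\alpha\rangle_Q^{-1}. \]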
If (iii) is satisfied, then trivially
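\[ \frac{1}{|Q_0|} \sum_{Q\in\mathcal{D},\, Q\subset Q_0} \langle W_\alpha\rangle_Q A^\alpha_Q \langle W_\alpha\rangle_Q |Q| \le B \langle W\rangle_{Q_0} \le B \langle W_\alpha\rangle_{Q_0}. \]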
If Theorem 4.1.2 holds for invertible weights W, we get that for all $f \in L^2(W)\cap L^2$
\[ \sum_{Q\in\mathcal{D}} \big\| (A^\alpha_Q)^{1/2} \langle W_\alpha f\rangle_Q \big\|^2 |Q| \le A \|f\|^2_{L^2(W_\alpha)}. \]
Noticing that
\[ \|f\|_{L^2(W_\alpha)} \to \|f\|_{L^2(W)}, \qquad \langle W_\alpha f\rangle_Q \to \langle W f\rangle_Q, \qquad A^\alpha_Q \to A_Q \]
as α → 0+ we immediately get (ii) for all f ∈ L2(W ) ∩ L2. Note that in this case
taking the limit inside the sum is justified, because an infinite sum of non-negative
numbers is the supremum of all finite subsums, and finite sums commute with limits.
Since the estimate (ii) holds on a dense set, extending the embedding operator by
continuity, we trivially obtain that (ii) holds for all f ∈ L2(W ).
4.4 The Bellman functions
By homogeneity, we can assume without loss of generality that B = 1. As discussed
above, we only need to prove the implication (iii) =⇒ (i).
Following a suggestion by F. Nazarov, we will do so by a “Bellman function with
a parameter” argument similar to one presented in [NPTV02]. Denote
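\begin{align}
F_Q &= \|f\|^2_{L^2(Q)} := \langle |f|^2\rangle_Q, \tag{4.1} \\
M_Q &= \frac{1}{|Q|} \sum_{R\in\mathcal{D},\, R\subset Q} \langle W\rangle_R A_R \langle W\rangle_R |R|, \tag{4.2} \\
x_Q &= \langle W^{1/2} f\rangle_Q. \tag{4.3}
\end{align}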
For any real number s, 0 ≤ s < ∞, define the family of Bellman functions
\[ \mathcal{B}_s(Q) = \mathcal{B}_s(F_Q, x_Q, M_Q) = \Big\langle \big( \langle W\rangle_Q + sM_Q \big)^{-1} x_Q,\, x_Q \Big\rangle_{\mathbb{F}^d}. \tag{4.4} \]
Notice that $F_Q$ is not explicitly involved in the definition of $\mathcal{B}_s(Q)$. However, we
retain it as a variable because it will be used in the estimates.
The functions Bs(Q) satisfy the following properties:
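(i) The range property: $0 \le \mathcal{B}_s(Q) \le F_Q$;
(ii) The key inequality:
\[ \mathcal{B}_s(Q) + sR_Q(s) \le \sum_{Q'\in\operatorname{Ch}(Q)} \frac{|Q'|}{|Q|}\, \mathcal{B}_s(Q'), \tag{4.5} \]
where
\[ R_Q(s) = \big\| A_Q^{1/2} \langle W\rangle_Q \big( \langle W\rangle_Q + sM_Q \big)^{-1} x_Q \big\|^2. \]
The inequality $\mathcal{B}_s(Q) \ge 0$ is trivial, and the inequality $\mathcal{B}_s(Q) \le F_Q$ follows immediately
through an application of the Cauchy-Schwarz inequality. The details of this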
computation are presented in the proof of Lemma 4.7.1 below. The key inequality
(4.5) is a consequence of Lemma 4.7.3, which we also prove below, in the final section
of this chapter.
4.5 From Bellman functions to the estimate
Let us assume for now that the properties of Bs(Q) hold true. Then we can use them
to prove our main theorem. Rewrite (4.5) as
\[ |Q|\,\mathcal{B}_s(Q) + |Q|\,sR_Q(s) \le \sum_{Q'\in\operatorname{Ch}(Q)} |Q'|\,\mathcal{B}_s(Q'). \]
Then, applying this estimate to each Bs(Q′), and then to each descendant of each
Q′, we get, going m generations down,
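\[ |Q|\,\mathcal{B}_s(Q) + \sum_{\substack{Q'\in\mathcal{D}:\, Q'\subset Q \\ \operatorname{rk} Q' < \operatorname{rk} Q + m}} sR_{Q'}(s)\,|Q'|
\le \sum_{\substack{Q'\in\mathcal{D}:\, Q'\subset Q \\ \operatorname{rk} Q' = \operatorname{rk} Q + m}} |Q'|\,\mathcal{B}_s(Q') \le \|f\mathbf{1}_Q\|^2_{L^2}. \]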
In the last inequality we used the fact that
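\[ \mathcal{B}_s(Q) \le F_Q = \langle \|f(\,\cdot\,)\|^2 \rangle_Q = |Q|^{-1} \|f\mathbf{1}_Q\|^2_{L^2}. \]
Letting m → ∞ and ignoring the non-negative term $|Q|\,\mathcal{B}_s(Q)$ on the left hand side,
we get that
\[ s \sum_{Q'\in\mathcal{D}:\, Q'\subset Q} R_{Q'}(s)\,|Q'| \le \|f\mathbf{1}_Q\|^2_{L^2}. \]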
Furthermore, summing the above inequality over all Q ∈ Dn, we obtain
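\[ s \sum_{Q'\in\mathcal{D}:\, \operatorname{rk} Q' \ge n} R_{Q'}(s)\,|Q'| \le \|f\|^2_{L^2}. \]
Then, letting n → −∞ and replacing Q′ by Q, we arrive at the estimate
\[ s \sum_{Q\in\mathcal{D}} R_Q(s)\,|Q| \le \|f\|^2_{L^2}. \tag{4.6} \]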
Note that
\[ R_Q(0) = \| A_Q^{1/2} x_Q \|^2 = \| A_Q^{1/2} \langle W^{1/2} f\rangle_Q \|^2, \]
so to prove (i) we need to estimate $\sum_Q R_Q(0)|Q|$. However, at this point in the proof,
we only have the estimate of $s\sum_Q R_Q(s)|Q|$.
In the scalar case, the proof would be complete: since $M_Q \le \langle W\rangle_Q$, we have
$R_Q(0) \le 4R_Q(1)$, which gives us (i) with constant 4B. Due to non-commutativity,
such an estimate fails in the matrix case, so an extra step is needed.
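For the reader's convenience, the scalar computation behind the bound $R_Q(0) \le 4R_Q(1)$ can be written out as follows. This is a sketch for d = 1 under the normalization B = 1, assuming $\langle W\rangle_Q > 0$; in that case all the quantities involved are non-negative numbers and commute.

```latex
% Scalar case (d = 1): A_Q, <W>_Q, M_Q are non-negative numbers, x_Q is a scalar.
R_Q(s) = A_Q \,\langle W\rangle_Q^2 \,\frac{|x_Q|^2}{(\langle W\rangle_Q + sM_Q)^2},
\qquad
R_Q(0) = A_Q\, |x_Q|^2 .
% Since 0 \le M_Q \le \langle W\rangle_Q, we have
% \langle W\rangle_Q + M_Q \le 2\langle W\rangle_Q, and therefore
R_Q(1) \;\ge\; A_Q \,\langle W\rangle_Q^2 \,\frac{|x_Q|^2}{4\,\langle W\rangle_Q^2}
\;=\; \tfrac{1}{4}\, R_Q(0).
```

Taking s = 1 in the estimate for $s\sum_Q R_Q(s)$ then yields (i) with constant 4.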
4.6 Final step: estimating $\sum_Q R_Q(0)$
The final piece of the proof of Theorem 4.1.2 is the following lemma:
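Lemma 4.6.1. For ε > 0,
\[ R_Q(0) \le C(\varepsilon, d)\, \frac{1}{\varepsilon} \int_0^{\varepsilon} sR_Q(s)\,ds. \]
Moreover, for ε = 1/(2d) we can have $C(d) = e\cdot d^3(d+1)^2$.
Remark 4.6.2. Applying Lemma 4.6.1 to (4.6), we get
\[ \sum_{Q\in\mathcal{D}} R_Q(0)\,|Q| \le e\cdot d^3(d+1)^2\, \|f\|^2_{L^2}, \]
which proves Theorem 4.1.2.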
We conclude this section by providing a proof of the lemma above.
Proof. Observe that it follows from the cofactor inversion formula that the entries of
the matrix $(\langle W\rangle_Q + sM_Q)^{-1}$ are of the form $p_{j,k}(s)/Q(s)$, where
\[ Q(s) = Q_Q(s) = \det\big( \langle W\rangle_Q + sM_Q \big) \]
is a polynomial of degree at most d, and $p_{j,k}(s)$ are polynomials of degree at most
d − 1.
Therefore $R_Q$ is a rational function in s,
\[ R_Q(s) = \frac{\mathcal{P}_Q(s)}{|Q_Q(s)|^2}, \]
where $\mathcal{P}_Q(s)$ is a polynomial of degree at most 2(d − 1) and $\mathcal{P}_Q(s) \ge 0$. We can then
write $\mathcal{P}_Q(s) = |P_Q(s)|^2$, where $P_Q$ has degree at most d − 1. Therefore,
\[ R_Q(s) = \left| \frac{P_Q(s)}{Q_Q(s)} \right|^2. \]
By hypothesis, $M_Q \le \langle W\rangle_Q$, so the operator $\langle W\rangle_Q + sM_Q$ is invertible for all s
such that Re(s) > −1. Thus the zeroes of $Q_Q(s)$ are all in the half plane Re(s) ≤ −1.
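Let $\lambda_1, \lambda_2, \ldots, \lambda_d$ be the roots of the polynomial $Q_Q(s)$ counting multiplicity. We
have
\[ \left| \frac{Q_Q(s)}{Q_Q(0)} \right| = \prod_{k=1}^{d} \left| \frac{s - \lambda_k}{\lambda_k} \right|. \]
For a fixed s ≥ 0 and $\operatorname{Re}\lambda_k \le -1$ the term $|s - \lambda_k|/|\lambda_k|$ attains its maximum at $\lambda_k = -1$.
Therefore, on the interval [0, ε],
\[ \left| \frac{Q_Q(s)}{Q_Q(0)} \right| \le (1 + \varepsilon)^d. \tag{4.7} \]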
From the estimate above,
\[ \int_0^{\varepsilon} s \left| \frac{P_Q(s)}{Q_Q(0)} \right|^2 ds \le (1 + \varepsilon)^{2d} \int_0^{\varepsilon} sR_Q(s)\,ds. \tag{4.8} \]
It will suffice then to find a constant C1 = C1(ε, d) such that for any polynomial
p of degree at most d− 1
\[ |p(0)|^2 \le C_1\, \frac{1}{\varepsilon} \int_0^{\varepsilon} s\,|p(s)|^2\,ds. \tag{4.9} \]
Note that if we are not interested in determining the exact constant C(d), the argument
is complete: we can just consider the space of polynomials of degree at most d − 1
endowed with the norm
\[ \|p\| := \Big( \varepsilon^{-1} \int_0^{\varepsilon} s\,|p(s)|^2\,ds \Big)^{1/2} \]
and the linear functional p ↦ p(0). Since any linear functional on a finite-dimensional
normed space is bounded, we will immediately get (4.9).
If we want to estimate the constant C(d), some additional work is needed. First,
making the change of variables x = 2s/ε we can see that (4.9) is equivalent (with
the same constant $C_1$) to
\[ |p(0)|^2 \le C_1\, \frac{\varepsilon}{4} \int_0^{2} x\,|p(x)|^2\,dx \]
or, equivalently (replacing x by 1 − x), to the estimate
\[ |p(1)|^2 \le C_1\, \frac{\varepsilon}{4} \int_{-1}^{1} (1 - x)\,|p(x)|^2\,dx \tag{4.10} \]
for all polynomials p, deg p ≤ d − 1.
Consider the Jacobi polynomials $P^{(1,0)}_n$, which are orthogonal polynomials with
respect to the weight w,
\[ w(x) = (1 - x) = (1 - x)^1 (1 + x)^0. \]
Denote by $J^{(1,0)}_n$ the normalized Jacobi polynomials,
\[ J^{(1,0)}_n := \big\| P^{(1,0)}_n \big\|^{-1}_{L^2(w)}\, P^{(1,0)}_n. \]
Since $P^{(1,0)}_n(1) = n + 1$ and $\big\| P^{(1,0)}_n \big\|^2_{L^2(w)} = \frac{2}{n+1}$, we have that
\[ J^{(1,0)}_n(1)^2 = \frac{(n+1)^3}{2}. \tag{4.11} \]
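Writing $P = \sum_{n=0}^{d-1} c_n J^{(1,0)}_n$ we obtain
\[ \int_{-1}^{1} (1 - x)\,|P(x)|^2\,dx = \|P\|^2_{L^2(w)} = \sum_{n=0}^{d-1} |c_n|^2 \]
and by (4.11)
\[ P(1) = \sum_{n=0}^{d-1} c_n\, \frac{(n+1)^{3/2}}{\sqrt{2}}. \]
From Cauchy–Schwarz,
\[ |P(1)|^2 \le \Big( \sum_{n=0}^{d-1} |c_n|^2 \Big) \Big( \sum_{n=0}^{d-1} \frac{(n+1)^3}{2} \Big) = \frac{1}{8}\, d^2(d+1)^2\, \|P\|^2_{L^2(w)}. \]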
Comparing this inequality with (4.10), we can see that (4.10) and consequently (4.9)
hold with
\[ C_1 = C_1(\varepsilon, d) = \varepsilon^{-1} d^2(d+1)^2/2. \]
From (4.9) and (4.8),
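\[ R_Q(0) \le C(\varepsilon, d)\, \frac{1}{\varepsilon} \int_0^{\varepsilon} sR_Q(s)\,ds, \]
with $C(\varepsilon, d) = \varepsilon^{-1} d^2(d+1)^2 (1+\varepsilon)^{2d}/2$.
By letting ε = 1/(2d), we have indeed that
\[ C(d) = d^3(d+1)^2 \Big( 1 + \frac{1}{2d} \Big)^{2d} \le e\cdot d^3(d+1)^2. \]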
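The arithmetic behind the constant can be verified with exact rational numbers. The sketch below checks the identity $\sum_{n=0}^{d-1} (n+1)^3/2 = d^2(d+1)^2/8$ and the bound $C(1/(2d), d) \le e\cdot d^3(d+1)^2$ for small d. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
import math


def jacobi_value_sum(d):
    """Sum over n = 0..d-1 of J_n^{(1,0)}(1)^2 = (n+1)^3 / 2, as in (4.11)."""
    return sum(Fraction((n + 1) ** 3, 2) for n in range(d))


def lemma_constant(d):
    """C(eps, d) = d^2 (d+1)^2 (1+eps)^(2d) / (2 eps), evaluated at eps = 1/(2d)."""
    eps = Fraction(1, 2 * d)
    return Fraction(d ** 2 * (d + 1) ** 2, 2) / eps * (1 + eps) ** (2 * d)


if __name__ == "__main__":
    for d in range(1, 6):
        bound = math.e * d ** 3 * (d + 1) ** 2
        print(d, jacobi_value_sum(d), float(lemma_constant(d)), bound)
```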
4.7 Verifying the properties of Bs
It remains to show that the functions in the family Bs satisfy the Bellman function
properties. The range property (i) is proved in the following lemma:
Lemma 4.7.1. For Bs defined above in (4.4), Bs(Q) ≤ FQ.
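Proof. Let $e \in \mathbb{F}^d$ be fixed. Since W is self-adjoint, an application of the Cauchy-
Schwarz inequality gives
\[ \Big| \fint_Q \langle W^{1/2} f, e\rangle \Big| \le \Big( \fint_Q \langle f, f\rangle \Big)^{1/2} \Big( \fint_Q \langle W^{1/2} e, W^{1/2} e\rangle \Big)^{1/2}. \]
Therefore, recalling the notation (4.1), (4.3), we get that for any vector e,
\[ \frac{|\langle x_Q, e\rangle|^2}{\langle \langle W\rangle_Q e, e\rangle} \le F_Q. \tag{4.12} \]
Using Lemma 4.7.2 below we can write
\[ \big\langle (\langle W\rangle_Q + sM_Q)^{-1} x_Q, x_Q \big\rangle
= \sup_{e\ne 0} \frac{|\langle x_Q, e\rangle|^2}{\langle (\langle W\rangle_Q + sM_Q) e, e\rangle}
\le \sup_{e\ne 0} \frac{|\langle x_Q, e\rangle|^2}{\langle \langle W\rangle_Q e, e\rangle} \le F_Q, \]
which means exactly that $\mathcal{B}_s(Q) \le F_Q$.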
Lemma 4.7.2. Let A ≥ 0 be an invertible operator in a Hilbert space H. Then for
any vector x ∈ H
\[ \langle A^{-1} x, x\rangle = \sup_{e\in H:\, e\ne 0} \frac{|\langle x, e\rangle|^2}{\langle Ae, e\rangle}. \]
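Proof. By definition,
\[ \langle A^{-1} x, x\rangle = \|A^{-1/2} x\|^2
= \sup_{a\in H:\, \|a\|\ne 0} \frac{|\langle A^{-1/2} x, a\rangle|^2}{\|a\|^2}
= \sup_{a\in H:\, \|a\|\ne 0} \frac{|\langle x, A^{-1/2} a\rangle|^2}{\|a\|^2}. \]
Performing the change of variables $a = A^{1/2} e$, we conclude
\[ \langle A^{-1} x, x\rangle = \sup_{e\in H:\, \|e\|\ne 0} \frac{|\langle x, e\rangle|^2}{\langle Ae, e\rangle}. \]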
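A small numerical check of Lemma 4.7.2 for real symmetric 2 × 2 matrices is sketched below. It compares $\langle A^{-1}x, x\rangle$ with the quotient $|\langle x, e\rangle|^2/\langle Ae, e\rangle$, both at the vector $e = A^{-1}x$ (where the supremum is attained) and at random vectors. The helper names and the restriction to the real 2 × 2 case are assumptions made for this example.

```python
import random


def apply(A, v):
    """Matrix-vector product for a 2 x 2 matrix given as nested tuples."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def inverse(A):
    """Inverse of an invertible 2 x 2 matrix via the cofactor formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((A[1][1] / det, -A[0][1] / det), (-A[1][0] / det, A[0][0] / det))


def quotient(A, x, e):
    """The quantity |<x, e>|^2 / <Ae, e> from the lemma (real case)."""
    return dot(x, e) ** 2 / dot(apply(A, e), e)


def quadratic_form_of_inverse(A, x):
    """The left hand side <A^{-1} x, x>."""
    return dot(apply(inverse(A), x), x)


if __name__ == "__main__":
    random.seed(2)
    A = ((2.0, 1.0), (1.0, 3.0))  # symmetric positive definite
    x = (1.0, 2.0)
    target = quadratic_form_of_inverse(A, x)
    best = max(quotient(A, x, (random.gauss(0, 1), random.gauss(0, 1)))
               for _ in range(10000))
    print(target, best, quotient(A, x, apply(inverse(A), x)))
```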
Having verified the range property, we now turn to the main estimate (4.5). This
inequality is the consequence of the following lemma:
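Lemma 4.7.3. Let H be a Hilbert space. For x ∈ H and for U being a bounded
invertible positive operator in H define
\[ \varphi(U, x) := \langle U^{-1} x, x\rangle_H. \]
Then the function φ is convex, and, moreover, if
\[ x_0 = \sum_k \theta_k x_k, \qquad \Delta U := U_0 - \sum_k \theta_k U_k, \]
where $0 \le \theta_k \le 1$, $\sum_k \theta_k = 1$, then
\[ \sum_k \theta_k \varphi(U_k, x_k) - \varphi(U_0, x_0) \ge \langle U_0^{-1}\, \Delta U\, U_0^{-1} x_0, x_0\rangle_H. \tag{4.13} \]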
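In the one-dimensional case H = R, inequality (4.13) reads $\sum_k \theta_k x_k^2/U_k - x_0^2/U_0 \ge \Delta U\, x_0^2/U_0^2$, which is easy to test on random data. The following sketch does this; the function names and the sampling ranges are assumptions made for illustration only.

```python
import random


def phi(U, x):
    """Scalar version of phi(U, x) = <U^{-1} x, x>, with U > 0."""
    return x * x / U


def convexity_gap(thetas, Us, xs, U0):
    """Left side minus right side of (4.13) in the scalar case.

    x0 is the theta-average of the xs, and Delta U = U0 - sum theta_k U_k.
    The lemma asserts that the returned value is non-negative.
    """
    x0 = sum(t * x for t, x in zip(thetas, xs))
    delta_U = U0 - sum(t * U for t, U in zip(thetas, Us))
    lhs = sum(t * phi(U, x) for t, U, x in zip(thetas, Us, xs)) - phi(U0, x0)
    rhs = delta_U * x0 * x0 / (U0 * U0)
    return lhs - rhs


def random_instance(rng, k=3):
    """Random convex weights, positive U's and real x's."""
    raw = [rng.random() + 1e-3 for _ in range(k)]
    total = sum(raw)
    thetas = [w / total for w in raw]
    Us = [0.5 + 2.5 * rng.random() for _ in range(k)]
    xs = [rng.uniform(-2, 2) for _ in range(k)]
    U0 = 0.5 + 2.5 * rng.random()
    return thetas, Us, xs, U0


if __name__ == "__main__":
    rng = random.Random(4)
    print(min(convexity_gap(*random_instance(rng)) for _ in range(10000)))
```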
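Remark 4.7.4. To see that this lemma implies (4.5), fix s > 0. Denoting
\[ U^s_Q = \langle W\rangle_Q + sM_Q, \qquad x_Q = \langle W^{1/2} f\rangle_Q, \]
we observe that
\[ \mathcal{B}_s(Q) = \varphi(U^s_Q, x_Q). \]
Let $Q_k$, k ≥ 1, be the children of Q, and let $\theta_k = |Q_k|/|Q|$. Notice that
\[ \langle W\rangle_Q = \sum_k \theta_k \langle W\rangle_{Q_k}, \qquad M_Q = \sum_k \theta_k M_{Q_k} + \langle W\rangle_Q A_Q \langle W\rangle_Q, \]
so
\[ U^s_Q - \sum_k \theta_k U^s_{Q_k} =: \Delta U^s = s \langle W\rangle_Q A_Q \langle W\rangle_Q. \]
Therefore, applying Lemma 4.7.3 with
\[ U_0 = U^s_Q, \quad x_0 = x_Q, \quad U_k = U^s_{Q_k}, \quad x_k = x_{Q_k}, \quad \Delta U = \Delta U^s, \]
we get (4.13), which translates exactly to the estimate (4.5).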
We end this chapter with the proof of the lemma stated above.
Proof of Lemma 4.7.3. The function φ and the right hand side of (4.13) are invariant
under the change of variables
\[ x \mapsto U_0^{-1/2} x, \qquad U \mapsto U_0^{-1/2}\, U\, U_0^{-1/2}, \tag{4.14} \]
so it is sufficient to prove (4.13) only for $U_0 = I$.
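In this case, define the function Φ(τ), 0 ≤ τ ≤ 1, as
\[ \Phi(\tau) = \sum_k \theta_k \big\langle (I + \tau\Delta U_k)^{-1} (x_0 + \tau\Delta x_k),\, (x_0 + \tau\Delta x_k) \big\rangle_H - \langle x_0, x_0\rangle_H, \]
where $\Delta x_k = x_k - x_0$ and $\Delta U_k = U_k - U_0 = U_k - I$.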
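Using the power series expansion of $(I + \tau\Delta U)^{-1}$ we get
\begin{align*}
\Phi(\tau) = {} & \tau \Big( 2\sum_k \theta_k \langle \Delta x_k, x_0\rangle_H - \sum_k \theta_k \langle \Delta U_k x_0, x_0\rangle_H \Big) \\
& + \tau^2 \Big( \sum_k \theta_k \langle \Delta U_k^2 x_0, x_0\rangle_H + \sum_k \theta_k \langle \Delta x_k, \Delta x_k\rangle_H - 2\sum_k \theta_k \langle \Delta U_k x_0, \Delta x_k\rangle_H \Big) + o(\tau^2).
\end{align*}
Notice that
\[ \sum_k \theta_k \Delta x_k = \sum_k \theta_k (x_k - x_0) = 0 \]
and also
\[ \sum_k \theta_k \Delta U_k = -\Delta U. \]
Hence
\[ \Phi(\tau) = \tau \langle \Delta U x_0, x_0\rangle_H + \tau^2 \sum_k \theta_k \big( \|\Delta U_k x_0\|^2 + \|\Delta x_k\|^2 - 2\langle \Delta U_k x_0, \Delta x_k\rangle_H \big) + o(\tau^2). \tag{4.15} \]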
Using the above formula for
\[ x_0 = \frac{x_1 + x_2}{2}, \qquad U_0 = \frac{U_1 + U_2}{2}, \]
we get that the second differential of φ at U = I is non-negative. Note that the
function φ is clearly analytic, so all the differentials are well defined.
The change of variables (4.14) implies that the second differential of φ is non-
negative everywhere. In particular, we obtain that Φ′′(τ) ≥ 0, for all τ , so Φ is
convex.
Returning to the general choice of arguments U , x, we can see from (4.15) that
Φ′(0) = 〈∆Ux0, x0〉H .
Since Φ is convex and Φ(0) = 0,
\[ \Phi(1) \ge \Phi'(0) = \langle \Delta U x_0, x_0\rangle_H, \]
which concludes the proof.
CHAPTER 5
Matrix weighted two weight estimates for
well-localized operators
The content of this chapter overlaps with [BCTW16] by the author and K. Bickel,
S. Treil, and B. Wick.
In what follows we will give necessary and sufficient conditions for the two weight
L2 estimates of the so-called well-localized operators with matrix-valued weights. The
main examples of such operators are the Haar shifts, and their different generaliza-
tions, considered in the weighted spaces. More specifically, in this chapter we will
study two weight estimates of the form
\[ \|T(\mathbf{W}f)\|_{L^2(\mathbf{V})} \le C \|f\|_{L^2(\mathbf{W})} \tag{5.1} \]
with matrix-valued measures $\mathbf{V}$ and $\mathbf{W}$. Here $T(\mathbf{W}f)$ is defined for the integral
operator T given formally by
\[ T(\mathbf{W}f)(x) = \int_X d\mathbf{W}(y)\, K(x, y) f(y) = \int_X K(x, y)\, W(y) f(y)\,dw(y), \tag{5.2} \]
where the kernel $K(x, y) = k(x, y) \otimes I_d$, for a scalar-valued kernel k(x, y). Recall that,
as before, W is the density of $\mathbf{W}$ with respect to the scalar measure w.
The main result presented here is that the Sawyer type testing conditions are
necessary and sufficient for the boundedness of the operators T . We will show that
it is sufficient to verify the estimates of the operator and its formal adjoint only on
characteristic functions of atoms.
The proof will follow the lines of [NTV08]. The main part of the operator is
estimated by bounding the corresponding paraproduct, and the bound on the para-
product follows from the Carleson embedding theorem presented in the previous
chapter. It is important to mention that the matrix weighted version of this the-
orem, with matrix-valued weights both in the domain and in the target space, is
essential for our argument. As such, this result is only possible in the context of the
theorem proved in Chapter 4.
The context of matrix-valued weights requires several technical steps that would
not be present in the scalar setting. However, even restricting to the scalar case,
the arguments remain interesting and new. To begin with, unlike previous works,
this chapter studies a very general filtration, and not necessarily the standard dyadic
setup. Furthermore, and perhaps more importantly, by carefully estimating the
“easy” parts and slightly changing the definition of well-localized operators, we are
able to get (even in the case of scalar measures) better estimates and stronger results
than ones in [NTV08].
One of our main results, Theorem 5.9.2 is specifically adapted to estimating the
so-called Haar shifts. Earlier, similar results were obtained (only in the scalar case)
with significant extra work from the results of [NTV08], or just by modifying the
proofs from [NTV08].
We will use the symbol $T_{\mathbf{W}}$ for the operator $f \mapsto T(\mathbf{W}f)$ and, similarly, for the
scalar measure w, we will use $T_w$ to denote the operator $f \mapsto T(fw)$, where
\[ T_w f(x) := \int_X K(x, y) f(y)\,dw(y). \tag{5.3} \]
The above operator, Tw, is defined for the scalar-valued functions as well as for the
functions with values in Fd; we will use the same notation for both cases, although,
formally, in the latter case we should write Tw ⊗ Id.
If dW = Wdw and dV = V dv, we can rewrite estimate (5.1) as
\[ \|V^{1/2}\, T_w W^{1/2} f\|_{L^2(v)} \le C \|f\|_{L^2(w)}. \tag{5.4} \]
A particularly interesting case is that when the measures V and W are absolutely
continuous with respect to the underlying measure σ, so in (5.4) we have v = w = σ.
5.1 Expectations and martingale differences
Before stating the results, we will introduce some more notation and terminology.
Throughout this chapter, a function f will be called locally integrable if it is integrable
on every atom Q ∈ D.
For an atom Q and a locally integrable function f, we will denote by $\langle f\rangle_Q$ its
average (with respect to the underlying measure σ),
\[ \langle f\rangle_Q := \sigma(Q)^{-1} \int_Q f\,d\sigma, \]
adopting the convention that $\langle f\rangle_Q = 0$ if σ(Q) = 0.
Define the averaging operator, or expectation, $E_Q$ by
\[ E_Q f := \langle f\rangle_Q \mathbf{1}_Q, \tag{5.5} \]
and the martingale difference operator $\Delta_Q$ by
\[ \Delta_Q := \sum_{P\in\operatorname{Ch}(Q)} E_P - E_Q. \tag{5.6} \]
Note that $E_Q$ and $\Delta_Q$ are orthogonal projections in L2(σ), and that the subspaces
generated by the $\Delta_Q$ are orthogonal to each other.
We will think of $E_Q$ and $\Delta_Q$ as operators in Lebesgue spaces (that is, $E_Q f$
and $\Delta_Q f$ are defined only a.e.), so if for atoms $Q_1 \subset Q_2$ we have $\sigma(Q_2 \setminus Q_1) = 0$,
then $\Delta_{Q_2} = 0$.
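The projection properties of $E_Q$ and $\Delta_Q$ can be illustrated on a finite model. In the sketch below, the space X is a list of points carrying a positive measure σ, atoms are index ranges `(lo, hi)`, and each atom has exactly two children (its two halves). This dyadic model and all function names are assumptions made for the example; the text allows far more general filtrations.

```python
def expectation(f, sigma, block):
    """E_Q f = <f>_Q 1_Q for the atom Q = range(lo, hi)."""
    lo, hi = block
    mass = sum(sigma[lo:hi])
    # Convention from the text: the average is 0 when sigma(Q) = 0.
    avg = sum(f[i] * sigma[i] for i in range(lo, hi)) / mass if mass > 0 else 0.0
    return [avg if lo <= i < hi else 0.0 for i in range(len(f))]


def martingale_difference(f, sigma, block):
    """Delta_Q f = sum of E_P f over the two children P of Q, minus E_Q f."""
    lo, hi = block
    mid = (lo + hi) // 2
    left = expectation(f, sigma, (lo, mid))
    right = expectation(f, sigma, (mid, hi))
    parent = expectation(f, sigma, block)
    return [l + r - q for l, r, q in zip(left, right, parent)]


def inner(f, g, sigma):
    """Inner product in L^2(sigma) for real-valued functions."""
    return sum(a * b * s for a, b, s in zip(f, g, sigma))


if __name__ == "__main__":
    sigma = [1.0, 2.0, 0.5, 1.5, 1.0, 1.0, 3.0, 0.25]
    f = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, 6.0]
    d_top = martingale_difference(f, sigma, (0, 8))
    d_left = martingale_difference(f, sigma, (0, 4))
    print(expectation(d_top, sigma, (0, 8)))   # E_Q Delta_Q f, numerically zero
    print(inner(d_top, d_left, sigma))         # orthogonality, numerically zero
```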
5.2 Generalized band operators
To the collection D, one can associate a tree structure where each Q is connected to
the elements of the collection ChQ (the children of Q). Given this tree, let dtree(Q,R)
denote the “tree distance” between atoms Q and R, namely, the number of edges of
the shortest path connecting Q and R. If Q and R share no common ancestor, then
take dtree(Q,R) =∞.
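As an illustration of the tree distance, the sketch below computes $d_{\mathrm{tree}}$ for a concrete model: dyadic intervals $[i2^{-l}, (i+1)2^{-l})$ of the real line with level l ≥ 0, encoded as pairs `(level, index)`. In this model the unit intervals have no parents, so atoms lying in different unit intervals share no common ancestor. The encoding and the function name are assumptions for this example, not notation from the text.

```python
import math


def tree_distance(Q, R):
    """Number of edges on the shortest tree path between the atoms Q and R.

    Atoms are pairs (level, index) with level >= 0; the parent of
    (level, index) is (level - 1, index >> 1). Returns math.inf when the
    atoms have no common ancestor (different level-0 ancestors).
    """
    (level_q, index_q), (level_r, index_r) = Q, R
    if index_q >> level_q != index_r >> level_r:
        return math.inf
    distance = 0
    # Lift the deeper atom until both sit on the same level.
    while level_q > level_r:
        index_q >>= 1
        level_q -= 1
        distance += 1
    while level_r > level_q:
        index_r >>= 1
        level_r -= 1
        distance += 1
    # Lift both atoms together until they meet at the common ancestor.
    while index_q != index_r:
        index_q >>= 1
        index_r >>= 1
        distance += 2
    return distance


if __name__ == "__main__":
    print(tree_distance((2, 0), (2, 1)))  # siblings
    print(tree_distance((1, 0), (3, 1)))  # ancestor and descendant
    print(tree_distance((1, 0), (1, 2)))  # different unit intervals
```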
The operators of interest possess a band structure related to this tree distance,
as defined below. These operators are called generalized band operators because they
generalize the band operators studied in [NTV08] and [BW14].
Definition 5.2.1. A bounded operator T : L2(σ) → L2(σ) is said to be a generalized
band operator with radius r if T can be written as
\[ T = \sum_{j,k=1}^{2} \sum_{Q,R\in\mathcal{D}} P^j_R\, T^{j,k}_{R,Q}\, P^k_Q, \qquad T^{j,k}_{R,Q} : P^k_Q L^2(\sigma) \to P^j_R L^2(\sigma), \tag{5.7} \]
where $T^{j,k}_{R,Q} = 0$ if $d_{\mathrm{tree}}(R, Q) > r$ and for any Q,
\[ P^1_Q = \Delta_Q, \qquad P^2_Q = E_Q. \]
Remark 5.2.2. We usually assume that the sum in (5.7) has only finitely many
nonzero terms, but in a more general situation convergence will be considered in
the weak operator topology with respect to some ordering of the pairs R,Q.
Remark 5.2.3. Each block $T^{j,k}_{R,Q} := P^j_R\, T^{j,k}_{R,Q}\, P^k_Q$ is an operator bounded in L2(σ), and
can be represented as an integral operator with kernel $K^{j,k}_{R,Q}$. The kernel $K^{j,k}_{R,Q}$ can
be computed in the following way: for y ∈ Q, let $Q_y \in \operatorname{Ch} Q$ be the unique child of
Q containing y. Defining
\[ K^{j,k}_{R,Q}(x, y) := \begin{cases} \sigma(Q_y)^{-1} \big( T^{j,k}_{R,Q} \mathbf{1}_{Q_y} \big)(x), & y \in Q, \\ 0, & y \notin Q, \end{cases} \tag{5.8} \]
one can easily see that
\[ \big( T^{j,k}_{R,Q} f \big)(x) = \int_X K^{j,k}_{R,Q}(x, y) f(y)\,d\sigma(y) \]
for all functions f ∈ L2(σ) such that $\mathbf{1}_Q f$ is supported on a finite union of cubes
S ∈ Ch Q. Note also that the kernel $K^{j,k}_{R,Q}$ is supported on R × Q and is constant on
sets R′ × Q′, for R′ ∈ Ch R and Q′ ∈ Ch Q.
Because the operators $P^k_Q$ are orthogonal projections in L2(σ), T being a generalized
band operator of radius r implies that its adjoint T∗ is also a generalized band
operator of radius r.
5.3 Examples
In this section, we give several examples of generalized band operators of various
radii.
Example 5.3.1. For a numerical sequence a = {aQ}Q∈D , define the “dyadic” oper-
ator T : L2(σ)→ L2(σ) by
\[ Tf = T_a f := \sum_{Q\in\mathcal{D}} a_Q E_Q f. \]
Trivially, T is a generalized band operator of radius 0 as long as $T_a$ is bounded on
L2(σ). To see this, set j = k = 2 in the definition of generalized band operators
(so expectations will appear on both sides), and let $T_{Q,Q} = a_Q$ for each Q ∈ D, and
$T_{R,Q} = 0$ for each R ∈ D with R ≠ Q.
Remark. For a sequence $|a| := \{|a_Q|\}_{Q\in\mathcal{D}}$ one can easily see that the pointwise
estimate
\[ |T_a f| \le T_{|a|} |f| \]
holds for all x ∈ X. Therefore, in the scalar case, the two weight estimates for $T_a$
follow from the two weight estimates for $T_{|a|}$. Consequently, in the scalar case, the
operators with all $a_Q \ge 0$ (the so-called positive dyadic operators) play a special role
in weighted estimates.
In the case of matrix-valued measures, it is not clear that the weighted estimates
of T|a| imply any corresponding estimates for Ta. In fact, we suspect that this is not
true, so we will not reserve any special place in our setup for the positive dyadic
operators.
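The pointwise estimate $|T_a f| \le T_{|a|}|f|$ for scalar functions is easy to observe on a finite model. The sketch below uses the same kind of toy setup as a dyadic filtration on finitely many points: atoms are index ranges `(lo, hi)` of dyadic length, σ is a positive measure on the points, and the coefficients $a_Q$ are stored in a dictionary. The model and every name in the code are assumptions made for illustration.

```python
import random


def dyadic_blocks(n):
    """All dyadic index ranges (lo, hi) of {0, ..., n-1}, n a power of two."""
    blocks = []
    size = n
    while size >= 1:
        blocks.extend((lo, lo + size) for lo in range(0, n, size))
        size //= 2
    return blocks


def apply_dyadic(a, f, sigma):
    """Evaluate T_a f = sum over atoms Q of a_Q <f>_Q 1_Q pointwise."""
    out = [0.0] * len(f)
    for (lo, hi), coeff in a.items():
        mass = sum(sigma[lo:hi])
        if mass == 0:
            continue  # convention: the average over a null atom is 0
        avg = sum(f[i] * sigma[i] for i in range(lo, hi)) / mass
        for i in range(lo, hi):
            out[i] += coeff * avg
    return out


def pointwise_domination_holds(a, f, sigma, tol=1e-12):
    """Check |T_a f|(x) <= T_{|a|} |f|(x) at every point x."""
    lhs = [abs(v) for v in apply_dyadic(a, f, sigma)]
    abs_a = {Q: abs(c) for Q, c in a.items()}
    rhs = apply_dyadic(abs_a, [abs(v) for v in f], sigma)
    return all(l <= r + tol for l, r in zip(lhs, rhs))


if __name__ == "__main__":
    rng = random.Random(7)
    n = 16
    sigma = [rng.random() + 0.1 for _ in range(n)]
    f = [rng.uniform(-1, 1) for _ in range(n)]
    a = {Q: rng.uniform(-1, 1) for Q in dyadic_blocks(n)}
    print(pointwise_domination_holds(a, f, sigma))
```

The check relies on the triangle inequality applied term by term, which is exactly what is unavailable for matrix-valued measures.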
Example 5.3.2. For $r \in \mathbb{Z}_+$ and a locally integrable function b, define a paraproduct
$\Pi = \Pi^r_b$ of order r on L2(σ) as
\[ \Pi f = \sum_{Q\in\mathcal{D}} E_Q f \sum_{R\in\operatorname{Ch}^r Q} \Delta_R b. \tag{5.9} \]
Clearly, if it is bounded on L2(σ), the above paraproduct is a generalized band operator
of radius r.
Remark. Since Πf is defined by the sum of an orthogonal series, the convergence of
the sum defining Π in the weak operator topology implies its unconditional conver-
gence in the strong operator topology.
Example 5.3.3. A Haar shift of complexity (m,n) is an operator T : L2(σ) → L2(σ)
defined by
\[ Tf = \sum_{Q\in\mathcal{D}} \sum_{R\in\operatorname{Ch}^n(Q),\, S\in\operatorname{Ch}^m(Q)} \Delta_R\, T_{R,S}\, \Delta_S f, \qquad T_{R,S} : \Delta_S L^2(\sigma) \to \Delta_R L^2(\sigma), \tag{5.10} \]
where for each block $T_{R,S} = \Delta_R T_{R,S} \Delta_S$, the canonical kernel $K_{R,S}$ of $T_{R,S}$ (defined
by (5.8) and supported on R × S) satisfies the estimate
\[ \|K_{R,S}\|_\infty \le \sigma(Q)^{-1}. \tag{5.11} \]
If T is a Haar shift of complexity (m,n), then trivially its adjoint T∗ is a Haar
shift of complexity (n,m). Any Haar shift of complexity (m,n) (and in fact, any
bounded operator given by (5.10)) is a generalized band operator of radius r = m + n.
Remark. An operator defined by (5.10) is bounded if and only if all blocks $T_Q$,
\[ T_Q := \sum_{R\in\operatorname{Ch}^n(Q),\, S\in\operatorname{Ch}^m(Q)} \Delta_R\, T_{R,S}\, \Delta_S, \]
are uniformly bounded: in this case the series in (5.10) converges unconditionally
(independently of the ordering) in the strong operator topology.
Notice that the normalization condition (5.11) implies that $\|T_Q\| \le 1$. Indeed,
(5.11) implies that the block $T_Q$ can be represented as an integral operator with
kernel $K_Q$ (supported on Q × Q) satisfying $\|K_Q\|_\infty \le |Q|^{-1}$, so $\|K_Q\|_{L^2(Q\times Q)} \le 1$.
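The step from the sup-norm bound on the kernel to the operator norm bound can be spelled out as follows. This is a sketch that assumes |Q| denotes σ(Q) and uses the standard fact that the operator norm of an integral operator is dominated by the $L^2$ norm of its kernel (the Hilbert–Schmidt norm).

```latex
% Kernel supported on Q x Q with sup-norm at most |Q|^{-1}, where |Q| = \sigma(Q):
\|K_Q\|^2_{L^2(Q\times Q)}
  = \iint_{Q\times Q} |K_Q(x,y)|^2 \, d\sigma(x)\, d\sigma(y)
  \le \|K_Q\|_\infty^2 \, \sigma(Q)^2
  \le |Q|^{-2}\, |Q|^{2} = 1,
\qquad
\|T_Q\|_{L^2(\sigma)\to L^2(\sigma)} \le \|K_Q\|_{L^2(Q\times Q)} \le 1 .
```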
The concept of Haar shift can be generalized.
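Definition 5.3.4. A generalized Haar shift of complexity (m,n) is an operator
T : L2(σ) → L2(σ),
\[ Tf = \sum_{j,k=1}^{2} \sum_{Q\in\mathcal{D}} \sum_{R\in\operatorname{Ch}^n(Q),\, S\in\operatorname{Ch}^m(Q)} P^j_R\, T^{j,k}_{R,S}\, P^k_S f, \qquad T^{j,k}_{R,S} : P^k_S L^2(\sigma) \to P^j_R L^2(\sigma), \tag{5.12} \]
where the kernel $K^{j,k}_{R,S}$ of the operator $T^{j,k}_{R,S} := P^j_R\, T^{j,k}_{R,S}\, P^k_S$ satisfies
\[ \|K^{j,k}_{R,S}\|_\infty \le |Q|^{-1}. \tag{5.13} \]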
It is convenient to introduce an alternative representation of a (generalized) Haar
shift by grouping terms $T^{j,k}_{R,S}$. Denoting
\[ T_Q = \sum_{j,k=1}^{2} \sum_{R\in\operatorname{Ch}^n(Q),\, S\in\operatorname{Ch}^m(Q)} P^j_R\, T^{j,k}_{R,S}\, P^k_S \]
(or taking the inner sum in (5.10) for regular Haar shifts), we can represent a generalized
Haar shift as $\sum_{Q\in\mathcal{D}} T_Q$. Note that the kernel $K_Q$ of the integral operator
$T_Q$ is supported on Q × Q and constant on R × S, where $R, S \in \operatorname{Ch}^{r+1} Q$ and
r = max{m,n}. Since the sets R × S with $R \in \operatorname{Ch}^n Q$ and $S \in \operatorname{Ch}^m Q$ are disjoint,
the kernel $K_Q$ satisfies the estimate
\[ \|K_Q\|_\infty \le |Q|^{-1} \tag{5.14} \]
for Haar shifts, and the estimate $\|K_Q\|_\infty \le 4|Q|^{-1}$ for the generalized Haar shifts
(the constant 4 appears here because for each pair $R \in \operatorname{Ch}^n Q$, $S \in \operatorname{Ch}^m Q$ there are
four possible operators $T^{j,k}_{R,S}$). This discussion motivates the following general object
of study:
Definition 5.3.5. A generalized big Haar shift of complexity r is a bounded operator
T : L2(σ) → L2(σ) defined by
\[ T := \sum_{Q\in\mathcal{D}} T_Q, \tag{5.15} \]
where each block $T_Q$ is an integral operator with kernel $K_Q$, such that $K_Q$ is supported
on Q × Q, constant on R × S with $R, S \in \operatorname{Ch}^{r+1} Q$, and satisfies the estimate
(5.14). If, in addition, each block $T_Q$ and its adjoint $T^*_Q$ annihilate constants $\mathbf{1}_Q$, we
will call the operator a big Haar shift of complexity r, removing the word generalized.
Finally, if an operator T admits the above representation but does not satisfy the
estimates (5.14), we will say that the operator T has the structure of a (generalized)
big Haar shift of order r.
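Remark 5.3.6. It is easy to see that a (generalized) band operator of radius r has the
structure of a (generalized) big Haar shift of order r. Moreover, if the kernels $K^{j,k}_{R,Q}$
of the blocks $T^{j,k}_{R,Q}$ admit the estimate $\|K^{j,k}_{R,Q}\|_\infty \le 1/4$ (or the estimate $\|K_{R,Q}\|_\infty \le 1$
for kernels of the blocks $T_{R,Q}$ for the case of a band operator), then the operator T
is a (generalized) big Haar shift. To see that we can just define
\[ T_Q := \sum_{R\in\operatorname{Ch}^r Q} \sum_{\substack{S\in\mathcal{D}(Q) \\ \operatorname{rk} S \le \operatorname{rk} R}} \sum_{j,k=1}^{2} T^{j,k}_{R,S}
 + \sum_{S\in\operatorname{Ch}^r Q} \sum_{\substack{R\in\mathcal{D}(Q) \\ \operatorname{rk} R < \operatorname{rk} S}} \sum_{j,k=1}^{2} T^{j,k}_{R,S}. \tag{5.16} \]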
5.4 Weighted martingale differences
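For the matrix measure $\mathbf{W}$ (or $\mathbf{V}$) discussed in a previous chapter, one can define
the $\mathbf{W}$-weighted expectation $E^{\mathbf{W}}_Q$ and the martingale difference $\Delta^{\mathbf{W}}_Q$ as
\[ E^{\mathbf{W}}_Q f := \langle f\rangle^{\mathbf{W}}_Q \mathbf{1}_Q, \qquad \langle f\rangle^{\mathbf{W}}_Q := \mathbf{W}(Q)^{-1} \Big( \int_Q d\mathbf{W} f \Big) \tag{5.17} \]
and
\[ \Delta^{\mathbf{W}}_Q = \sum_{R\in\operatorname{Ch} Q} E^{\mathbf{W}}_R - E^{\mathbf{W}}_Q, \tag{5.18} \]
respectively, for all atoms Q ∈ D.
We will not assume here that the matrix $\mathbf{W}(Q)$ is invertible. Instead, given that
$\int_Q d\mathbf{W} f \in \operatorname{Ran} \mathbf{W}(Q)$, the operators will be well defined if we interpret $\mathbf{W}(Q)^{-1}$ as
the Moore–Penrose inverse. As before, if $\mathbf{W}(Q) = 0$, we set $E^{\mathbf{W}}_Q f = 0$.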
It is easy to see that $E^{\mathbf{W}}_Q$ is the orthogonal projection in $L^2(\mathbf{W})$ onto the subspace
of constants $\{\mathbf{1}_Q e : e \in \mathbb{F}^d\}$. It can also be easily shown that
\[ E^{\mathbf{W}}_Q \Delta^{\mathbf{W}}_Q = \Delta^{\mathbf{W}}_Q E^{\mathbf{W}}_Q = 0, \]
that $\Delta^{\mathbf{W}}_Q$ is an orthogonal projection, and that the subspaces generated by $\Delta^{\mathbf{W}}_Q$ and
$\Delta^{\mathbf{W}}_R$ are orthogonal whenever Q ≠ R.
For any Q ∈ D, let $H_Q$ denote the space of (non-weighted) Haar functions, $H_Q := \Delta_Q(L^2(\sigma))$.
$H_Q$ is the subspace of L2(σ) spanned by functions $h_Q$ supported on Q,
constant on each element of Ch Q, and orthogonal to constant vectors. Similarly,
for the weighted case, let $H^{\mathbf{W}}_Q := \Delta^{\mathbf{W}}_Q(L^2(\mathbf{W}))$ be the analogous subspace of $L^2(\mathbf{W})$
spanned by functions $h^{\mathbf{W}}_Q$.
Remark 5.4.1. The vector-valued Haar functions $h_Q$ and $h^{\mathbf{W}}_Q$ should not be interpreted
as scalar-valued Haar functions times constant vectors in $\mathbb{F}^d$. Throughout
this chapter, if any reference to the scalar Haar functions is needed, it will be clearly
indicated and reflected in the notation.
5.5 Density of simple functions
Recall that the operator TW , with TWf := T (Wf) was given by the integral rep-
resentation (5.2), provided that the integral was well defined. However, to verify
boundedness, we only need to know the bilinear form of the operator on a dense set.
We will proceed to show that (finite) linear combinations of functions 1Qe, Q ∈ D,
e ∈ Fd are dense in L2(W) in the following lemma:
Lemma 5.5.1. Let F be the smallest σ-algebra containing an increasing sequence
of atomic σ-algebras Fn, with sets of atoms Dn. Let L denote the space of linear
combinations of functions 1Qe with Q ∈ D = ∪nDn and e ∈ Fd. If W is a d × d
matrix valued measure defined on F , then L is dense in L2(W).
Proof. We claim that if f ∈ L2(W) satisfies 〈f, e〉L2(W)
= 0 for all e ∈ Fd and Q ∈ D,
then f = 0. To see this, fix e ∈ Fd and suppose that for any Q ∈ D,
ˆX
⟨dWf,1
Qe⟩Fd
= 0.
Define a scalar measure w by w := tr W = ∑dj=1 wj,j. As mentioned earlier, W
is absolutely continuous with respect to w and so there is a measurable, positive
semidefinite function W (x) such that dW = W (x)dw.
Then, by assumption, we have
ˆQ
〈W (x)f(x), e〉Fd dw = 0, ∀ Q ∈ D,
which implies the function 〈W (x)f(x), e〉Fd = 0,w-a.e. To see this easily, assume
that Fd = Rd (similar arguments work for Cd). As 〈W (x)f(x), e〉Fd is measurable,
the pre images Q1 := 〈W (x)f(x), e〉−1Fd ((0,∞)) and Q2 := 〈W (x)f(x), e〉−1
Fd ((−∞, 0])
54
are both in F . Thus, we can write Q1, Q2 as countable unions of disjoint atoms:
Q1 = ∪jQji and Q2 = ∪kQk
2. Then, we can compute
ˆQ1
〈W (x)f(x), e〉Fddw =∑j
ˆQj1
〈W (x)f(x), e〉Fddw = 0.
As 〈W (x)f(x), e〉Fd ≥ 0 on Q1, this implies that 〈W (x)f(x), e〉Fd1Q1 = 0, w-a.e.
Similar results hold on Q2 and so 〈W (x)f(x), e〉Fd = 0 w-a.e. As this works for each
e ∈ Fd, we obtain W (x)f(x) = 0, w-a.e. If W (x) is invertible a.e., then W 1/2 is
invertible and we can conclude that
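\[
W^{1/2}(x) f(x) = W^{-1/2}(x) W(x) f(x) = 0
\]
w-a.e. This immediately implies that
\[
\|f\|^2_{L^2(\mathbf{W})} = \int_X \langle d\mathbf{W} f, f \rangle
= \int_X \big\langle W^{1/2} f, W^{1/2} f \big\rangle \, dw = 0,
\]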
so f is the zero element in the space L2(W).
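By Lemma 5.5.1, to verify the boundedness of TW, it is enough to be able to
compute
\[
\begin{aligned}
\langle T_{\mathbf{W}} \mathbf{1}_Q e, \mathbf{1}_R v \rangle_{L^2(\mathbf{V})}
&= \iint \big\langle d\mathbf{W}(y) K(x,y) \mathbf{1}_Q(y) e, \; d\mathbf{V}(x) \mathbf{1}_R(x) v \big\rangle_{\mathbb{F}^d} \\
&= \iint \big\langle W(y) K(x,y) \mathbf{1}_Q(y) e, \; V(x) \mathbf{1}_R(x) v \big\rangle_{\mathbb{F}^d} \, dw(y) \, dv(x)
\end{aligned}
\]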
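for all Q,R ∈ D and all e, v ∈ Fd. Thus, we say that an operator TW acts formally
from L2(W) to L2(V) if the bilinear form
\[
\langle T_{\mathbf{W}} \mathbf{1}_Q e, \mathbf{1}_R v \rangle_{L^2(\mathbf{V})} \tag{5.19}
\]
is well defined for all Q,R ∈ D and all e, v ∈ Fd. Then the formal adjoint $T^*_{\mathbf{V}}$ is given
by
\[
\langle T_{\mathbf{W}} \mathbf{1}_Q e, \mathbf{1}_R v \rangle_{L^2(\mathbf{V})}
= \langle \mathbf{1}_Q e, T^*_{\mathbf{V}} \mathbf{1}_R v \rangle_{L^2(\mathbf{W})}.
\]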
We also assume a very weak continuity property, namely that
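\[
\begin{aligned}
\langle T_{\mathbf{W}} \mathbf{1}_Q e, \mathbf{1}_R v \rangle_{L^2(\mathbf{V})}
&= \sum_{S \in \operatorname{Ch} Q} \langle T_{\mathbf{W}} \mathbf{1}_S e, \mathbf{1}_R v \rangle_{L^2(\mathbf{V})} \\
&= \sum_{S \in \operatorname{Ch} R} \langle T_{\mathbf{W}} \mathbf{1}_Q e, \mathbf{1}_S v \rangle_{L^2(\mathbf{V})};
\end{aligned}
\tag{5.20}
\]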
this property is non-trivial only if Q or R have infinitely many children, which is
a possibility in our atomic filtration (and not necessarily dyadic) setup.
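Consider the set L of all finite linear combinations of functions $\mathbf{1}_Q e_Q$, Q ∈ D,
$e_Q \in \mathbb{F}^d$. If the bilinear form (5.19) is defined, then
\[
\langle T_{\mathbf{W}} f, g \rangle_{L^2(\mathbf{V})} = \langle f, T^*_{\mathbf{V}} g \rangle_{L^2(\mathbf{W})}, \qquad f, g \in \mathcal{L} \tag{5.21}
\]
is well defined for all f, g ∈ L. Since $\Delta^{\mathbf{W}}_Q f \in \mathcal{L}$ for f ∈ L, the expression
\[
\langle T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q f, \Delta^{\mathbf{V}}_R g \rangle_{L^2(\mathbf{V})}
\]
is also well defined for all f, g ∈ L. Thus the expression
$\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q$ is well defined, in the sense that its bilinear form is well defined for all
f, g ∈ L.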
5.6 Well-Localized operators
To state and prove the main results, it is convenient to introduce the formalism
of well-localized operators between weighted spaces, rather than work directly with
generalized band operators.
Definition 5.6.1. An operator TW acting formally from L2(W) to L2(V) is said to
be localized if for all e, v ∈ Fd,
\[
\langle T_{\mathbf{W}} \mathbf{1}_Q e, \mathbf{1}_R v \rangle_{L^2(\mathbf{V})} = 0
\]
whenever Q,R ∈ D share no common ancestors.
Definition 5.6.2. An operator TW acting formally from L2(W) to L2(V) is called
r-lower triangular if for all R,Q ∈ D and e ∈ Fd,
\[
\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \mathbf{1}_Q e = 0
\]
if either of the following conditions holds:
(i) R ⊄ Q and rk R ≥ r + rk Q;
(ii) R ⊄ Q^{(r+1)} and rk R ≥ rk Q − 1.
Here Q^{(r+1)} is the ancestor of Q of order r + 1, i.e. Q^{(r+1)} is the unique atom in D
which contains Q and has the property that rk Q^{(r+1)} = rk Q − (r + 1).
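The two vanishing conditions can be made concrete on the standard dyadic intervals of the line. The following is a minimal sketch under an assumption made purely for illustration (it is not the general atomic filtration of this chapter): an atom of rank k is encoded as a pair (k, j) standing for the interval [j·2^{-k}, (j+1)·2^{-k}). The helper names `contains`, `ancestor` and `must_vanish` are hypothetical.

```python
def contains(outer, inner):
    """Return True if the dyadic interval `inner` is a subset of `outer`."""
    k_out, j_out = outer
    k_in, j_in = inner
    if k_in < k_out:  # a strictly larger interval is never a subset
        return False
    return (j_in >> (k_in - k_out)) == j_out


def ancestor(cube, order):
    """Ancestor of the given order; its rank is rk(cube) - order."""
    k, j = cube
    return (k - order, j >> order)


def must_vanish(R, Q, r):
    """True if condition (i) or (ii) of Definition 5.6.2 holds for the pair (R, Q)."""
    rk_R, rk_Q = R[0], Q[0]
    cond_i = (not contains(Q, R)) and rk_R >= r + rk_Q
    cond_ii = (not contains(ancestor(Q, r + 1), R)) and rk_R >= rk_Q - 1
    return cond_i or cond_ii
```

For example, with Q = (3, 5) and r = 1, the interval R = (4, 9) is disjoint from Q and deep enough for condition (i), while R = (4, 11) lies inside Q, so neither condition applies to it.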
Definition 5.6.3. An operator TW acting formally from L2(W) to L2(V) is said to
be well-localized with radius r if it is localized and if both TW and its formal adjoint
$T^*_{\mathbf{V}}$ are r-lower triangular.
Remark 5.6.4. This definition is very similar to the definition of well-localized oper-
ators from [NTV08], with two exceptions. First, the definition in [NTV08] contained
a typographical error: in the language of this chapter, the definition in [NTV08] only
required that rkR ≥ rkQ in condition (ii) of Definition 5.6.2 above. This was not
correct, as pointed out in [BW14] , and the inequality rkR ≥ rkQ is not sufficient
to get the results in [NTV08]. For further details about the necessity of condition
(ii), see the discussion in [BW14].
The other difference between the two statements, which is more significant, is
that in [NTV08], the operator TW was not required to be localized in the sense of
the above definition. By imposing this requirement on our operators, we are able
to obtain better estimates than those in [NTV08]. In particular, unlike the result
in [NTV08], which specifically studied the dyadic case, assumed that each cube
had at most $2^N$ children, and produced estimates depending on this bound, we do
not assume any bounds on the number of children of a cube Q ∈ D. Since all the
examples we are interested in (presented in the previous section) give rise to localized
operators, we do not lose generality by including this requirement into the definition
of well-localized operators. Thus, even for the case of scalar measures, our bounds
are stronger than the ones presented in [NTV08].
5.7 From band operators to well-localized operators
We will now show that if T has the structure of a generalized big Haar shift of complexity
r, as described in Definition 5.3.5, then the operator TW, given by TWf = T (Wf),
is a well-localized operator of radius r.
We assume that in the representation (5.15) there are only finitely many terms
$T_Q$, and that each block $T_Q$ is represented by an integral operator with a bounded
kernel. Note that the latter assumption is always true if Q and R have finitely many
children; for the generalized big Haar shifts it is just postulated (together with the
uniform estimate for the norm of the kernels).
The above two assumptions imply that the bilinear form (5.21) is well defined
for f, g ∈ L, so TW is well defined as an operator acting formally from L2(W) to
L2(V). In fact, it can even be shown that the bilinear form (5.21) is well defined
for all f ∈ L2(W), g ∈ L2(V), so one can conclude that TW is a bounded operator
L2(W)→ L2(V).
Lemma 5.7.1. Let T have a structure of a generalized big Haar shift of complexity
r, satisfying the assumptions above. Then for matrix-valued measures W and V
the operator TW, TWf = T (Wf), acting formally from L2(W) to L2(V) is a well-
localized operator of radius r.
Proof. The fact that the operator TW is localized, in the sense of Definition 5.6.1, is
obvious. We will show that TW is also r-lower triangular. Then, by symmetry, the
same result will hold for T ∗V.
To prove that TW is r-lower triangular it is sufficient to show that for any e ∈ Fd,
the function T (W1Qe) outside of Q(r+1) is constant on cubes R, rkR ≥ rkQ − 1,
and that outside of Q it is constant on cubes R, rkR ≥ rkQ+ r.
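Let us analyze how the non-zero blocks $T_S$ act on $\mathbf{W}\mathbf{1}_Q e$. First, observe that
$T_S(\mathbf{W}\mathbf{1}_Q e)$ is non-zero outside of Q only if $Q \subsetneq S$. Since rk S ≤ rk Q − 1 for $Q \subsetneq S$,
the condition rk R ≥ rk Q + r implies that
\[
\operatorname{rk} R \ge \operatorname{rk} Q + r \ge \operatorname{rk} S + 1 + r. \tag{5.22}
\]
We know that the kernel of $T_S$ is constant on sets S′ × S″ with $S', S'' \in \operatorname{Ch}^{r+1} S$.
Therefore $T_S(\mathbf{W}\mathbf{1}_Q e)$ is constant on cubes R such that
\[
\operatorname{rk} R \ge \operatorname{rk} S + r + 1.
\]
So if R ∩ Q = ∅ and rk R ≥ rk Q + r, we can conclude from (5.22) that $T_S(\mathbf{W}\mathbf{1}_Q e)$
is constant on R. Similarly, if $T_S(\mathbf{W}\mathbf{1}_Q e)$ does not vanish outside of $Q^{(r+1)}$, then
$Q^{(r+1)} \subsetneq S$, so
\[
\operatorname{rk} S \le \operatorname{rk} Q^{(r+1)} - 1 = \operatorname{rk} Q - (r+1) - 1 = \operatorname{rk} Q - r - 2,
\]
or equivalently
\[
\operatorname{rk} S + r + 1 \le \operatorname{rk} Q - 1.
\]
The condition rk R ≥ rk Q − 1 then implies that
\[
\operatorname{rk} R \ge \operatorname{rk} Q - 1 \ge \operatorname{rk} S + r + 1. \tag{5.23}
\]
But, as discussed above, $T_S(\mathbf{W}\mathbf{1}_Q e)$ is constant on cubes R such that rk R ≥ rk S + r + 1,
so (5.23) implies that outside of $Q^{(r+1)}$ the function $T_S(\mathbf{W}\mathbf{1}_Q e)$ is constant on
cubes R, rk R ≥ rk Q − 1.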
Remark 5.7.2. In Lemma 5.7.1 we assumed that the operator T is a sum of finitely
many blocks TQ, and that each block is an integral operator with bounded kernel.
However, if we assume that the matrix-valued measures V and W are absolutely
continuous with respect to the underlying measure σ, i.e. dV = V dσ, dW = W dσ,
it is enough to require that the sum over Q ∈ D converge (in the weak operator
topology and with respect to some ordering) to a bounded operator in L2(σ).
The reasoning is fairly straightforward if $V, W \in L^2_{\mathrm{loc}}(\sigma)$, and can be extended to
the case $V, W \in L^1_{\mathrm{loc}}(\sigma)$ (where the index “loc” means “finite on atoms”) via
a limiting argument similar to [NTV08]. The paper [NTV08] studied scalar weights.
A corresponding line of reasoning for matrix-valued weights V , W is presented in
[BW14].
5.8 Estimates of well-localized operators
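For a cube Q ∈ D let $\mathcal{D}^{\mathbf{W},k}_Q$ be the collection of functions of the form
\[
\sum_{\substack{R \in \mathcal{D}(Q) \\ \operatorname{rk} R = \operatorname{rk} Q + k}} \Delta^{\mathbf{W}}_R f
= \sum_{R \in \operatorname{Ch}^k Q} \Delta^{\mathbf{W}}_R f, \qquad f \in \mathcal{L}.
\]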
We have the following theorem:
Theorem 5.8.1. Let TW be a well-localized operator of radius r acting formally from
L2(W) to L2(V). Then TW extends to a bounded operator from L2(W) to L2(V) if
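and only if the following conditions
(i) $\|\mathbf{1}_Q T_{\mathbf{W}} \mathbf{1}_Q e\|_{L^2(\mathbf{V})} \le \mathfrak{T}_1 \|\mathbf{1}_Q e\|_{L^2(\mathbf{W})}$ for all e ∈ Fd;
(ii) $\|\mathbf{1}_Q T_{\mathbf{W}} f_Q\|_{L^2(\mathbf{V})} \le \mathfrak{T}_2 \|f_Q\|_{L^2(\mathbf{W})}$ for all $f_Q \in \mathcal{D}^{\mathbf{W},r}_Q$;
and their dual counterparts (corresponding conditions for $T^*_{\mathbf{V}}$ with V and W
interchanged) hold for all Q ∈ D.
Furthermore,
\[
\|T_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le \big(C(d)^{1/2} + 1/2\big)(\mathfrak{T}_1 + \mathfrak{T}_1^*) + (r+1)^{1/2} (\mathfrak{T}_2 + \mathfrak{T}_2^*).
\]
Here $\mathfrak{T}_1^*$, $\mathfrak{T}_2^*$ are the constants appearing in the duals to the testing conditions (i)
and (ii) respectively, and C(d) is the dimensional constant from the matrix Carleson
embedding theorem presented in the previous chapter (Theorem 4.1.2). Moreover, for
the best possible bounds $\mathfrak{T}_k$, $\mathfrak{T}_k^*$ we have
\[
\mathfrak{T}_k, \mathfrak{T}_k^* \le \|T_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})}, \qquad k = 1, 2. \tag{5.24}
\]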
Remark 5.8.2. In the case when each Q ∈ D has at most N children (N < ∞), the
condition (ii) follows from the testing condition
$\|T_{\mathbf{W}} \mathbf{1}_Q e\|_{L^2(\mathbf{V})} \le \mathfrak{T}_3 \|\mathbf{1}_Q e\|_{L^2(\mathbf{W})}$. In
this case, one can estimate $\mathfrak{T}_2 \le C(r,N)\,\mathfrak{T}_3$ and obtain similar inequalities for the
dual condition. This is exactly the approach that was used in [NTV08].
Condition (i) and its corresponding dual from Theorem 5.8.1 can be slightly relaxed.
Given a well-localized operator TW and an atom Q ∈ D, define the truncation
$T^Q_{\mathbf{W}}$ by
\[
T^Q_{\mathbf{W}} f = \sum_{R \in \mathcal{D}(Q)} \Delta^{\mathbf{V}}_R T_{\mathbf{W}} f, \tag{5.25}
\]
and define a similar truncation for the dual $T^*_{\mathbf{V}}$.
The theorem above can be rephrased in terms of truncations:
Theorem 5.8.3. Let TW be a well-localized operator of radius r acting formally from
L2(W) to L2(V). Then TW extends to a bounded operator from L2(W) to L2(V) if
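and only if the conditions
(i) $\|T^Q_{\mathbf{W}} \mathbf{1}_Q e\|_{L^2(\mathbf{V})} \le \mathfrak{T}_1 \|\mathbf{1}_Q e\|_{L^2(\mathbf{W})}$ for all e ∈ Fd;
(ii) $\|T^Q_{\mathbf{W}} f_Q\|_{L^2(\mathbf{V})} \le \mathfrak{T}_2 \|f_Q\|_{L^2(\mathbf{W})}$ for all $f_Q \in \mathcal{D}^{\mathbf{W},r}_Q$,
their dual counterparts (corresponding conditions for $T^*_{\mathbf{V}}$ with V and W interchanged)
and the following weak type estimate
(iii) $\big|\langle T_{\mathbf{W}} \mathbf{1}_Q e, \mathbf{1}_Q v \rangle_{L^2(\mathbf{V})}\big| \le \mathfrak{T}_3 \|\mathbf{1}_Q e\|_{L^2(\mathbf{W})} \|\mathbf{1}_Q v\|_{L^2(\mathbf{V})}$ for all e, v ∈ Fd,
hold for all Q ∈ D.
Moreover,
\[
\|T_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le C(d)^{1/2} (\mathfrak{T}_1 + \mathfrak{T}_1^*) + (r+1)^{1/2} (\mathfrak{T}_2 + \mathfrak{T}_2^*) + \mathfrak{T}_3,
\]
where again C(d) is the constant from the Matrix Carleson Embedding Theorem.
Moreover, for the best possible bounds $\mathfrak{T}_k$, $\mathfrak{T}_k^*$ we trivially have
\[
\mathfrak{T}_k, \mathfrak{T}_k^* \le \|T_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})}, \qquad k = 1, 2, 3. \tag{5.26}
\]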
There is no dual condition to (iii) in this theorem, because this condition is self-dual.
Note that Theorem 5.8.1 follows immediately from Theorem 5.8.3, because
trivially for all f ∈ L
\[
\|T^Q_{\mathbf{W}} f\|_{L^2(\mathbf{V})} \le \|\mathbf{1}_Q T_{\mathbf{W}} f\|_{L^2(\mathbf{V})}.
\]
Note also that condition (i) of Theorem 5.8.1 implies condition (iii) of Theorem 5.8.3,
with the trivial estimate for the corresponding bounds,
\[
\mathfrak{T}_3 \le \frac{\mathfrak{T}_1 + \mathfrak{T}_1^*}{2};
\]
here $\mathfrak{T}_3$ is the bound from Theorem 5.8.3, and $\mathfrak{T}_1$, $\mathfrak{T}_1^*$ are the bounds from condition
(i) and its dual in Theorem 5.8.1.
Remark 5.8.4. The condition (iii) of Theorem 5.8.3 can be further relaxed. First,
we do not need this condition to hold for all cubes Q ∈ D: it is sufficient if this
condition holds for arbitrarily large cubes Q, meaning that for any Q0 ∈ D one
can find Q ∈ D, Q0 ⊂ Q for which (iii) holds. Secondly, if for any Q0 ∈ D the
increasing sequence of cubes $Q_n$, where $Q_{n+1}$ is the parent of $Q_n$, satisfies
$\mathbf{W}(Q_n) \ge \alpha_n I$ with $\alpha_n \to +\infty$, and similarly for V, then the condition (iii) can be removed
from Theorem 5.8.3.
5.9 Applications to the estimates of Haar shifts
While conditions (i) from Theorems 5.8.1 and 5.8.3 are fairly standard testing condi-
tions, and the condition (iii) from Theorem 5.8.3 is the standard weak boundedness
condition, the condition (ii) seems unnecessarily complicated.
However, if the measures V, W satisfy the two weight matrix A2 condition
\[
\sup_{Q \in \mathcal{D}} |Q|^{-2} \big\|\mathbf{V}(Q)^{1/2} \mathbf{W}(Q)^{1/2}\big\|^2 =: [\mathbf{V},\mathbf{W}]_{A_2} < \infty, \tag{5.27}
\]
and the operator T is a generalized big Haar shift (see Definition 5.3.5), then the
condition (ii) follows from the testing condition (i) and the A2 condition (5.27).
Let us introduce some notation. For a generalized big Haar shift $T = \sum_{Q \in \mathcal{D}} T_Q$
and a cube Q ∈ D, define the localized operator $T^Q$,
\[
T^Q := \sum_{R \in \mathcal{D}(Q)} T_R. \tag{5.28}
\]
For a matrix measure W, define the weighted version $(T^Q)_{\mathbf{W}}$ of $T^Q$ by
\[
(T^Q)_{\mathbf{W}} f = T^Q(\mathbf{W} f).
\]
Note that $(T^Q)_{\mathbf{W}}$ is different from $T^Q_{\mathbf{W}}$ defined above in (5.25). The following lemma
holds:
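Lemma 5.9.1. Let T be a generalized big Haar shift of complexity r with finitely
many terms, and let the matrix measures V and W satisfy the two weight matrix A2
condition (5.27). Assume that for all Q ∈ D
\[
\|(T^Q)_{\mathbf{W}} \mathbf{1}_Q e\|_{L^2(\mathbf{V})} \le \mathfrak{T} \|\mathbf{1}_Q e\|_{L^2(\mathbf{W})} \qquad \forall e \in \mathbb{F}^d. \tag{5.29}
\]
Then for all Q ∈ D
\[
\|T^Q_{\mathbf{W}} \mathbf{1}_Q e\|_{L^2(\mathbf{V})} \le \big(d^{1/2} r\, [\mathbf{V},\mathbf{W}]^{1/2}_{A_2} + \mathfrak{T}\big) \|\mathbf{1}_Q e\|_{L^2(\mathbf{W})} \qquad \forall e \in \mathbb{F}^d, \tag{5.30}
\]
\[
\|T^Q_{\mathbf{W}} f\|_{L^2(\mathbf{V})} \le \big(d^{1/2} (2r+1)\, [\mathbf{V},\mathbf{W}]^{1/2}_{A_2} + \mathfrak{T}\big) \|f\|_{L^2(\mathbf{W})} \qquad \forall f \in \mathcal{D}^{\mathbf{W},r}_Q. \tag{5.31}
\]
Moreover, for all sufficiently large Q ∈ D
\[
\|T_{\mathbf{W}} \mathbf{1}_Q e\|_{L^2(\mathbf{V})} \le \mathfrak{T} \|\mathbf{1}_Q e\|_{L^2(\mathbf{W})}. \tag{5.32}
\]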
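Lemma 5.9.1 implies that for a generalized big Haar shift T of complexity r with
finitely many terms, the bounds in the testing conditions in Theorem 5.8.3 can be
estimated as
\[
\begin{aligned}
\mathfrak{T}_1 &\le d^{1/2} r\, [\mathbf{V},\mathbf{W}]^{1/2}_{A_2} + \mathfrak{T}, \\
\mathfrak{T}_2 &\le d^{1/2} (2r+1)\, [\mathbf{V},\mathbf{W}]^{1/2}_{A_2} + \mathfrak{T}, \\
\mathfrak{T}_3 &\le \mathfrak{T},
\end{aligned}
\]
with similar estimates for the dual bounds $\mathfrak{T}^*_{1,2}$. Note that we also have $\mathfrak{T}_3 \le \mathfrak{T}^*$, so
$\mathfrak{T}_3 \le (\mathfrak{T} + \mathfrak{T}^*)/2$. Using these estimates and applying Theorem 5.8.3, we get the following
result.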
Theorem 5.9.2. Let T be a generalized big Haar shift of complexity r (with finitely
many terms), and let the matrix measures V and W satisfy the A2 condition (5.27).
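Let
(i) $\|(T^Q)_{\mathbf{W}} \mathbf{1}_Q e\|_{L^2(\mathbf{V})} \le \mathfrak{T} \|\mathbf{1}_Q e\|_{L^2(\mathbf{W})}$ for all Q ∈ D and all vectors e ∈ Fd,
and also let the corresponding condition for T∗ (with V and W interchanged) hold
with constant $\mathfrak{T}^*$.
Then
\[
\|T_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le \big(C(d)^{1/2} + (r+1)^{1/2} + 1/2\big)(\mathfrak{T} + \mathfrak{T}^*)
+ 2 d^{1/2} \big(C(d)^{1/2} r + (2r+1)(r+1)^{1/2}\big) [\mathbf{V},\mathbf{W}]^{1/2}_{A_2};
\]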
here again C(d) is the constant from the Matrix Carleson Embedding Theorem.
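The constant in Theorem 5.9.2 is obtained by substituting the estimates for the testing bounds from Lemma 5.9.1 into the bound of Theorem 5.8.3. The following minimal sketch checks that bookkeeping numerically. It assumes the value C(d) = e·d^3(d+1)^2 quoted for the Carleson embedding constant in the remark following Theorem 5.12.1; the function names are hypothetical and the inputs are arbitrary sample values, not results from this chapter.

```python
import math


def carleson_constant(d):
    """C(d) = e * d^3 * (d + 1)^2, the value quoted for the embedding theorem."""
    return math.e * d**3 * (d + 1) ** 2


def bound_theorem_5_9_2(d, r, T, T_star, A2):
    """Right-hand side of the norm estimate in Theorem 5.9.2."""
    C = carleson_constant(d)
    testing_part = (math.sqrt(C) + math.sqrt(r + 1) + 0.5) * (T + T_star)
    a2_part = (2 * math.sqrt(d)
               * (math.sqrt(C) * r + (2 * r + 1) * math.sqrt(r + 1))
               * math.sqrt(A2))
    return testing_part + a2_part


def bound_via_theorem_5_8_3(d, r, T, T_star, A2):
    """Same bound, assembled from Theorem 5.8.3 and the estimates for T1, T2, T3."""
    C = carleson_constant(d)
    a = math.sqrt(d * A2)                      # d^{1/2} [V, W]^{1/2}
    T1, T1_star = a * r + T, a * r + T_star
    T2, T2_star = a * (2 * r + 1) + T, a * (2 * r + 1) + T_star
    T3 = (T + T_star) / 2
    return (math.sqrt(C) * (T1 + T1_star)
            + math.sqrt(r + 1) * (T2 + T2_star)
            + T3)
```

Both functions agree for any admissible input, which confirms that the displayed constant is exactly what the substitution produces.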
The testing condition (i) of Theorem 5.9.2 is necessary, and moreover satisfies
the following estimate:
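Proposition 5.9.3. The best possible constants $\mathfrak{T}$, $\mathfrak{T}^*$ in (i) of Theorem 5.9.2 satisfy
\[
\mathfrak{T}, \mathfrak{T}^* \le \|T_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} + C_1(d) \cdot r \cdot [\mathbf{V},\mathbf{W}]^{1/2}_{A_2}. \tag{5.33}
\]
We postpone the proof of this proposition and Lemma 5.9.1 until Section 5.16.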
5.10 The A2 theorem and linear dependence on complexity
Theorem 5.9.2 can be directly applied to estimating the dependence of the norm of Haar
shifts (and consequently of the Calderón–Zygmund operators) in the weighted space
L2(W) on the A2 characteristic $[W]_{A_2}$ of the weight W ($[W]_{A_2}$ is exactly the
A2 characteristic $[\mathbf{V},\mathbf{W}]_{A_2}$ from (5.27) with dW = W dσ, dV = W^{-1} dσ). In the
scalar case, by the A2 theorem [Hyt12b], the norm depends linearly on $[W]_{A_2}$, and
this estimate is optimal. In the matrix case, as specified in the introduction, the A2
conjecture remains open. The best known estimate so far is $[W]^{3/2}_{A_2}$. Theorem 5.9.2
above reduces the problem to finding the optimal estimate in the testing condition
(i) and its dual.
It should be mentioned here that the best known estimates for Haar shifts in
the weighted space L2(w) with a scalar weight w satisfying the Muckenhoupt A2
condition grow linearly in the complexity r of the shift. It appears that in the
scalar case our theorem gives us the growth rate $r^{3/2}$ in terms of the complexity r,
because the testing constants in condition (i) and its dual are usually estimated by
$C \cdot (r+1)[w]_{A_2}$. However, the standard splitting technique allows one to get a growth
that is linear in complexity. Namely, one can split the operator T as $T = \sum_{k=0}^{r} T_k$, where
\[
T_k = \sum_{j \in \mathbb{Z}} \; \sum_{\substack{Q \in \mathcal{D} \\ \operatorname{rk} Q = k + (r+1)j}} T_Q;
\]
then each $T_k$ is a generalized big Haar shift of complexity 0, with respect to the
rarefied filtration given by σ-algebras $\mathcal{F}_{k+(r+1)n}$, n ∈ Z.
In the scalar case estimates of the testing bounds $\mathfrak{T}$ and $\mathfrak{T}^*$ in terms of $[w]_{A_2}$ do
not depend on the filtration, so as a result we get a linear (in r) estimate of the norm
of the Haar shifts.
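The splitting above is simply a partition of the ranks into residue classes modulo r + 1. A minimal sketch of that bookkeeping follows; the function name `split_ranks` is hypothetical and the ranks are plain integers used for illustration.

```python
def split_ranks(ranks, r):
    """Group ranks by residue modulo r + 1; class k holds the ranks k + (r + 1) * j."""
    classes = {k: [] for k in range(r + 1)}
    for rk in sorted(ranks):
        # Python's % returns a value in [0, r] even for negative ranks.
        classes[rk % (r + 1)].append(rk)
    return classes
```

Within each class consecutive ranks differ by exactly r + 1, which is the spacing of the rarefied filtration with respect to which a block of complexity r has complexity 0.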
5.11 Weighted paraproducts
We are now ready to prove our main results from this chapter. The essential part of
the proof is the estimate of the weighted paraproducts, which we will present in this
section.
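Let $f = \mathbf{1}_S e$ be a characteristic function with S ∈ D and e ∈ Fd. Then, for each
fixed n ∈ Z, f has the orthogonal decomposition
\[
f = \sum_{\substack{Q \in \mathcal{D} \\ \operatorname{rk} Q \ge -n}} \Delta^{\mathbf{W}}_Q f
+ \sum_{\substack{Q \in \mathcal{D} \\ \operatorname{rk} Q = -n}} \mathbb{E}^{\mathbf{W}}_Q f. \tag{5.34}
\]
To prove this equality, just observe that if m ≥ rk S, then $f \mathbf{1}_Q = \mathbb{E}^{\mathbf{W}}_Q f$ for all cubes
Q ∈ D with rk Q ≥ m. Then for each x ∈ X,
\[
f(x) - \sum_{\substack{Q \in \mathcal{D} \\ m \ge \operatorname{rk} Q \ge -n}} \big(\Delta^{\mathbf{W}}_Q f\big)(x)
- \sum_{\substack{Q \in \mathcal{D} \\ \operatorname{rk} Q = -n}} \big(\mathbb{E}^{\mathbf{W}}_Q f\big)(x)
= f(x) - \sum_{\substack{Q \in \mathcal{D} \\ \operatorname{rk} Q = m+1}} \big(\mathbb{E}^{\mathbf{W}}_Q f\big)(x) = 0.
\]
Letting m → ∞ gives the desired result. By orthogonality, it follows that
\[
\|f\|^2_{L^2(\mathbf{W})} = \sum_{\substack{Q \in \mathcal{D} \\ \operatorname{rk} Q \ge -n}} \|\Delta^{\mathbf{W}}_Q f\|^2_{L^2(\mathbf{W})}
+ \sum_{\substack{Q \in \mathcal{D} \\ \operatorname{rk} Q = -n}} \|\mathbb{E}^{\mathbf{W}}_Q f\|^2_{L^2(\mathbf{W})}.
\]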
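The decomposition and the norm identity can be checked by hand in the simplest situation. The sketch below is an illustration under strong assumptions, not the setting of this chapter: the weight is scalar (d = 1), the space X consists of 2^m points, and the atoms are the dyadic blocks of indices. All function names are hypothetical.

```python
def expectation(f, w, lo, hi):
    """Weighted average of f over the block [lo, hi), times the indicator of the block."""
    mass = sum(w[lo:hi])
    avg = sum(f[i] * w[i] for i in range(lo, hi)) / mass
    return [avg if lo <= i < hi else 0.0 for i in range(len(f))]


def martingale_difference(f, w, lo, hi):
    """Sum of the expectations over the two children minus the expectation over the parent."""
    mid = (lo + hi) // 2
    left = expectation(f, w, lo, mid)
    right = expectation(f, w, mid, hi)
    parent = expectation(f, w, lo, hi)
    return [a + b - c for a, b, c in zip(left, right, parent)]


def dyadic_blocks(n):
    """All dyadic blocks of length at least 2 inside [0, n), n a power of two."""
    blocks = []
    length = n
    while length >= 2:
        blocks.extend((lo, lo + length) for lo in range(0, n, length))
        length //= 2
    return blocks


def norm_sq(g, w):
    """Squared norm of g in the weighted space L2(w)."""
    return sum(x * x * wi for x, wi in zip(g, w))


def decompose(f, w):
    """Return the martingale differences over all blocks and the top-level expectation."""
    n = len(f)
    pieces = [martingale_difference(f, w, lo, hi) for lo, hi in dyadic_blocks(n)]
    return pieces, expectation(f, w, 0, n)
```

Summing all the pieces and the top-level expectation recovers f, and the squared weighted norms add up, which is the discrete scalar counterpart of (5.34) and of the identity that follows it.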
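For an operator TW acting formally from L2(W) to L2(V) define the paraproduct
$\Pi^{\mathbf{W}} = \Pi^{\mathbf{W}}_T$ of complexity r as
\[
\Pi^{\mathbf{W}} f = \sum_{Q \in \mathcal{D}} \sum_{R \in \operatorname{Ch}^r(Q)} \Delta^{\mathbf{V}}_R \big(T_{\mathbf{W}} \mathbb{E}^{\mathbf{W}}_Q f\big)
= \sum_{Q \in \mathcal{D}} \sum_{R \in \operatorname{Ch}^r(Q)} \Delta^{\mathbf{V}}_R \big(T_{\mathbf{W}} \langle f \rangle^{\mathbf{W}}_Q \mathbf{1}_Q\big); \tag{5.35}
\]
clearly, the bilinear form of the paraproduct $\Pi^{\mathbf{W}}$ is well defined for f, g ∈ L, i.e. $\Pi^{\mathbf{W}}$
also acts formally from L2(W) to L2(V). Similarly, for the adjoint $T^*_{\mathbf{V}}$ of TW, define
the paraproduct $\Pi^{\mathbf{V}} = \Pi^{\mathbf{V}}_{T^*}$ by
\[
\Pi^{\mathbf{V}} g = \sum_{Q \in \mathcal{D}} \sum_{R \in \operatorname{Ch}^r(Q)} \Delta^{\mathbf{W}}_R \big(T^*_{\mathbf{V}} \mathbb{E}^{\mathbf{V}}_Q g\big).
\]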
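Lemma 5.11.1. Let T = TW be a well-localized operator of radius r, acting formally
from L2(W) to L2(V). Then for any cubes Q ⊂ S and for any $R \in \operatorname{Ch}^r Q$ and for
all e ∈ Fd,
\[
\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \mathbf{1}_S e = \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \mathbf{1}_Q e.
\]
Remark 5.11.2. The above lemma states that in the formula (5.35) for paraproducts,
one can replace $\langle f \rangle^{\mathbf{W}}_Q \mathbf{1}_Q$ by $\langle f \rangle^{\mathbf{W}}_Q \mathbf{1}_S$ with an arbitrary cube S ⊃ Q. So formally we
can write in the right hand side of (5.35) the expression $\langle f \rangle^{\mathbf{W}}_Q \mathbf{1}$ instead of $\langle f \rangle^{\mathbf{W}}_Q \mathbf{1}_Q$,
which looks more in line with the definition of the paraproduct in the scalar case.
To make it even more similar to the scalar representation we should use
$T_{\mathbf{W}}(\mathbf{1} \otimes I_{\mathbb{F}^d})$ instead of $T_{\mathbf{W}} \mathbf{1}$ (to apply the operator TW to a matrix-valued function one just
needs to apply it to each column), and write the paraproduct $\Pi^{\mathbf{W}}$ as
\[
\Pi^{\mathbf{W}} f = \sum_{Q \in \mathcal{D}} \sum_{R \in \operatorname{Ch}^r Q} \Delta^{\mathbf{V}}_R \big(T_{\mathbf{W}}(\mathbf{1} \otimes I_{\mathbb{F}^d})\big) \langle f \rangle^{\mathbf{W}}_Q \tag{5.36}
\]
which is an alternative way of writing (5.35). The expression $T_{\mathbf{W}}(\mathbf{1} \otimes I_{\mathbb{F}^d})$ should be
understood as $T_{\mathbf{W}}(\mathbf{1}_S \otimes I_{\mathbb{F}^d})$ where S is an arbitrary cube with Q ⊂ S.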
Proof of Lemma 5.11.1. Let P be a cube such that P ≠ Q and rk P = rk Q. Since
TW is r-lower triangular,
\[
\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \mathbf{1}_P e = 0
\]
for any cube R ⊄ P, rk R ≥ rk P + r. In particular, the equality holds for any
$R \in \operatorname{Ch}^r Q$.
Since for a cube S ⊃ Q the set S \ Q is a (countable) union of cubes P with
rk P = rk Q, we conclude, using the weak continuity property (5.20), that for any
$R \in \operatorname{Ch}^r Q$
\[
\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \mathbf{1}_{S \setminus Q} e = 0,
\]
which proves the lemma.
Remark 5.11.3. As one can see, in the above proof we only used the fact that T is
r-lower triangular; more precisely, only a part of the definition was used.
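The following lemma states that the paraproducts $\Pi^{\mathbf{W}}$ and $\Pi^{\mathbf{V}}$ exhibit the same
behavior as TW and $T^*_{\mathbf{V}}$ respectively.
Lemma 5.11.4. Let TW be a well-localized operator of radius r (acting formally
from L2(W) to L2(V)), and let $\Pi^{\mathbf{W}} = \Pi^{\mathbf{W}}_T$ be the paraproduct of complexity r defined
as above. Then for Q,R ∈ D
(i) If rk R ≤ r + rk Q, then
\[
\Delta^{\mathbf{V}}_R \Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q = 0.
\]
(ii) If R ⊄ Q, then
\[
\Delta^{\mathbf{V}}_R \Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q = 0.
\]
(iii) If rk R > r + rk Q, then
\[
\Delta^{\mathbf{V}}_R \Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q = \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q
\]
and in particular if R ⊄ Q, both sides of the equality are zero.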
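Proof. Using summation indices Q′ and R′, we have
\[
\Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q = \sum_{Q' \in \mathcal{D}} \sum_{R' \in \operatorname{Ch}^r(Q')} \Delta^{\mathbf{V}}_{R'} \big(T_{\mathbf{W}} \mathbb{E}^{\mathbf{W}}_{Q'} \Delta^{\mathbf{W}}_Q\big),
\]
and since $\Delta^{\mathbf{V}}_R$ is orthogonal to $\Delta^{\mathbf{V}}_{R'}$ for all choices of R′ except for R, we have
\[
\Delta^{\mathbf{V}}_R \Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q = \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \mathbb{E}^{\mathbf{W}}_{Q'} \Delta^{\mathbf{W}}_Q,
\]
where $Q' = R^{(r)}$ is the rth order ancestor of R. Notice that $\mathbb{E}^{\mathbf{W}}_{Q'} \Delta^{\mathbf{W}}_Q \ne 0$ only if
$Q' \subsetneq Q$, so if rk R ≤ r + rk Q, then $\operatorname{rk} R^{(r)} \le \operatorname{rk} Q$, which implies $\mathbb{E}^{\mathbf{W}}_{Q'} \Delta^{\mathbf{W}}_Q = 0$, and
consequently
\[
\Delta^{\mathbf{V}}_R \Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q = 0,
\]
proving the first statement. Also, if R ⊄ Q, then $Q' = R^{(r)} \not\subset Q$. As above, this
implies $\mathbb{E}^{\mathbf{W}}_{Q'} \Delta^{\mathbf{W}}_Q = 0$, and consequently
\[
\Delta^{\mathbf{V}}_R \Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q = 0,
\]
which proves the second statement.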
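To prove the third statement, assume rk R > r + rk Q. If R ⊄ Q, we can use our
previous result and the fact that T is well-localized to conclude:
\[
\Delta^{\mathbf{V}}_R \Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q = 0 = \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q.
\]
It now suffices to consider the case R ⊂ Q. Recall that $Q' = R^{(r)}$. Since Q ∩ Q′ ≠ ∅,
we can look at ranks to conclude that $Q' \subsetneq Q$. Choose $\widetilde{Q} \in \operatorname{Ch} Q$ with $Q' \subseteq \widetilde{Q}$.
Then, using the fact that T is r-lower triangular, we have
\[
\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q
= \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Big( \sum_{S \in \operatorname{Ch} Q} \mathbb{E}^{\mathbf{W}}_S - \mathbb{E}^{\mathbf{W}}_Q \Big)
= \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \big( \mathbb{E}^{\mathbf{W}}_{\widetilde{Q}} - \mathbb{E}^{\mathbf{W}}_Q \cdot \mathbf{1}_{\widetilde{Q}} \big).
\]
Using earlier arguments and $Q' \subsetneq Q$, we can write $\Delta^{\mathbf{V}}_R \Pi^{\mathbf{W}} \Delta^{\mathbf{W}}_Q$ as
\[
\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \mathbb{E}^{\mathbf{W}}_{Q'} \Delta^{\mathbf{W}}_Q
= \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \big( \mathbb{E}^{\mathbf{W}}_{\widetilde{Q}} \cdot \mathbf{1}_{Q'} - \mathbb{E}^{\mathbf{W}}_Q \cdot \mathbf{1}_{Q'} \big)
= \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \big( \mathbb{E}^{\mathbf{W}}_{\widetilde{Q}} - \mathbb{E}^{\mathbf{W}}_Q \cdot \mathbf{1}_{\widetilde{Q}} \big),
\]
where the last equality follows by Lemma 5.11.1, completing the proof.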
5.12 Estimates of the paraproducts
We restate the Carleson embedding theorem, presented in the previous chapter and
in [CT15]. This result will be used to control the norms of the paraproducts:
Theorem 5.12.1 (The matrix weighted Carleson Embedding Theorem). Let W be
a d × d matrix-valued measure and let $A_Q$ be positive semidefinite d × d matrices.
The following statements are equivalent:
(i) $\displaystyle \sum_{Q \in \mathcal{D}} \Big\| A_Q^{1/2} \int_Q d\mathbf{W} f \Big\|^2 \le A \|f\|^2_{L^2(\mathbf{W})}$;
(ii) $\displaystyle \sum_{Q \in \mathcal{D}(Q_0)} \mathbf{W}(Q) A_Q \mathbf{W}(Q) \le B\, \mathbf{W}(Q_0)$ for all $Q_0 \in \mathcal{D}$.
Moreover, for the best constants A and B we have B ≤ A ≤ C(d)B, where C(d) is
a constant depending only on the dimension d.
Remark. As shown in the previous chapter, $C(d) = e \cdot d^3 (d+1)^2$, where e is the base
of the natural logarithm. We are not sure whether this estimate gives the optimal asymptotics
in terms of the dimension d, but that seems unlikely.
We now bound the paraproducts as follows:
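Lemma 5.12.2. Let $\Pi^{\mathbf{W}}$ be the paraproduct defined earlier and assume that the
well-localized operator TW satisfies the testing condition
\[
\sum_{\substack{R \in \mathcal{D}(Q) \\ \operatorname{rk} R \ge \operatorname{rk} Q + r}} \|\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \mathbf{1}_Q e\|^2_{L^2(\mathbf{V})} \le \mathfrak{T}_1^2 \|\mathbf{1}_Q e\|^2_{L^2(\mathbf{W})} \tag{5.37}
\]
for all Q ∈ D and e ∈ Fd. Then $\Pi^{\mathbf{W}}$ is bounded from L2(W) to L2(V) and
\[
\big\|\Pi^{\mathbf{W}}\big\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le C(d)^{1/2}\, \mathfrak{T}_1,
\]
where C(d) is the constant in Theorem 5.12.1.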
Remark 5.12.3. The testing condition (5.37) is clearly weaker than the testing con-
dition (i) from Theorem 5.8.3; the constant T1 from (5.37) is majorized by the
corresponding constant from (i).
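Proof of Lemma 5.12.2. Fix f in the dense set L ⊂ L2(W). Then by orthogonality,
\[
\|\Pi^{\mathbf{W}} f\|^2_{L^2(\mathbf{V})} = \sum_{Q \in \mathcal{D}} \sum_{R \in \operatorname{Ch}^r(Q)} \|\Delta^{\mathbf{V}}_R (T_{\mathbf{W}} \mathbb{E}^{\mathbf{W}}_Q f)\|^2_{L^2(\mathbf{V})}.
\]
To control this expression, we use Theorem 5.12.1. First, for each Q ∈ D, define the
linear map $B_Q : \mathbb{F}^d \to L^2(\mathbf{V})$ by
\[
B_Q e = \sum_{R \in \operatorname{Ch}^r(Q)} \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \big(\mathbf{W}(Q)^{-1} \mathbf{1}_Q e\big), \qquad \forall e \in \mathbb{F}^d.
\]
Then defining $A_Q := B_Q^* B_Q : \mathbb{F}^d \to \mathbb{F}^d$, we can write
\[
\|\Pi^{\mathbf{W}} f\|^2_{L^2(\mathbf{V})} = \sum_{Q \in \mathcal{D}} \Big\| A_Q^{1/2} \int_Q d\mathbf{W} f \Big\|^2,
\]
so we are able to apply Theorem 5.12.1.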
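To prove condition (ii) in Theorem 5.12.1, fix $Q_0 \in \mathcal{D}$, e ∈ Fd, and use the
definitions of $A_Q$, $B_Q$ to obtain
\[
\sum_{Q \in \mathcal{D}(Q_0)} \big\| A_Q^{1/2} \mathbf{W}(Q) e \big\|^2
= \sum_{Q \in \mathcal{D}(Q_0)} \big\| B_Q \mathbf{W}(Q) e \big\|^2
= \sum_{Q \in \mathcal{D}(Q_0)} \sum_{R \in \operatorname{Ch}^r(Q)} \|\Delta^{\mathbf{V}}_R (T_{\mathbf{W}} \mathbf{1}_Q e)\|^2_{L^2(\mathbf{V})}.
\]
Then using Lemma 5.11.1 and the testing condition (5.37) we get
\[
\begin{aligned}
\sum_{Q \in \mathcal{D}(Q_0)} \big\| A_Q^{1/2} \mathbf{W}(Q) e \big\|^2
&= \sum_{Q \in \mathcal{D}(Q_0)} \sum_{R \in \operatorname{Ch}^r(Q)} \|\Delta^{\mathbf{V}}_R (T_{\mathbf{W}} \mathbf{1}_{Q_0} e)\|^2_{L^2(\mathbf{V})} && \text{by Lemma 5.11.1} \\
&\le \mathfrak{T}_1^2 \|\mathbf{1}_{Q_0} e\|^2_{L^2(\mathbf{W})} && \text{by (5.37)},
\end{aligned}
\]
so condition (ii) of Theorem 5.12.1 is verified with $B = \mathfrak{T}_1^2$. Thus
\[
\|\Pi^{\mathbf{W}} f\|^2_{L^2(\mathbf{V})} \le C(d)\, \mathfrak{T}_1^2 \|f\|^2_{L^2(\mathbf{W})},
\]
which completes the proof.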
5.13 Estimates of well-localized operators
In this section we will prove Theorem 5.8.3. Theorem 5.8.1 will follow automatically,
since the bounds T1,2 and their duals T∗1,2 from Theorem 5.8.3 are trivially majorized
by the corresponding bounds from Theorem 5.8.1, and the bound T3 from Theorem
5.8.3 is dominated by the minimum of T1 and T∗1 from Theorem 5.8.1. We will also
explain Remark 5.8.4, claiming that the weak estimate (iii) of Theorem 5.8.3 can be
relaxed and sometimes ignored.
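To prove Theorem 5.8.3, we estimate the bilinear form of the operator TW. Let
f ∈ L2(W) and g ∈ L2(V), with $\|f\|_{L^2(\mathbf{W})} = \|g\|_{L^2(\mathbf{V})} = 1$, be from the dense set L
of finite linear combinations of characteristic functions of atoms, i.e.
\[
f = \sum_{j=1}^{N} a_j \mathbf{1}_{Q_j} e_j \quad \text{and} \quad g = \sum_{k=1}^{M} b_k \mathbf{1}_{R_k} v_k, \tag{5.38}
\]
where $Q_j, R_k \in \mathcal{D}$ and $e_j, v_k \in \mathbb{F}^d$. As such functions are dense in L2(W) and L2(V),
to obtain the result, we need to show that
\[
|\langle T_{\mathbf{W}} f, g \rangle_{L^2(\mathbf{V})}| \le C \|f\|_{L^2(\mathbf{W})} \|g\|_{L^2(\mathbf{V})}. \tag{5.39}
\]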
Let us first perform some simplifications. Define an equivalence relation ∼ on D,
by saying that Q ∼ R if Q and R have a common ancestor (i.e. if Q,R ⊂ S for some
S ∈ D).
Since T is a localized operator, $\langle T_{\mathbf{W}} \mathbf{1}_Q e, \mathbf{1}_R v \rangle_{L^2(\mathbf{V})} = 0$ if Q and R are in different
equivalence classes. Therefore, it is sufficient to prove (5.39) under the assumption that
all $Q_j$, $R_k$ in the representation (5.38) are in the same equivalence class. Then,
taking the direct sum over equivalence classes, we get the general case.
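Let $Q_0 \in \mathcal{D}$ be a common ancestor of all $Q_j$, $R_k$. Then, by (5.34), we can write
f, g using the orthogonal decompositions:
\[
f = \sum_{Q \in \mathcal{D}(Q_0)} \Delta^{\mathbf{W}}_Q f + \mathbb{E}^{\mathbf{W}}_{Q_0} f =: f_1 + f_2; \tag{5.40}
\]
\[
g = \sum_{R \in \mathcal{D}(Q_0)} \Delta^{\mathbf{V}}_R g + \mathbb{E}^{\mathbf{V}}_{Q_0} g =: g_1 + g_2. \tag{5.41}
\]
We will estimate the four terms $\langle T_{\mathbf{W}} f_j, g_k \rangle_{L^2(\mathbf{V})}$ for 1 ≤ j, k ≤ 2 separately.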
5.14 Estimate of the main part
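To estimate $\langle T_{\mathbf{W}} f_1, g_1 \rangle_{L^2(\mathbf{V})}$ let us first notice that by Lemma 5.12.2, the testing
condition (i) of Theorem 5.8.3 and its dual counterpart imply that the paraproducts
$\Pi^{\mathbf{W}} = \Pi^{\mathbf{W}}_T$ and $\Pi^{\mathbf{V}} = \Pi^{\mathbf{V}}_{T^*}$ are bounded and that
\[
\|\Pi^{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} + \|\Pi^{\mathbf{V}}\|_{L^2(\mathbf{V}) \to L^2(\mathbf{W})} \le C(d)^{1/2} (\mathfrak{T}_1 + \mathfrak{T}_1^*).
\]
Thus, it is sufficient to estimate the operator $\widetilde{T}_{\mathbf{W}} := T_{\mathbf{W}} - \Pi^{\mathbf{W}} - (\Pi^{\mathbf{V}})^*$. Lemma
5.11.4 implies that
\[
\Delta^{\mathbf{V}}_R \widetilde{T}_{\mathbf{W}} \Delta^{\mathbf{W}}_Q =
\begin{cases}
\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q, & |\operatorname{rk} Q - \operatorname{rk} R| \le r; \\
0, & |\operatorname{rk} Q - \operatorname{rk} R| > r,
\end{cases}
\]
so
\[
\begin{aligned}
\langle \widetilde{T}_{\mathbf{W}} f_1, g_1 \rangle_{L^2(\mathbf{V})}
&= \sum_{\substack{Q, R \in \mathcal{D}(Q_0) \\ |\operatorname{rk} Q - \operatorname{rk} R| \le r}} \langle T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q f, \Delta^{\mathbf{V}}_R g \rangle_{L^2(\mathbf{V})} \\
&= \sum_{\substack{Q, R \in \mathcal{D}(Q_0) \\ \operatorname{rk} Q \le \operatorname{rk} R \le \operatorname{rk} Q + r}} \langle T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q f, \Delta^{\mathbf{V}}_R g \rangle_{L^2(\mathbf{V})}
+ \sum_{\substack{Q, R \in \mathcal{D}(Q_0) \\ \operatorname{rk} R < \operatorname{rk} Q \le \operatorname{rk} R + r}} \langle T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q f, \Delta^{\mathbf{V}}_R g \rangle_{L^2(\mathbf{V})}.
\end{aligned}
\]
Let us estimate the first sum. The second sum will be treated similarly, by considering
the dual operator $T^*_{\mathbf{V}}$. To estimate the first sum, we need to estimate the operator
\[
T^{+}_{\mathbf{W}} := \sum_{\substack{Q, R \in \mathcal{D}(Q_0) \\ \operatorname{rk} Q \le \operatorname{rk} R \le \operatorname{rk} Q + r}} \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q.
\]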
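Since T is r-lower triangular, we can see that $\Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q = 0$ if rk R ≥ rk Q and
$R \not\subset Q^{(r)}$. So we can rewrite $T^{+}_{\mathbf{W}}$ as
\[
T^{+}_{\mathbf{W}} = \sum_{S \in \mathcal{D}(Q_0^{(r)})} \; \sum_{Q \in \operatorname{Ch}^r S} \; \sum_{\substack{R \in \mathcal{D}(S) \\ \operatorname{rk} Q \le \operatorname{rk} R \le \operatorname{rk} Q + r}} \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q
=: \sum_{S \in \mathcal{D}(Q_0^{(r)})} T^{+,S}_{\mathbf{W}},
\]
where
\[
T^{+,S}_{\mathbf{W}} = \sum_{Q \in \operatorname{Ch}^r S} \; \sum_{\substack{R \in \mathcal{D}(S) \\ \operatorname{rk} Q \le \operatorname{rk} R \le \operatorname{rk} Q + r}} \Delta^{\mathbf{V}}_R T_{\mathbf{W}} \Delta^{\mathbf{W}}_Q.
\]
The testing condition (ii) of Theorem 5.8.3 implies that
\[
\|T^{+,S}_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le \mathfrak{T}_2. \tag{5.42}
\]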
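Note that if S ∩ S′ = ∅ or |rk S − rk S′| > r then
\[
\operatorname{Ran} T^{+,S}_{\mathbf{W}} \perp \operatorname{Ran} T^{+,S'}_{\mathbf{W}}, \qquad
\big(\ker T^{+,S}_{\mathbf{W}}\big)^{\perp} \perp \big(\ker T^{+,S'}_{\mathbf{W}}\big)^{\perp}.
\]
Therefore for fixed k ∈ Z, the operator $T^{+,k}_{\mathbf{W}}$ defined by
\[
T^{+,k}_{\mathbf{W}} := \sum_{j \in \mathbb{Z}} \; \sum_{\substack{S \in \mathcal{D}(Q_0^{(r)}) \\ \operatorname{rk} S = k + (r+1)j}} T^{+,S}_{\mathbf{W}}
\]
is the direct sum of the corresponding operators $T^{+,S}_{\mathbf{W}}$, and the estimate (5.42) implies
\[
\|T^{+,k}_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le \mathfrak{T}_2. \tag{5.43}
\]
Since $T^{+}_{\mathbf{W}} = \sum_{k=0}^{r} T^{+,k}_{\mathbf{W}}$, we can easily conclude from (5.43) that
\[
\|T^{+}_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le (r+1)\, \mathfrak{T}_2.
\]
However, by being more careful, we can obtain the following better dependence on
r:
\[
\|T^{+}_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le (r+1)^{1/2}\, \mathfrak{T}_2. \tag{5.44}
\]
To get this, observe that for 0 ≤ j < k ≤ r
\[
\big(\ker T^{+,j}_{\mathbf{W}}\big)^{\perp} \perp \big(\ker T^{+,k}_{\mathbf{W}}\big)^{\perp}.
\]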
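Then, decomposing $f_1 = \sum_{k=0}^{r} f^k$, where
\[
f^k := \sum_{n \in \mathbb{Z}} \; \sum_{\substack{S \in \mathcal{D} \\ \operatorname{rk} S = k + (r+1)n}} \; \sum_{Q \in \operatorname{Ch}^r S} \Delta^{\mathbf{W}}_Q f,
\]
we get that
\[
\|T^{+}_{\mathbf{W}} f_1\|_{L^2(\mathbf{V})}
= \Big\| T^{+}_{\mathbf{W}} \sum_{k=0}^{r} f^k \Big\|_{L^2(\mathbf{V})}
= \Big\| \sum_{k=0}^{r} T^{+,k}_{\mathbf{W}} f^k \Big\|_{L^2(\mathbf{V})}
\le \mathfrak{T}_2 \sum_{k=0}^{r} \|f^k\|_{L^2(\mathbf{W})}
\le \mathfrak{T}_2 (r+1)^{1/2} \Big( \sum_{k=0}^{r} \|f^k\|^2_{L^2(\mathbf{W})} \Big)^{1/2};
\]
here the last inequality is a consequence of the Cauchy–Schwarz inequality.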
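The gain from (r + 1) to (r + 1)^{1/2} in the last step comes entirely from comparing a sum of r + 1 non-negative numbers with the square root of the sum of their squares. A minimal numerical sketch of that comparison follows; the function name `l1_vs_l2` is hypothetical and the inputs stand in for the norms of the pieces f^k.

```python
import math


def l1_vs_l2(norms):
    """Return (sum of the norms, sqrt(number of terms) * their l2 norm)."""
    total = sum(norms)
    l2 = math.sqrt(sum(a * a for a in norms))
    return total, math.sqrt(len(norms)) * l2
```

The first returned value never exceeds the second, and the two coincide when all the entries are equal, so the factor (r + 1)^{1/2} cannot be improved by this argument alone.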
5.15 Estimates of parts involving constant functions
The estimates
\[
|\langle T_{\mathbf{W}} f_2, g_1 \rangle_{L^2(\mathbf{V})}| \le \mathfrak{T}_1, \qquad
|\langle T_{\mathbf{W}} f_1, g_2 \rangle_{L^2(\mathbf{V})}| \le \mathfrak{T}_1^*
\]
follow immediately from the testing condition (i) and its dual. The estimate
\[
|\langle T_{\mathbf{W}} f_2, g_2 \rangle_{L^2(\mathbf{V})}| \le \mathfrak{T}_3
\]
is a direct corollary of the assumption (iii).
Note that in decompositions (5.40) and (5.41), we can replace $Q_0$ by any of its
ancestors, so, as pointed out in Remark 5.8.4, it is sufficient that the estimate (iii)
hold only for sufficiently large cubes Q (meaning that for any $Q_0 \in \mathcal{D}$ we can find
Q ∈ D, $Q_0 \subset Q$ such that (iii) holds for Q).
Moreover, if for the increasing sequence of cubes $Q_n$, n ≥ 0, where $Q_{n+1}$ is the
parent of $Q_n$, we have that $\mathbf{W}(Q_n) \ge \alpha_n I$, $\alpha_n \nearrow \infty$, then writing the decomposition
(5.40) with $Q_n$ instead of $Q_0$ and letting n → +∞, we obtain
\[
f = \sum_{Q \in \mathcal{D}} \Delta^{\mathbf{W}}_Q f =: f_1.
\]
The analogous condition for V implies a similar representation for g, so the theorem is
reduced to estimating $\langle T_{\mathbf{W}} f_1, g_1 \rangle_{L^2(\mathbf{V})}$, which was done using only testing conditions
(i), (ii) and their duals.
5.16 Estimates of the Haar shifts
In this section we will prove Lemma 5.9.1. Theorem 5.9.2 is then a simple corollary of
Theorem 5.8.3. We will need the following lemma, which is well known to specialists;
for the convenience of the reader we present its proof here.
Lemma 5.16.1. Let T be an integral operator with kernel K, $Tf(x) = \int K(x,y) f(y)\, dy$,
where K is supported on Q × Q (Q ∈ D) and $\|K\|_\infty \le |Q|^{-1}$. Assuming that the
weights V, W satisfy the matrix A2 condition (5.27) we have for the operator TW,
TWf = T (Wf),
\[
\|T_{\mathbf{W}}\|_{L^2(\mathbf{W}) \to L^2(\mathbf{V})} \le d^{1/2} [\mathbf{V},\mathbf{W}]^{1/2}_{A_2}.
\]
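Proof. Take f ∈ L2(W), g ∈ L2(V), $\|f\|_{L^2(\mathbf{W})} = \|g\|_{L^2(\mathbf{V})} = 1$. As discussed above
in Chapter 2, we can assume without loss of generality that the measures V and
W are absolutely continuous with respect to scalar measures v and w respectively,
dV = V dv, dW = W dw. We then can write
\[
\big|\langle T_{\mathbf{W}} f, g \rangle_{L^2(\mathbf{V})}\big| \le \iint_{Q \times Q} \big|\langle V(x) K(x,y) W(y) f(y), g(x) \rangle_{\mathbb{F}^d}\big| \, dv(x)\, dw(y).
\]
The integral then can be estimated by
\[
\begin{aligned}
&|Q|^{-1} \iint_{Q \times Q} \|V^{1/2}(x) W^{1/2}(y)\| \cdot \|V^{1/2}(x) g(x)\|_{\mathbb{F}^d} \|W^{1/2}(y) f(y)\|_{\mathbb{F}^d} \, dv(x)\, dw(y) \\
&\quad \le \Big( \iint_{Q \times Q} \|V^{1/2}(x) g(x)\|^2_{\mathbb{F}^d} \|W^{1/2}(y) f(y)\|^2_{\mathbb{F}^d} \, dv(x)\, dw(y) \Big)^{1/2}
\Big( |Q|^{-2} \iint_{Q \times Q} \|V^{1/2}(x) W^{1/2}(y)\|^2 \, dv(x)\, dw(y) \Big)^{1/2} \\
&\quad \le \|f\|_{L^2(\mathbf{W})} \|g\|_{L^2(\mathbf{V})} \Big( |Q|^{-2} \iint_{Q \times Q} \|V^{1/2}(x) W^{1/2}(y)\|^2 \, dv(x)\, dw(y) \Big)^{1/2}.
\end{aligned}
\]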
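In the last integral, we can replace the operator norm by the Frobenius (Hilbert–Schmidt)
norm $\|\cdot\|_{S_2}$ (recall that $\|A\|^2_{S_2} = \operatorname{tr}(A^* A)$):
\[
\begin{aligned}
\iint_{Q \times Q} \|V^{1/2}(x) W^{1/2}(y)\|^2 \, dv(x)\, dw(y)
&\le \iint_{Q \times Q} \|V^{1/2}(x) W^{1/2}(y)\|^2_{S_2} \, dv(x)\, dw(y) \\
&= \iint_{Q \times Q} \operatorname{tr}\big(V(x) W(y)\big) \, dv(x)\, dw(y) \\
&= \operatorname{tr}\big(\mathbf{V}(Q) \mathbf{W}(Q)\big) = \|\mathbf{V}(Q)^{1/2} \mathbf{W}(Q)^{1/2}\|^2_{S_2} \\
&\le d \cdot \|\mathbf{V}(Q)^{1/2} \mathbf{W}(Q)^{1/2}\|^2 \le d \cdot |Q|^2 [\mathbf{V},\mathbf{W}]_{A_2}.
\end{aligned}
\]
Combining this with the previous estimate, we get the conclusion of the lemma.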
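The Cauchy–Schwarz argument of this proof can be tested in the simplest discrete situation. The sketch below is an illustration under assumptions that are not part of the lemma: the weights are scalar (d = 1), Q is a set of n points with counting measure (so |Q| = n), and the kernel is an n × n array with entries bounded by 1/n. The function names are hypothetical.

```python
import math


def bilinear_form(K, v, w, f, g):
    """Sum over x, y of v[x] K[x][y] w[y] f[y] g[x]: the form <T(wf), g> in L2(v)."""
    n = len(v)
    return sum(v[x] * K[x][y] * w[y] * f[y] * g[x]
               for x in range(n) for y in range(n))


def weighted_norm(f, w):
    """Norm of f in the weighted space L2(w)."""
    return math.sqrt(sum(fi * fi * wi for fi, wi in zip(f, w)))


def a2_bound(v, w):
    """(v(Q) w(Q))^{1/2} / |Q| for the counting measure on Q = {0, ..., n - 1}."""
    return math.sqrt(sum(v) * sum(w)) / len(v)
```

For any such data the absolute value of `bilinear_form` is at most `a2_bound(v, w)` times the product of the two weighted norms, which is the d = 1 form of the estimate in Lemma 5.16.1 restricted to a single cube.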
5.17 Comparison of different truncations
In the testing conditions from Theorems 5.8.3 and 5.9.2, we used different truncations
of the operator TW, namely $T^Q_{\mathbf{W}}$ and $(T^Q)_{\mathbf{W}}$ respectively. These operators are
generally different, but their difference can be estimated.
To state the estimate, we will require some new notation. Let $P^{\mathbf{V}}_Q$ be the orthogonal
projection in L2(V) onto the subspace of functions supported on Q and
orthogonal to $\{\mathbf{1}_Q e : e \in \mathbb{F}^d\}$. Then for f ∈ L:
\[
P^{\mathbf{V}}_Q f = \sum_{R \in \mathcal{D}(Q)} \Delta^{\mathbf{V}}_R f = \mathbf{1}_Q f - \mathbb{E}^{\mathbf{V}}_Q f.
\]
In this notation, the operator $T^Q_{\mathbf{W}}$ defined above can be written as $T^Q_{\mathbf{W}} = P^{\mathbf{V}}_Q T_{\mathbf{W}}$.
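Lemma 5.17.1. For operators $T^Q_{\mathbf{W}}$ and $(T^Q)_{\mathbf{W}}$ introduced above, we have for f
supported on Q
\[
\Big\| \big(T^Q_{\mathbf{W}} - P^{\mathbf{V}}_Q (T^Q)_{\mathbf{W}}\big) f \Big\|_{L^2(\mathbf{V})} \le d^{1/2} r \cdot [\mathbf{V},\mathbf{W}]^{1/2}_{A_2} \|f\|_{L^2(\mathbf{W})}.
\]
Proof. For f ∈ L and supported on Q, we have
\[
\big(T^Q_{\mathbf{W}} - P^{\mathbf{V}}_Q (T^Q)_{\mathbf{W}}\big) f = \sum_{k=1}^{r} P^{\mathbf{V}}_Q T_{Q^{(k)}}(\mathbf{W} f)
\]
(the terms $T_{Q^{(k)}}(\mathbf{W} f)$ with k > r are annihilated by $P^{\mathbf{V}}_Q$).
Each operator $T_{Q^{(k)}}$ is an integral operator with kernel $K_{Q^{(k)}}$ supported on $Q^{(k)} \times Q^{(k)}$
and satisfying $\|K_{Q^{(k)}}\|_\infty \le |Q^{(k)}|^{-1}$. Therefore applying Lemma 5.16.1 and using
the fact that $P^{\mathbf{V}}_Q$ is an orthogonal projection (and so a contraction) in L2(V) we get
\[
\|P^{\mathbf{V}}_Q T_{Q^{(k)}}(\mathbf{W} f)\|_{L^2(\mathbf{V})} \le d^{1/2} [\mathbf{V},\mathbf{W}]^{1/2}_{A_2} \|f\|_{L^2(\mathbf{W})}.
\]
Summation over k completes the proof.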
5.18 Proof of Lemma 5.9.1
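We conclude this chapter by proving Lemma 5.9.1. Assume that the testing condition
(5.29) holds. Applying Lemma 5.17.1 with $f = \mathbf{1}_Q e$ and noticing that
\[
\|P^{\mathbf{V}}_Q (T^Q)_{\mathbf{W}} f\|_{L^2(\mathbf{V})} \le \|(T^Q)_{\mathbf{W}} f\|_{L^2(\mathbf{V})} \le \mathfrak{T} \|f\|_{L^2(\mathbf{W})},
\]
we immediately get (5.30).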
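To get (5.31), some more work is needed. Write
\[
T^Q = T^{r+1} + \sum_{k=0}^{r} T_k,
\]
where
\[
T^{r+1} = \sum_{R \in \operatorname{Ch}^{r+1} Q} T^R, \qquad T_k = \sum_{R \in \operatorname{Ch}^k Q} T_R,
\]
with the obvious agreement that $\operatorname{Ch}^0 Q = \{Q\}$.
Following the agreed upon notation, for a scalar integral operator T, we denote
by TW the operator given by
\[
T_{\mathbf{W}} f := T(\mathbf{W} f),
\]
whenever this expression is well defined.
The operators $T_R$ are R-localized, meaning that $T_R f = T_R(\mathbf{1}_R f)$, and $T_R f$ is
supported on R, and the same holds for $T^R$. The functions $f \in \mathcal{D}^{\mathbf{W},r}_Q$ are constant
on cubes $R \in \operatorname{Ch}^{r+1} Q$, so using the testing condition (5.29) and the fact that the
operators $T^R$ are R-localized, we get for $f \in \mathcal{D}^{\mathbf{W},r}_Q$
\begin{align}
\|(T^{r+1})_{\mathbf{W}} f\|^2_{L^2(\mathbf{V})}
= \sum_{R \in \operatorname{Ch}^{r+1} Q} \|T^R(\mathbf{W} \mathbf{1}_R f)\|^2_{L^2(\mathbf{V})}
&\le \sum_{R \in \operatorname{Ch}^{r+1} Q} \mathfrak{T}^2 \|\mathbf{1}_R f\|^2_{L^2(\mathbf{W})} \tag{5.45} \\
&= \mathfrak{T}^2 \|f\|^2_{L^2(\mathbf{W})}. \tag{5.46}
\end{align}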
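To estimate the operators $T_k$, we estimate each block $T_R$ by Lemma 5.16.1, and using
the fact that $T_R$ is R-localized we get for $f \in \mathcal{D}^{\mathbf{W},r}_Q$
\[
\begin{aligned}
\|(T_k)_{\mathbf{W}} f\|^2_{L^2(\mathbf{V})}
&= \sum_{R \in \operatorname{Ch}^k Q} \|T_R(\mathbf{W} \mathbf{1}_R f)\|^2_{L^2(\mathbf{V})} \\
&\le \sum_{R \in \operatorname{Ch}^k Q} d \cdot [\mathbf{V},\mathbf{W}]_{A_2} \|\mathbf{1}_R f\|^2_{L^2(\mathbf{W})}
= d \cdot [\mathbf{V},\mathbf{W}]_{A_2} \|f\|^2_{L^2(\mathbf{W})}.
\end{aligned}
\]
Adding these estimates for k = 0, 1, . . . , r and combining them with (5.45), we see
that for any $f \in \mathcal{D}^{\mathbf{W},r}_Q$
\[
\|(T^Q)_{\mathbf{W}} f\|_{L^2(\mathbf{V})} \le \big(d^{1/2} (r+1) [\mathbf{V},\mathbf{W}]^{1/2}_{A_2} + \mathfrak{T}\big) \|f\|_{L^2(\mathbf{W})}.
\]
Since the projection $P^{\mathbf{V}}_Q$ is a contraction in L2(V), the same estimate holds for the
norm $\|P^{\mathbf{V}}_Q (T^Q)_{\mathbf{W}} f\|_{L^2(\mathbf{V})}$, so combining it with Lemma 5.17.1, we obtain (5.31).
Finally, to show that (5.32) holds, let us recall that $T = \sum_{R \in \mathcal{R}} T_R$, where $\mathcal{R} \subset \mathcal{D}$
is some finite collection. Then for each $Q_0 \in \mathcal{D}$ we can find a cube $Q \supset Q_0$ which
is not contained in any $R \in \mathcal{R}$. Then $T_{\mathbf{W}} \mathbf{1}_Q e = (T^Q)_{\mathbf{W}} \mathbf{1}_Q e$, and (5.32) follows from
(5.29).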
Bibliography
[BCTW16] K. Bickel, A. Culiuc, S. Treil, and B. Wick. Two weight estimates with
matrix measures for well localized operators. In preparation, 2016.
[Bez08] O. Beznosova. Linear bound for the dyadic paraproduct on weighted
Lebesgue space L2(w). J. Funct. Anal., 255(4):994–1007, 2008.
[BPW14] K. Bickel, S. Petermichl, and B. Wick. Bounds for the Hilbert transform
with matrix A2 weights. preprint arXiv:1402.3886, 2014.
[Buc93] S. Buckley. Estimates for operator norms on weighted spaces and reverse
Jensen inequalities. Trans. Amer. Math. Soc., 340(1):253–272, 1993.
[BW14] K. Bickel and B. Wick. Well-localized operators on matrix weighted L2
spaces. preprint arXiv:1407.3819, 2014.
[BW15] K. Bickel and B. Wick. A study of the matrix Carleson embedding theorem
with applications to sparse operators. preprint arXiv:1503.06493, 2015.
[CDPO16] A. Culiuc, F. Di Plinio, and Y. Ou. Domination of multilinear singular
integrals by positive sparse forms. preprint arXiv:1603.05317, 2016.
[CF74] R. R. Coifman and C. Fefferman. Weighted norm inequalities for maximal
functions and singular integrals. Studia Math., 51:241–250, 1974.
[CT15] A. Culiuc and S. Treil. The Carleson embedding theorem with matrix
weights. preprint arXiv:1401.6570, 2015.
[Cul15] A. Culiuc. A note on two weight bounds for the generalized
Hardy-Littlewood maximal operator. preprint arXiv:0911.3437, 2015.
[CUMP10] D. Cruz-Uribe, J. Martell, and C. Pérez. Sharp weighted estimates for
approximating dyadic operators. Electron. Res. Announc. Math. Sci.,
17:12–19, 2010.
[CUMP12] D. Cruz-Uribe, J. Martell, and C. Pérez. Sharp weighted estimates for
classical operators. Adv. Math., 229(1):408–441, 2012.
[CW15] A. Culiuc and B. Wick. The boundedness of the Riesz and
Ahlfors-Beurling transforms in matrix weighted spaces. preprint available upon
request, 2015.
[DV03] O. Dragičević and A. Volberg. Sharp estimate of the Ahlfors-Beurling
operator via averaging martingale transforms. Michigan Math. J.,
51(2):415–435, 2003.
[FG13] F. Farroni and R. Giova. Change of variables for A∞ weights by means of
quasiconformal mappings: sharp results. Ann. Acad. Sci. Fenn. Math.,
38(2):785–796, 2013.
[FKP91] R. A. Fefferman, C. E. Kenig, and J. Pipher. The theory of weights
and the Dirichlet problem for elliptic equations. Ann. of Math. (2),
134(1):65–124, 1991.
[HMW73] R. Hunt, B. Muckenhoupt, and R. Wheeden. Weighted norm inequalities
for the conjugate function and Hilbert transform. Trans. Amer. Math.
Soc., 176:227–251, 1973.
[HS60] H. Helson and G. Szegö. A problem in prediction theory. Ann. Mat.
Pura ed App., 51(1):107–138, 1960.
[Hyt12a] T. Hytönen. The A2 theorem: Remarks and complements. preprint
arXiv:1212.3840, 2012.
[Hyt12b] T. Hytönen. The sharp weighted bound for general Calderón-Zygmund
operators. Ann. of Math. (2), 175(3):1473–1506, 2012.
[IKP14] J. Isralowitz, H. Kwon, and S. Pott. A matrix weighted T1 theorem
for matrix kernelled Calderón-Zygmund operators. preprint
arXiv:1508.01716, 2014.
[IM01] T. Iwaniec and G. Martin. Geometric function theory and non-linear
analysis. Oxford University Press, Oxford, 2001.
[Lac15] M. Lacey. An elementary proof of the A2 Bound. preprint
arXiv:1501.05818, 2015.
[Lai15] J. Lai. Two weight problems and Bellman functions on filtered spaces.
PhD thesis, Brown University, 2015.
[Ler12] A. Lerner. On an estimate of Calderón-Zygmund operators by dyadic
positive operators. preprint arXiv:1202.1860, 2012.
[Ler13] A. Lerner. A simple proof of the A2 conjecture. Int. Math. Res. Not.,
(14):3159–3170, 2013.
[LPR10] M. Lacey, S. Petermichl, and M. Reguera. Sharp A2 inequality for Haar
shift operators. Math. Ann., 348(1):127–141, 2010.
[LS09] M. Lacey, E. Sawyer, and I. Uriarte-Tuero. Two weight inequalities
for discrete positive operators. preprint arXiv:1506.07125, 2009.
[Mel05] A. Melas. The Bellman functions of dyadic-like maximal operators and
related inequalities. Adv. Math., 192(2):310–340, 2005.
[Nie10] M. Nielsen. On stability of finitely generated shift-invariant systems. J.
Fourier Anal. Appl., 16(6):901–920, 2010.
[NPTV02] F. Nazarov, G. Pisier, S. Treil, and A. Volberg. Sharp estimates in
vector Carleson imbedding theorem and for vector paraproducts. J. Reine
Angew. Math., 542:147–171, 2002.
[NT96] F. Nazarov and S. Treil. The hunt for a Bellman function: applications to
estimates for singular integral operators and to other classical problems
of harmonic analysis (Russian). Algebra i Analiz, 8(5):32–162, 1996.
[NTV99] F. Nazarov, S. Treil, and A. Volberg. The Bellman functions and
two-weight inequalities for Haar multipliers. J. Amer. Math. Soc.,
12(4):909–928, 1999.
[NTV01] F. Nazarov, S. Treil, and A. Volberg. Bellman function in stochastic
control and harmonic analysis. In Systems, approximation, singular integral
operators, and related topics (Bordeaux, 2000), volume 129 of Oper. Theory
Adv. Appl., pages 393–423. Birkhäuser, Basel, 2001.
[NTV08] F. Nazarov, S. Treil, and A. Volberg. Two weight inequalities for
individual Haar multipliers and other well localized operators. Math. Res.
Lett., 15(3):583–597, 2008.
[Pet00] S. Petermichl. Dyadic shifts and a logarithmic estimate for Hankel
operators with matrix symbol. C. R. Acad. Sci. Paris Sér. I Math.,
330(6):455–460, 2000.
[Pet07] S. Petermichl. The sharp bound for the Hilbert transform on weighted
Lebesgue spaces in terms of the classical Ap characteristic. Amer. J.
Math., 129(5):1355–1375, 2007.
[Pet08] S. Petermichl. The sharp weighted bound for the Riesz transforms. Proc.
Amer. Math. Soc., 136(4):1237–1249, 2008.
[PTV02] S. Petermichl, S. Treil, and A. Volberg. Why the Riesz transforms are
averages of the dyadic shifts? In Proceedings of the 6th International
Conference on Harmonic Analysis and Partial Differential Equations (El
Escorial, 2000), number Vol. Extra, pages 209–228, 2002.
[PTV10] C. Pérez, S. Treil, and A. Volberg. On the A2 conjecture and Corona
decomposition of weights. preprint arXiv:1006.2630, 2010.
[PV02] S. Petermichl and A. Volberg. Heating of the Ahlfors-Beurling operator:
weakly quasiregular maps on the plane are quasiregular. Duke Math. J.,
112(2):281–305, 2002.
[RdF84] J. Rubio de Francia. Factorization theory and Ap weights. Amer. J.
Math., 106(3):533–547, 1984.
[Saw82] E. Sawyer. A characterization of a two-weight norm inequality for
maximal operators. Studia Math., 75(1):1–11, 1982.
[Saw88] E. Sawyer. A characterization of two weight norm inequalities for
fractional and Poisson integrals. Trans. Amer. Math. Soc., 308(2):533–545,
1988.
[Tre12] S. Treil. A remark on two weight inequalities for positive dyadic
operators. preprint arXiv:1201.1455, 2012.
[TV97] S. Treil and A. Volberg. Wavelets and the angle between past and future.
J. Funct. Anal., 143(2):269–308, 1997.
[Vol97] A. Volberg. Matrix Ap weights via S-functions. J. Amer. Math. Soc.,
10(2):445–466, 1997.
[Wit00] J. Wittwer. A sharp estimate on the norm of the martingale transform.
Math. Res. Lett., 7(1):1–12, 2000.