D. L. Donoho, M. Vetterli, R. A. DeVore, and I. Daubechies: “Data compression and harmonic analysis,” IEEE Trans. Inf. Theory 44:6 (1998), pp. 2435–2476. MR 1658775. Zbl 1125.94311.
In this paper we review some recent interactions between harmonic analysis and data compression. The story goes back of course to Shannon’s \( R(D) \) theory in the case of Gaussian stationary processes, which says that transforming into a Fourier basis followed by block coding gives an optimal lossy compression technique; practical developments like transform-based image compression have been inspired by this result. In this paper we also discuss connections perhaps less familiar to the information theory community, growing out of the field of harmonic analysis. Recent harmonic analysis constructions, such as wavelet transforms and Gabor transforms, are essentially optimal transforms for transform coding in certain settings. Some of these transforms are under consideration for future compression standards.
We discuss some of the lessons of harmonic analysis in this century. Typically, the problems and achievements of this field have involved goals that were not obviously related to practical data compression, and have used a language not immediately accessible to outsiders. Nevertheless, through an extensive generalization of what Shannon called the “sampling theorem”, harmonic analysis has succeeded in developing new forms of functional representation which turn out to have significant data compression interpretations. We explain why harmonic analysis has interacted with data compression, and we describe some interesting recent ideas in the field that may affect data compression in the future.
@article {key1658775m,
AUTHOR = {Donoho, David L. and Vetterli, Martin
and DeVore, R. A. and Daubechies, Ingrid},
TITLE = {Data compression and harmonic analysis},
JOURNAL = {IEEE Trans. Inf. Theory},
FJOURNAL = {IEEE Transactions on Information Theory},
VOLUME = {44},
NUMBER = {6},
YEAR = {1998},
PAGES = {2435--2476},
DOI = {10.1109/18.720544},
NOTE = {MR:1658775. Zbl:1125.94311.},
ISSN = {0018-9448},
}
A. Cohen, W. Dahmen, I. Daubechies, and R. DeVore: “Tree approximation and optimal encoding,” Appl. Comput. Harmon. Anal. 11:2 (2001), pp. 192–226. MR 1848303. Zbl 0992.65151.
@article {key1848303m,
AUTHOR = {Cohen, Albert and Dahmen, Wolfgang and
Daubechies, Ingrid and DeVore, Ronald},
TITLE = {Tree approximation and optimal encoding},
JOURNAL = {Appl. Comput. Harmon. Anal.},
FJOURNAL = {Applied and Computational Harmonic Analysis.
Time-Frequency and Time-Scale Analysis,
Wavelets, Numerical Algorithms, and
Applications},
VOLUME = {11},
NUMBER = {2},
YEAR = {2001},
PAGES = {192--226},
DOI = {10.1006/acha.2001.0336},
NOTE = {MR:1848303. Zbl:0992.65151.},
ISSN = {1063-5203},
}
I. Daubechies, R. DeVore, C. S. Güntürk, and V. A. Vaishampayan: “Beta expansions: A new approach to digitally correct A/D conversion,” pp. 784–787 in 2002 IEEE international symposium on circuits and systems: Proceedings (Phoenix-Scottsdale, AZ, 26–29 May 2002), vol. 2. IEEE (Piscataway, NJ), 2002.
We introduce a new architecture for pipelined (and also algorithmic) A/D converters that give exponentially accurate conversion using inaccurate comparators. An error analysis of a sigma-delta converter with an imperfect comparator and a constant input reveals a self-correction property that is not inherited by the successive refinement quantization algorithm that underlies both pipelined multistage A/D converters and algorithmic A/D converters. Motivated by this example, we introduce a new A/D converter, the beta converter, which has the same self-correction property as a sigma-delta converter but which exhibits higher order (exponential) accuracy with respect to the bit rate as compared to a sigma-delta converter, which exhibits only polynomial accuracy.
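The greedy beta encoder underlying this construction is short enough to state in full. Below is a minimal sketch (an illustrative recursion, not the authors’ converter architecture; the function names and the choice \( \beta = 1.8 \) are ours): any comparator threshold in \( [1, 1/(\beta-1)] \) still yields a valid expansion, which is the self-correction property the abstract refers to.

def beta_encode(x, beta=1.8, n_bits=32, threshold=1.0):
    # Greedy expansion of x in [0, 1) in base 1 < beta < 2; any threshold
    # in [1, 1/(beta - 1)] keeps the state u bounded, so a biased
    # comparator still produces a valid expansion (self-correction).
    bits, u = [], x
    for _ in range(n_bits):
        u *= beta
        b = 1 if u >= threshold else 0
        bits.append(b)
        u -= b
    return bits

def beta_decode(bits, beta=1.8):
    # x ~ sum_n b_n beta^(-n): the error decays like beta^(-n_bits),
    # i.e., exponentially in the bit budget.
    return sum(b * beta ** -(n + 1) for n, b in enumerate(bits))

x = 0.314159
bits = beta_encode(x, threshold=1.1)      # deliberately biased comparator
print(abs(x - beta_decode(bits)))         # still of order beta**-32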
@incollection {key32537809,
AUTHOR = {Daubechies, I. and DeVore, R. and G\"unt\"urk,
C. S. and Vaishampayan, V. A.},
TITLE = {Beta expansions: {A} new approach to
digitally correct {A}/{D} conversion},
BOOKTITLE = {2002 {IEEE} international symposium
on circuits and systems: {P}roceedings},
VOLUME = {2},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2002},
PAGES = {784--787},
DOI = {10.1109/ISCAS.2002.1011470},
NOTE = {(Phoenix-Scottsdale, AZ, 26--29 May
              2002).},
ISBN = {9780780374485},
}
A. Cohen, W. Dahmen, I. Daubechies, and R. DeVore: “Harmonic analysis of the space BV,” Rev. Mat. Iberoamericana 19:1 (2003), pp. 235–263. MR 1993422. Zbl 1044.42028.
We establish new results on the space BV of functions with bounded variation. While it is well known that this space admits no unconditional basis, we show that it is “almost” characterized by wavelet expansions in the following sense: if a function \( f \) is in BV, its coefficient sequence in a BV normalized wavelet basis satisfies a class of weak-\( \ell^1 \) type estimates. These weak estimates can be employed to prove many interesting results. We use them to identify the interpolation spaces between BV and Sobolev or Besov spaces, and to derive new Gagliardo–Nirenberg-type inequalities.
@article {key1993422m,
AUTHOR = {Cohen, Albert and Dahmen, Wolfgang and
Daubechies, Ingrid and DeVore, Ronald},
TITLE = {Harmonic analysis of the space {BV}},
JOURNAL = {Rev. Mat. Iberoamericana},
FJOURNAL = {Revista Matem\'atica Iberoamericana},
VOLUME = {19},
NUMBER = {1},
YEAR = {2003},
PAGES = {235--263},
DOI = {10.4171/RMI/345},
NOTE = {MR:1993422. Zbl:1044.42028.},
ISSN = {0213-2230},
}
I. Daubechies and R. DeVore: “Approximating a bandlimited function using very coarsely quantized data: A family of stable sigma-delta modulators of arbitrary order,” Ann. Math. (2) 158:2 (2003), pp. 679–710. MR 2018933. Zbl 1058.94004.
@article {key2018933m,
AUTHOR = {Daubechies, Ingrid and DeVore, Ron},
TITLE = {Approximating a bandlimited function
using very coarsely quantized data:
{A} family of stable sigma-delta modulators
of arbitrary order},
JOURNAL = {Ann. Math. (2)},
FJOURNAL = {Annals of Mathematics. Second Series},
VOLUME = {158},
NUMBER = {2},
YEAR = {2003},
PAGES = {679--710},
DOI = {10.4007/annals.2003.158.679},
NOTE = {MR:2018933. Zbl:1058.94004.},
ISSN = {0003-486X},
}
I. Daubechies, R. A. DeVore, C. S. Güntürk, and V. A. Vaishampayan: “A/D conversion with imperfect quantizers,” IEEE Trans. Inform. Theory 52:3 (2006), pp. 874–885. MR 2238058. Zbl 1231.94036.
This paper analyzes mathematically the effect of quantizer threshold imperfection commonly encountered in the circuit implementation of analog-to-digital (A/D) converters such as pulse code modulation (PCM) and sigma–delta (\( \Sigma\Delta \)) modulation. \( \Sigma\Delta \) modulation, which is based on coarse quantization of oversampled (redundant) samples of a signal, enjoys a type of self-correction property for quantizer threshold errors (bias) that is not shared by PCM. Although “classical” \( \Sigma\Delta \) modulation is inferior to PCM in the rate-distortion sense, this robustness feature is believed to be one of the reasons why \( \Sigma\Delta \) modulation is preferred over PCM in A/D converters with imperfect quantizers. Motivated by these facts, other encoders are constructed in this paper that use redundancy to obtain a similar self-correction property, but that achieve higher order accuracy relative to bit rate compared to classical \( \Sigma\Delta \). More precisely, two different types of encoders are introduced that exhibit exponential accuracy in the bit rate (in contrast to the polynomial-type accuracy of classical \( \Sigma\Delta \)) while possessing the self-correction property.
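For intuition about the self-correction property, a first-order \( \Sigma\Delta \) loop fits in a few lines. This is an illustrative sketch only (a textbook first-order modulator with a deliberately biased comparator, not one of the encoders constructed in the paper): the feedback keeps the internal state bounded, so even crude low-pass decoding of the bits still tracks the oversampled input.

import numpy as np

def sigma_delta(x, bias=0.1):
    # First-order sigma-delta: u_{n+1} = u_n + x_n - q(u_n), where q is a
    # one-bit quantizer whose threshold is off by `bias`. The integrator
    # feedback keeps u bounded despite the bias.
    u, bits = 0.0, []
    for xn in x:
        b = 1.0 if u >= bias else -1.0
        u += xn - b
        bits.append(b)
    return np.array(bits)

t = np.linspace(0.0, 1.0, 4000)           # heavily oversampled time grid
x = 0.5 * np.sin(2 * np.pi * t)
bits = sigma_delta(x, bias=0.1)
# Crude decoding by a moving average; the reconstruction error shrinks
# as the oversampling rate grows, bias notwithstanding.
avg = np.convolve(bits, np.ones(100) / 100, mode="same")
print(np.max(np.abs(avg[200:-200] - x[200:-200])))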
@article {key2238058m,
AUTHOR = {Daubechies, Ingrid and DeVore, Ronald
A. and G\"unt\"urk, C. Sinan and Vaishampayan,
Vinay A.},
TITLE = {A/{D} conversion with imperfect quantizers},
JOURNAL = {IEEE Trans. Inform. Theory},
FJOURNAL = {Institute of Electrical and Electronics
Engineers. Transactions on Information
Theory},
VOLUME = {52},
NUMBER = {3},
YEAR = {2006},
PAGES = {874--885},
DOI = {10.1109/TIT.2005.864430},
NOTE = {MR:2238058. Zbl:1231.94036.},
ISSN = {0018-9448},
}
I. Daubechies, R. DeVore, M. Fornasier, and S. Gunturk: “Iteratively re-weighted least squares minimization: Proof of faster than linear rate for sparse recovery,” pp. 26–29 in 2008 42nd annual conference on information sciences and systems (Princeton, NJ, 19–21 March 2008). IEEE (Piscataway, NJ), 2008.
Given an \( m{\times}N \) matrix \( \Phi \), with \( m < N \), the system of equations \( \Phi x = y \) is typically underdetermined and has infinitely many solutions. Various forms of optimization can extract a “best” solution. One of the oldest is to select the one with minimal \( \ell_2 \) norm. It has been shown that in many applications a better choice is the minimal \( \ell_1 \) norm solution. This is the case in Compressive Sensing, when sparse solutions are sought. The minimal \( \ell_1 \) norm solution can be found by using linear programming; an alternative method is Iterative Re-weighted Least Squares (IRLS), which in some cases is numerically faster. The main step of IRLS finds, for a given weight \( w \), the solution with smallest \( \ell_2(w) \) norm; this weight is updated at every iteration step: if \( x^{(n)} \) is the solution at step \( n \), then \( w^{(n)} \) is defined by
\[ w_i^{(n)} := \frac1{|x_i^{(n)}|},\quad i = 1,\dots,N .\]
We give a specific recipe for updating weights that avoids technical shortcomings in other approaches, and for which we can prove convergence under certain conditions on the matrix \( \Phi \) known as the Restricted Isometry Property. We also show that if there is a sparse solution, then the limit of the proposed algorithm is that sparse solution. It is also shown that whenever the solution at a given iteration is sufficiently close to the limit, then the remaining steps of the algorithm converge exponentially fast. In the standard version of the algorithm, designed to emulate \( \ell_1 \) -minimization, the exponential rate is linear; in adapted versions aimed at \( \ell_{\tau} \) -minimization with \( \tau < 1 \) , we prove faster than linear rate.
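A step the abstract leaves implicit: the weighted least squares subproblem has a closed-form solution, which is what makes each IRLS iteration cheap. Writing \( W_n := \operatorname{diag}(w^{(n)}) \) (a standard identity, easily checked with a Lagrange multiplier argument):

\[ x^{(n+1)} = \operatorname*{arg\,min}_{\Phi x = y} \sum_{i=1}^N w_i^{(n)}|x_i|^2 = W_n^{-1}\Phi^T\bigl(\Phi W_n^{-1}\Phi^T\bigr)^{-1}y .\]

Note that the weight \( w_i^{(n)} = 1/|x_i^{(n)}| \) is undefined when \( x_i^{(n)} = 0 \); the regularized update in the journal version below removes this difficulty.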
@incollection {key16287379,
AUTHOR = {Daubechies, Ingrid and DeVore, Ronald
and Fornasier, Massimo and Gunturk,
Sinan},
TITLE = {Iteratively re-weighted least squares
minimization: {P}roof of faster than
linear rate for sparse recovery},
BOOKTITLE = {2008 42nd annual conference on information
sciences and systems},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2008},
PAGES = {26--29},
DOI = {10.1109/CISS.2008.4558489},
NOTE = {(Princeton, NJ, 19--21 March 2008).},
ISBN = {9781509080533},
}
I. Daubechies, R. DeVore, M. Fornasier, and C. S. Güntürk: “Iteratively reweighted least squares minimization for sparse recovery,” Comm. Pure Appl. Math. 63:1 (2010), pp. 1–38. MR 2588385. Zbl 1202.65046. ArXiv 0807.0575.
Under certain conditions (known as the restricted isometry property, or RIP) on the \( m{\times}N \) matrix \( \Phi \) (where \( m < N \)), vectors \( x\in\mathbb{R}^N \) that are sparse (i.e., have most of their entries equal to 0) can be recovered exactly from \( y := \Phi x \) even though \( \Phi^{-1}(y) \) is typically an \( (N-m) \)-dimensional hyperplane; in addition, \( x \) is then equal to the element in \( \Phi^{-1}(y) \) of minimal \( \ell_1 \)-norm. This minimal element can be identified via linear programming algorithms. We study an alternative method of determining \( x \), as the limit of an iteratively reweighted least squares (IRLS) algorithm. The main step of this IRLS finds, for a given weight vector \( w \), the element in \( \Phi^{-1}(y) \) with smallest \( \ell_2(w) \)-norm. If \( x^{(n)} \) is the solution at iteration step \( n \), then the new weight \( w^{(n)} \) is defined by
\[ w_i^{(n)} := \bigl[|x_i^{(n)}|^2 + \epsilon_n^2\bigr]^{-1/2} ,\]
\( i = 1, \dots, N \), for a decreasing sequence of adaptively defined \( \epsilon_n \); this updated weight is then used to obtain \( x^{(n+1)} \) and the process is repeated. We prove that when \( \Phi \) satisfies the RIP conditions, the sequence \( x^{(n)} \) converges for all \( y \), regardless of whether \( \Phi^{-1}(y) \) contains a sparse vector. If there is a sparse vector in \( \Phi^{-1}(y) \), then the limit is this sparse vector, and when \( x^{(n)} \) is sufficiently close to the limit, the remaining steps of the algorithm converge exponentially fast (linear convergence in the terminology of numerical optimization). The same algorithm with the “heavier” weight
\[ w_i^{(n)} := \bigl[|x_i^{(n)}|^2 + \epsilon_n^2\bigr]^{-1 + \tau/2} ,\]
\( i = 1,\dots,N \), where \( 0 < \tau < 1 \), can recover sparse solutions as well; more importantly, we show its local convergence is superlinear and approaches a quadratic rate for \( \tau \) approaching 0.
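A compact sketch of this iteration in code (a toy implementation for illustration: the dimensions, the sparsity guess \( K \), and the iteration count are ours; the \( \epsilon_n \) update follows the paper’s rule based on the nonincreasing rearrangement of \( x^{(n)} \)):

import numpy as np

def irls(Phi, y, K, n_iter=50, tau=1.0):
    # IRLS for Phi x = y; tau = 1 emulates l1-minimization, tau < 1 uses
    # the "heavier" weights with superlinear local convergence.
    N = Phi.shape[1]
    x = np.linalg.lstsq(Phi, y, rcond=None)[0]    # minimal l2-norm start
    eps = 1.0
    for _ in range(n_iter):
        w = (x**2 + eps**2) ** (tau / 2.0 - 1.0)  # the weight defined above
        D = np.diag(1.0 / w)                      # D = diag(1/w)
        # closed-form l2(w) minimizer over the hyperplane Phi x = y
        x = D @ Phi.T @ np.linalg.solve(Phi @ D @ Phi.T, y)
        r = np.sort(np.abs(x))[::-1]              # nonincreasing rearrangement
        eps = min(eps, r[K] / N)                  # adaptive shrinking of eps_n
    return x

rng = np.random.default_rng(0)
m, N, k = 80, 200, 5
Phi = rng.standard_normal((m, N)) / np.sqrt(m)
x_true = np.zeros(N)
x_true[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
x_hat = irls(Phi, Phi @ x_true, K=10)
print(np.linalg.norm(x_hat - x_true))             # near zero: sparse recovery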
@article {key2588385m,
AUTHOR = {Daubechies, Ingrid and DeVore, Ronald
and Fornasier, Massimo and G\"unt\"urk,
C. Sinan},
TITLE = {Iteratively reweighted least squares
minimization for sparse recovery},
JOURNAL = {Comm. Pure Appl. Math.},
FJOURNAL = {Communications on Pure and Applied Mathematics},
VOLUME = {63},
NUMBER = {1},
YEAR = {2010},
PAGES = {1--38},
DOI = {10.1002/cpa.20303},
NOTE = {ArXiv:0807.0575. MR:2588385. Zbl:1202.65046.},
ISSN = {0010-3640},
}
A. Cohen, I. Daubechies, R. DeVore, G. Kerkyacharian, and D. Picard: “Capturing ridge functions in high dimensions from point queries,” Constr. Approx. 35:2 (April 2012), pp. 225–243. MR 2891227. Zbl 1318.62286.
Constructing a good approximation to a function of many variables suffers from the “curse of dimensionality”. Namely, functions on \( \mathbb{R}^N \) with smoothness of order \( s \) can in general be captured with accuracy at most \( O(n^{-s/N}) \) using linear spaces or nonlinear manifolds of dimension \( n \). If \( N \) is large and \( s \) is not, then \( n \) has to be chosen inordinately large for good accuracy. The large value of \( N \) often precludes reasonable numerical procedures. On the other hand, there is the common belief that real-world problems in high dimensions have as their solutions functions that are more amenable to numerical recovery. This has led to the introduction of models for these functions that do not depend on smoothness alone but also involve some form of variable reduction. In these models it is assumed that, although the function depends on \( N \) variables, only a small number of them are significant. Another variant of this principle is that the function lives on a low-dimensional manifold. Since the dominant variables (respectively, the manifold) are unknown, this leads to new problems of how to organize point queries to capture such functions. The present paper studies where to query the values of a ridge function
\[ f(x) = g(a\cdot x) \]
when both \( a\in\mathbb{R}^N \) and \( g\in C[0,1] \) are unknown. We establish estimates on how well \( f \) can be approximated using these point queries under the assumption that \( g\in C^s[0,1] \). We also study the role of sparsity or compressibility of \( a \) in such query problems.
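The first step of such a query scheme is easy to illustrate under the normalization used in the paper (\( a_i\geq 0 \) and \( \sum_i a_i = 1 \), so that the diagonal of the cube traverses the full argument range of \( g \)). A sketch (the dimensions and the test profile are ours; the further queries that recover a compressible \( a \) are omitted):

import numpy as np

# Ridge function f(x) = g(a . x) on [0,1]^N, with a_i >= 0, sum(a_i) = 1.
# Then f(t, t, ..., t) = g(t * sum(a_i)) = g(t): querying f along the
# diagonal samples the unknown profile g directly, independently of a.
N = 1000
rng = np.random.default_rng(1)
a = np.zeros(N)
a[rng.choice(N, 8, replace=False)] = 1.0 / 8.0    # sparse direction, sums to 1
g = lambda t: np.sin(3.0 * t) + t**2              # unknown profile
f = lambda x: g(x @ a)                            # accessible only via queries

ts = np.linspace(0.0, 1.0, 33)                    # 33 point queries
g_samples = np.array([f(t * np.ones(N)) for t in ts])
# g is now known on a grid; for g in C^s, interpolation achieves accuracy
# O(m^{-s}) from m queries. Recovering the sparse a requires additional
# queries near the diagonal (a compressed-sensing step in the paper).
print(np.max(np.abs(g_samples - g(ts))))          # 0: diagonal queries hit g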
@article {key2891227m,
AUTHOR = {Cohen, Albert and Daubechies, Ingrid
and DeVore, Ronald and Kerkyacharian,
Gerard and Picard, Dominique},
TITLE = {Capturing ridge functions in high dimensions
from point queries},
JOURNAL = {Constr. Approx.},
FJOURNAL = {Constructive Approximation. An International
Journal for Approximations and Expansions},
VOLUME = {35},
NUMBER = {2},
MONTH = {April},
YEAR = {2012},
PAGES = {225--243},
DOI = {10.1007/s00365-011-9147-6},
NOTE = {MR:2891227. Zbl:1318.62286.},
ISSN = {0176-4276},
}
I. Daubechies, R. DeVore, S. Foucart, B. Hanin, and G. Petrova: Nonlinear approximation and (deep) ReLU networks. Preprint, May 2019. ArXiv 1905.02199.
This article is concerned with the approximation and expressive powers of deep neural networks. This is an active research area currently producing many interesting papers. The results most commonly found in the literature prove that neural networks approximate functions with classical smoothness to the same accuracy as classical linear methods of approximation, e.g., approximation by polynomials or by piecewise polynomials on prescribed partitions. However, approximation by neural networks depending on \( n \) parameters is a form of nonlinear approximation and as such should be compared with other nonlinear methods such as variable-knot splines or \( n \)-term approximation from dictionaries. The performance of neural networks in targeted applications such as machine learning indicates that they actually possess even greater approximation power than these traditional methods of nonlinear approximation. The main results of this article prove that this is indeed the case. This is done by exhibiting large classes of functions which can be efficiently captured by neural networks where classical nonlinear methods fall short of the task. The present article purposefully limits itself to studying the approximation of univariate functions by ReLU networks. Many generalizations to functions of several variables and other activation functions can be envisioned. However, even in this simplest of settings considered here, a theory that completely quantifies the approximation power of neural networks is still lacking.
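A classical example of the gap between deep and shallow ReLU networks alluded to here (the sawtooth construction popularized by Telgarsky, not the more refined function classes built in the paper): composing a hat function with itself doubles the number of linear pieces at each layer, so depth \( L \) buys \( 2^L \) pieces from \( O(L) \) parameters, while a depth-2 network needs on the order of \( 2^L \) units to reproduce the same function.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    # hat(x) = 2x on [0, 1/2] and 2 - 2x on [1/2, 1]: three ReLU units
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)

x = np.linspace(0.0, 1.0, 100001)
y = x.copy()
L = 5
for _ in range(L):
    y = hat(y)                    # each composition doubles the linear pieces
# y is a sawtooth with 2^5 = 32 linear pieces, realized by ~3*5 ReLU units;
# counting interior local extrema gives 2^5 - 1 = 31 breakpoints.
extrema = np.sum(np.abs(np.diff(np.sign(np.diff(y)))) > 0)
print(extrema)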
@techreport {key1905.02199a,
AUTHOR = {Daubechies, I. and DeVore, R. and Foucart,
S. and Hanin, B. and Petrova, G.},
TITLE = {Nonlinear approximation and (deep) {R}e{LU}
networks},
TYPE = {Preprint},
MONTH = {May},
YEAR = {2019},
PAGES = {42},
NOTE = {ArXiv:1905.02199.},
}