TY  - JOUR
T1  - Better Approximations of High Dimensional Smooth Functions by Deep Neural Networks with Rectified Power Units
AU  - Li, Bo
AU  - Tang, Shanshan
AU  - Yu, Haijun
JO  - Communications in Computational Physics
VL  - 27
IS  - 2
SP  - 379
EP  - 411
PY  - 2019
DA  - 2019/12
DO  - 10.4208/cicp.OA-2019-0168
UR  - https://global-sci.org/intro/article_detail/cicp/13451.html
KW  - Deep neural network, high dimensional approximation, sparse grids, rectified linear unit, rectified power unit, rectified quadratic unit
AB  -
Deep neural networks with rectified linear units (ReLU) are becoming increasingly popular due to their universal representation power and successful applications. Some theoretical progress on the approximation power of deep ReLU networks for functions in Sobolev spaces and Korobov spaces has recently been made by [D. Yarotsky, Neural Networks, 94:103-114, 2017] and [H. Montanelli and Q. Du, SIAM J. Math. Data Sci., 1:78-92, 2019], among others. In this paper, we show that deep networks with rectified power units (RePU) can give better approximations of smooth functions than deep ReLU networks. Our analysis is based on classical polynomial approximation theory and on efficient algorithms, proposed in this paper, that convert polynomials into deep RePU networks of optimal size with no approximation error. By our constructive proofs, the sizes of the RePU networks required to approximate functions in Sobolev and Korobov spaces with an error tolerance $\varepsilon$ are in general $O(\log\frac{1}{\varepsilon})$ times smaller than the sizes of the corresponding ReLU networks constructed in most of the existing literature. Compared to the classical results of Mhaskar [Mhaskar, Adv. Comput. Math., 1:61-80, 1993], our constructions use fewer activation functions and are numerically more stable; they can serve as good initializations of deep RePU networks and be further trained to break the limit of linear approximation theory. The functions represented by RePU networks are smooth, so they naturally fit applications where derivatives appear in the loss function.
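As a minimal illustration of why such exact conversions are possible (these are standard identities for the rectified quadratic unit $\sigma_2(x)=(\max\{0,x\})^2$, not necessarily the paper's exact construction), squares and products of inputs can be written as finite combinations of $\sigma_2$ evaluations with zero error:
$$
x^2 = \sigma_2(x) + \sigma_2(-x), \qquad
xy = \tfrac{1}{4}\Bigl[\bigl(\sigma_2(x+y)+\sigma_2(-x-y)\bigr) - \bigl(\sigma_2(x-y)+\sigma_2(-x+y)\bigr)\Bigr].
$$
Iterating the product identity realizes any monomial, and hence any polynomial, by a RePU network of finite size, whereas ReLU networks can only approximate $x^2$ with a size that grows as the error tolerance shrinks.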