Volume 3, Issue 2
Implicit Bias in Understanding Deep Learning for Solving PDEs Beyond Ritz-Galerkin Method

CSIAM Trans. Appl. Math., 3 (2022), pp. 299-317.

Published online: 2022-05

Abstract

This paper studies the difference between the Ritz-Galerkin (R-G) method and the deep neural network (DNN) method in solving partial differential equations (PDEs), with the aim of better understanding deep learning. To this end, we consider a particular Poisson problem in which the right-hand side $f$ is available only at $n$ sample points, that is, $f$ is known only at finitely many points. Through both theoretical and numerical studies, we show that the solution of the R-G method converges to a piecewise linear function in the one-dimensional case, and to functions of lower regularity in higher dimensions. In the same setting, DNNs instead learn a relatively smooth solution regardless of the dimension; that is, DNNs are implicitly biased toward functions with more low-frequency components among all functions that fit the equation at the available data points. This bias is explained by the recently studied frequency principle. Beyond the similarity between traditional numerical methods and DNNs from the approximation perspective, our work shows that the implicit bias of the learning process, which traditional numerical methods lack, can help to better understand the characteristics of DNNs.

Mathematics Subject Classification: 35Q68, 65N30, 65N35

BibTeX:

@Article{CSIAM-AM-3-299,
  author  = {Wang, Jihong and Xu, Zhi-Qin John and Zhang, Jiwei and Zhang, Yaoyu},
  title   = {Implicit Bias in Understanding Deep Learning for Solving PDEs Beyond Ritz-Galerkin Method},
  journal = {CSIAM Transactions on Applied Mathematics},
  year    = {2022},
  volume  = {3},
  number  = {2},
  pages   = {299--317},
  issn    = {2708-0579},
  doi     = {10.4208/csiam-am.SO-2020-0006},
  url     = {http://global-sci.org/intro/article_detail/csiam-am/20539.html}
}
Keywords: Deep learning, Ritz-Galerkin method, partial differential equations, F-Principle
Jihong Wang, Zhi-Qin John Xu, Jiwei Zhang & Yaoyu Zhang. (2022). Implicit Bias in Understanding Deep Learning for Solving PDEs Beyond Ritz-Galerkin Method. CSIAM Transactions on Applied Mathematics. 3 (2). 299-317. doi:10.4208/csiam-am.SO-2020-0006