Volume 5, Issue 3
Phase Diagram of Initial Condensation for Two-Layer Neural Networks

Zheng-An Chen, Yuqing Li, Tao Luo, Zhangchen Zhou & Zhi-Qin John Xu

CSIAM Trans. Appl. Math., 5 (2024), pp. 448-514.

Published online: 2024-08

  • Abstract

The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research. In this paper, based on the earlier work [Luo et al., J. Mach. Learn. Res., 22:1–47, 2021], we present a phase diagram of initial condensation for two-layer neural networks. Condensation is a phenomenon wherein the weight vectors of neural networks concentrate on isolated orientations during the training process, and it is a feature of the non-linear learning process that enables neural networks to generalize better. Our phase diagram provides a comprehensive understanding of the dynamical regimes of neural networks and their dependence on the choice of hyperparameters related to initialization. Furthermore, we demonstrate in detail the underlying mechanisms by which small initialization leads to condensation at the initial training stage.

  • AMS Subject Headings

68U99, 90C26, 34A45

  • Copyright

COPYRIGHT: © Global Science Press

@Article{CSIAM-AM-5-448,
  author = {Chen, Zheng-An and Li, Yuqing and Luo, Tao and Zhou, Zhangchen and Xu, Zhi-Qin John},
  title = {Phase Diagram of Initial Condensation for Two-Layer Neural Networks},
  journal = {CSIAM Transactions on Applied Mathematics},
  year = {2024},
  volume = {5},
  number = {3},
  pages = {448--514},
  issn = {2708-0579},
  doi = {https://doi.org/10.4208/csiam-am.SO-2023-0016},
  url = {http://global-sci.org/intro/article_detail/csiam-am/23306.html}
}
TY  - JOUR
T1  - Phase Diagram of Initial Condensation for Two-Layer Neural Networks
AU  - Chen, Zheng-An
AU  - Li, Yuqing
AU  - Luo, Tao
AU  - Zhou, Zhangchen
AU  - Xu, Zhi-Qin John
JO  - CSIAM Transactions on Applied Mathematics
VL  - 5
IS  - 3
SP  - 448
EP  - 514
PY  - 2024
DA  - 2024/08
SN  - 2708-0579
DO  - http://doi.org/10.4208/csiam-am.SO-2023-0016
UR  - https://global-sci.org/intro/article_detail/csiam-am/23306.html
KW  - Two-layer neural network, phase diagram, dynamical regime, condensation.
ER  -

Chen, Zheng-An; Li, Yuqing; Luo, Tao; Zhou, Zhangchen and Xu, Zhi-Qin John. (2024). Phase Diagram of Initial Condensation for Two-Layer Neural Networks. CSIAM Transactions on Applied Mathematics. 5 (3). 448-514. doi:10.4208/csiam-am.SO-2023-0016