TY - JOUR
T1 - Phase Diagram of Initial Condensation for Two-Layer Neural Networks
AU - Chen, Zheng-An
AU - Li, Yuqing
AU - Luo, Tao
AU - Zhou, Zhangchen
AU - Xu, Zhi-Qin John
JO - CSIAM Transactions on Applied Mathematics
VL - 3
SP - 448
EP - 514
PY - 2024
DA - 2024/08
SN - 5
DO - http://doi.org/10.4208/csiam-am.SO-2023-0016
UR - https://global-sci.org/intro/article_detail/csiam-am/23306.html
KW - Two-layer neural network
KW - phase diagram
KW - dynamical regime
KW - condensation
AB - The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research. In this paper, based on the earlier work [Luo et al., J. Mach. Learn. Res., 22:1–47, 2021], we present a phase diagram of initial condensation for two-layer neural networks. Condensation is a phenomenon wherein the weight vectors of neural networks concentrate on isolated orientations during the training process, and it is a feature of the non-linear learning process that enables neural networks to possess better generalization abilities. Our phase diagram serves to provide a comprehensive understanding of the dynamical regimes of neural networks and their dependence on the choice of hyperparameters related to initialization. Furthermore, we demonstrate in detail the underlying mechanisms by which small initialization leads to condensation at the initial training stage.