Volume 5, Issue 2
Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks

Zhiwei Bai, Tao Luo, Zhi-Qin John Xu & Yaoyu Zhang

CSIAM Trans. Appl. Math., 5 (2024), pp. 350-389.

Published online: 2024-05

Export citation
  • Abstract

In this work, we delve into the relationship between deep and shallow neural networks (NNs), focusing on the critical points of their loss landscapes. We discover an embedding principle in depth: the loss landscape of an NN "contains" all critical points of the loss landscapes of shallower NNs. The key tool for our discovery is the critical lifting, which maps any critical point of a network to critical manifolds of any deeper network while preserving the outputs. To investigate the practical implications of this principle, we conduct a series of numerical experiments. The results confirm that deep networks do encounter these lifted critical points during training, leading to similar training dynamics across varying network depths. We provide theoretical and empirical evidence that, through the lifting operation, the lifted critical points exhibit increased degeneracy. This principle also provides insights into the optimization benefits of batch normalization and larger datasets, and enables practical applications such as network layer pruning. Overall, our discovery of the embedding principle in depth uncovers the depth-wise hierarchical structure of the deep learning loss landscape, which serves as a solid foundation for further study of the role of depth in DNNs.
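The paper's critical lifting is defined formally in the full text; the core output-preserving intuition can be sketched informally. In the hypothetical example below (not the authors' exact construction), a two-layer ReLU network is "deepened" by inserting an identity-initialized hidden layer between two ReLU activations. Since ReLU is idempotent (relu(relu(z)) = relu(z)), the deeper network computes exactly the same outputs, so the shallow network's parameters embed into the deeper parameter space without changing the function represented.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # first-layer weights of a shallow net
W2 = rng.normal(size=(2, 4))   # output-layer weights
x = rng.normal(size=(3, 5))    # a batch of 5 input vectors

# Shallow network: two weight layers, one ReLU nonlinearity.
shallow_out = W2 @ relu(W1 @ x)

# "Lifted" network: insert an identity-initialized middle layer.
# Because relu(relu(z)) == relu(z), the outputs are unchanged.
W_mid = np.eye(4)
deep_out = W2 @ relu(W_mid @ relu(W1 @ x))

print(np.allclose(shallow_out, deep_out))  # the two networks agree
```

This output preservation is only the starting point; the paper's contribution is showing that such lifted points remain *critical* points of the deeper network's loss and form critical manifolds with increased degeneracy.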

  • AMS Subject Headings

68T07

  • Copyright

COPYRIGHT: © Global Science Press

  • BibTex
  • RIS
  • TXT
@Article{CSIAM-AM-5-350,
  author  = {Bai, Zhiwei and Luo, Tao and Xu, Zhi-Qin John and Zhang, Yaoyu},
  title   = {Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks},
  journal = {CSIAM Transactions on Applied Mathematics},
  year    = {2024},
  volume  = {5},
  number  = {2},
  pages   = {350--389},
  issn    = {2708-0579},
  doi     = {10.4208/csiam-am.SO-2023-0020},
  url     = {http://global-sci.org/intro/article_detail/csiam-am/23125.html}
}
TY  - JOUR
T1  - Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks
AU  - Bai, Zhiwei
AU  - Luo, Tao
AU  - Xu, Zhi-Qin John
AU  - Zhang, Yaoyu
JO  - CSIAM Transactions on Applied Mathematics
VL  - 5
IS  - 2
SP  - 350
EP  - 389
PY  - 2024
DA  - 2024/05
SN  - 2708-0579
DO  - 10.4208/csiam-am.SO-2023-0020
UR  - https://global-sci.org/intro/article_detail/csiam-am/23125.html
KW  - Deep learning, loss landscape, embedding principle
ER  -

Bai, Zhiwei, Luo, Tao, Xu, Zhi-Qin John and Zhang, Yaoyu. (2024). Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks. CSIAM Transactions on Applied Mathematics. 5 (2). 350-389. doi:10.4208/csiam-am.SO-2023-0020