Volume 3, Issue 3
Memory$^3$: Language Modeling with Explicit Memory

Hongkang Yang, Zehao Lin, Wenjin Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, Jinbo Wang, Zeyun Tang, Shichao Song, Chenyang Xi, Yu Yu, Kai Chen, Feiyu Xiong, Linpeng Tang & Weinan E

J. Mach. Learn., 3 (2024), pp. 300-346.

Published online: 2024-09

[An open-access article; the PDF is free to any online user.]

  • Abstract

The training and inference of large language models (LLMs) are together a costly process that transports knowledge from raw data to meaningful computation. Inspired by the memory hierarchy of the human brain, we reduce this cost by equipping LLMs with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG). Conceptually, with most of its knowledge externalized to explicit memories, the LLM can enjoy a smaller parameter size, training cost, and inference cost, all proportional to the amount of remaining “abstract knowledge”. As a preliminary proof of concept, we train from scratch a 2.4 B LLM, which achieves better performance than much larger LLMs as well as RAG models, and maintains higher decoding speed than RAG. The model is named ${\rm Memory}^3$, since explicit memory is the third form of memory in LLMs after implicit memory (model parameters) and working memory (context key-values). We introduce a memory circuitry theory to support the externalization of knowledge, and present novel techniques including a memory sparsification mechanism that makes storage tractable and a two-stage pretraining scheme that facilitates memory formation.
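The central idea described in the abstract — retrieving explicit memories and letting the model attend over them alongside its ordinary context — can be illustrated with a minimal sketch. This is a simplified, single-head illustration under our own assumptions (function and variable names are ours, and there is no masking, sparsification, or multi-head structure), not the paper's actual mechanism:

```python
import numpy as np

def attention_with_explicit_memory(q, k_ctx, v_ctx, k_mem, v_mem):
    """Attend over retrieved memory key-values prepended to the context.

    q:              queries,            shape (t, d)
    k_ctx, v_ctx:   working-memory KVs, shape (n, d)
    k_mem, v_mem:   explicit-memory KVs retrieved from an external store,
                    shape (m, d)
    """
    # Prepend the retrieved memories so the context tokens can attend to them.
    k = np.concatenate([k_mem, k_ctx], axis=0)   # (m + n, d)
    v = np.concatenate([v_mem, v_ctx], axis=0)   # (m + n, d)

    # Standard scaled dot-product attention over the combined sequence.
    scores = q @ k.T / np.sqrt(q.shape[-1])       # (t, m + n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # (t, d)
```

Because the memory KVs are precomputed and fetched rather than stored in the weights, knowledge can live outside the parameters, which is the cost argument the abstract makes.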

  • Copyright

COPYRIGHT: © Global Science Press

  • BibTeX
  • RIS
  • TXT
@Article{JML-3-300,
  author  = {Yang, Hongkang and Lin, Zehao and Wang, Wenjin and Wu, Hao and Li, Zhiyu and Tang, Bo and Wei, Wenqiang and Wang, Jinbo and Tang, Zeyun and Song, Shichao and Xi, Chenyang and Yu, Yu and Chen, Kai and Xiong, Feiyu and Tang, Linpeng and E, Weinan},
  title   = {Memory$^3$: Language Modeling with Explicit Memory},
  journal = {Journal of Machine Learning},
  year    = {2024},
  volume  = {3},
  number  = {3},
  pages   = {300--346},
  issn    = {2790-2048},
  doi     = {10.4208/jml.240708},
  url     = {http://global-sci.org/intro/article_detail/jml/23419.html}
}
TY - JOUR
T1 - Memory$^3$: Language Modeling with Explicit Memory
AU - Yang, Hongkang
AU - Lin, Zehao
AU - Wang, Wenjin
AU - Wu, Hao
AU - Li, Zhiyu
AU - Tang, Bo
AU - Wei, Wenqiang
AU - Wang, Jinbo
AU - Tang, Zeyun
AU - Song, Shichao
AU - Xi, Chenyang
AU - Yu, Yu
AU - Chen, Kai
AU - Xiong, Feiyu
AU - Tang, Linpeng
AU - E, Weinan
JO - Journal of Machine Learning
VL - 3
IS - 3
SP - 300
EP - 346
PY - 2024
DA - 2024/09
SN - 2790-2048
DO - 10.4208/jml.240708
UR - https://global-sci.org/intro/article_detail/jml/23419.html
KW - Large language model
KW - Explicit memory
KW - Large-scale pretraining
KW - Efficient inference
KW - AI database
ER -

Hongkang Yang, Zehao Lin, Wenjin Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, Jinbo Wang, Zeyun Tang, Shichao Song, Chenyang Xi, Yu Yu, Kai Chen, Feiyu Xiong, Linpeng Tang & Weinan E. (2024). Memory$^3$: Language Modeling with Explicit Memory. Journal of Machine Learning. 3 (3). 300-346. doi:10.4208/jml.240708