TY - JOUR
T1 - Memory$^3$: Language Modeling with Explicit Memory
AU - Yang, Hongkang
AU - Lin, Zehao
AU - Wang, Wenjin
AU - Wu, Hao
AU - Li, Zhiyu
AU - Tang, Bo
AU - Wei, Wenqiang
AU - Wang, Jinbo
AU - Tang, Zeyun
AU - Song, Shichao
AU - Xi, Chenyang
AU - Yu, Yu
AU - Chen, Kai
AU - Xiong, Feiyu
AU - Tang, Linpeng
AU - E, Weinan
JO - Journal of Machine Learning
VL - 3
SP - 300
EP - 346
PY - 2024
DA - 2024/09
SN - 3
DO - http://doi.org/10.4208/jml.240708
UR - https://global-sci.org/intro/article_detail/jml/23419.html
KW - Large language model, Explicit memory, Large-scale pretraining, Efficient inference, AI database.
AB - The training and inference of large language models (LLMs) are together a costly process that transports knowledge from raw data to meaningful computation. Inspired by the memory hierarchy of the human brain, we reduce this cost by equipping LLMs with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG). Conceptually, with most of its knowledge externalized to explicit memories, the LLM can enjoy a smaller parameter size, training cost, and inference cost, all proportional to the amount of remaining “abstract knowledge”. As a preliminary proof of concept, we train from scratch a 2.4B LLM, which achieves better performance than much larger LLMs as well as RAG models, and maintains higher decoding speed than RAG. The model is named ${\rm Memory}^3$, since explicit memory is the third form of memory in LLMs after implicit memory (model parameters) and working memory (context key-values). We introduce a memory circuitry theory to support the externalization of knowledge, and present novel techniques including a memory sparsification mechanism that makes storage tractable and a two-stage pretraining scheme that facilitates memory formation.
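The abstract only names the core mechanisms, so the toy sketch below is my own illustration rather than the paper's code. It assumes that an explicit memory is a precomputed key-value cache of a reference chunk, sparsified by keeping a small fraction of token positions, and that at inference the model attends to the retrieved memory alongside its ordinary context (working-memory) key-values; the selection rule, shapes, and function names here are placeholders.

import torch
import torch.nn.functional as F

def sparsify_kv(K, V, keep_ratio=0.2):
    # Keep only the token positions with the largest key norms -- a hypothetical
    # stand-in for the paper's sparsification mechanism. K, V: [seq_len, d].
    n_keep = max(1, int(keep_ratio * K.shape[0]))
    idx = K.norm(dim=-1).topk(n_keep).indices.sort().values
    return K[idx], V[idx]

def attend_with_memory(q, K_ctx, V_ctx, K_mem, V_mem):
    # Single-head attention in which the query attends to both the retrieved
    # explicit-memory key-values and the context key-values. q: [d].
    K = torch.cat([K_mem, K_ctx], dim=0)
    V = torch.cat([V_mem, V_ctx], dim=0)
    scores = (K @ q) / K.shape[-1] ** 0.5
    return F.softmax(scores, dim=0) @ V

# Offline: encode a reference chunk once and store only its sparsified key-values.
d, n_ref, n_ctx = 64, 1024, 16
K_ref, V_ref = torch.randn(n_ref, d), torch.randn(n_ref, d)
K_mem, V_mem = sparsify_kv(K_ref, V_ref)

# Online: the current query attends to memory + context instead of re-reading text.
q = torch.randn(d)
K_ctx, V_ctx = torch.randn(n_ctx, d), torch.randn(n_ctx, d)
out = attend_with_memory(q, K_ctx, V_ctx, K_mem, V_mem)
print(out.shape, f"kept {K_mem.shape[0]}/{n_ref} memory tokens")

The sketch is only meant to mirror the cost structure the abstract describes: a reference chunk is encoded once offline, a small fraction of its key-values is stored, and decoding attends to those key-values directly rather than re-processing retrieved text as RAG would.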