Analysis of large language model architecture evolution
This paper systematically reviews and analyzes the major directions of innovation built on the Transformer architecture. It examines the evolution of large language model architectures along three dimensions: innovation within the Transformer architecture itself, hybrid innovation that fuses the Transformer with other architectures, and innovation in non-Transformer architectures. It also provides an outlook on future development directions for foundation models.
large model architecture; Transformer; attention mechanism; architectural innovation