Low-Rank Adaptation of Large Language Models

How does LoRA work?

LoRA freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture.
They use a random Gaussian initialization for \(A\) and zero for \(B\), so \(\Delta W = BA\) is zero at the beginning of training.

Machine Learning Research Flashcards is a collection of flashcards associated with scientific research papers in the field of machine learning. Best used with Anki. Edit MLRF on GitHub.