Language & LLMs
What Is ALiBi?
ALiBi, short for Attention with Linear Biases, encodes position by adding a distance-based penalty to attention scores instead of using explicit position embeddings. Tokens farther apart receive larger penalties. This approach can help models generalize to sequences longer than those seen during training.
Further reading
Read more about ALiBi — articles and blogs from around the web: