What Is Layer Normalization?
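Layer normalization normalizes the activations of a layer across its features, computing the mean and variance separately for each individual training example. Unlike batch normalization, it does not depend on batch statistics, so it behaves the same at any batch size and at both training and inference time. This makes it well suited for recurrent networks and transformers. Keeping activations on a consistent scale helps stabilize training, and the technique is commonly used in modern language models.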
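As a rough illustration of the per-example computation described above, here is a minimal sketch in plain Python. The function name layer_norm and the parameters gamma, beta, and eps are assumptions chosen for this example: gamma and beta stand in for the learnable scale and shift that usually accompany the normalization, and eps is a small constant for numerical stability. It is not the implementation of any particular library.

```python
import math


def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one example's feature vector to roughly zero mean and unit variance.

    x     -- list of floats: the activations of one layer for ONE example
    gamma -- optional per-feature scale (hypothetical name; defaults to all ones)
    beta  -- optional per-feature shift (hypothetical name; defaults to all zeros)
    eps   -- small constant added to the variance to avoid division by zero
    """
    n = len(x)
    mean = sum(x) / n
    # Biased (population) variance over the features of this single example.
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)

    if gamma is None:
        gamma = [1.0] * n
    if beta is None:
        beta = [0.0] * n

    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]


# Each row is one example. The statistics are computed per row, so the
# result for a row does not depend on which other rows share its batch.
batch = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
]
normalized = [layer_norm(row) for row in batch]
```

Because each row is normalized using only its own mean and variance, the two rows above come out nearly identical even though their raw scales differ by a factor of ten, and normalizing a row on its own gives the same result as normalizing it inside the batch.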