
What Is Tensor Parallelism?

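Tensor parallelism splits the computation inside a single layer across several devices by sharding that layer's large weight matrices, so each device stores and multiplies only its own slice. The partial results are then combined through communication between the devices, typically an all-gather or an all-reduce. It is a form of model parallelism, used both to fit layers whose weights exceed one device's memory and to speed up the matrix multiplications within them.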
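To make the idea concrete, here is a minimal sketch in plain Python that simulates a column-wise split of one weight matrix. It is an illustration under stated assumptions, not how any particular framework implements it: the "devices" are ordinary Python lists, the helper names (matmul, split_columns, column_parallel_matmul) are hypothetical, and the final concatenation stands in for the all-gather that real hardware would perform.

```python
# Illustrative simulation only: "devices" are plain Python lists, not real accelerators.

def matmul(x, w):
    """Multiply x (n x k) by w (k x m); both are lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, num_devices):
    """Shard w column-wise into num_devices equal slices (hypothetical helper)."""
    cols = len(w[0])
    assert cols % num_devices == 0, "columns must divide evenly across devices"
    width = cols // num_devices
    return [[row[d * width:(d + 1) * width] for row in w] for d in range(num_devices)]


def column_parallel_matmul(x, w, num_devices):
    """Compute x @ w by giving each simulated device one column slice of w."""
    shards = split_columns(w, num_devices)
    # Each device multiplies the full input by its own slice of the weights.
    partials = [matmul(x, shard) for shard in shards]
    # Stand-in for an all-gather: concatenate each device's output columns.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
# The sharded result matches the unsharded product.
assert column_parallel_matmul(x, w, 2) == matmul(x, w)
```

Splitting by columns means every device needs the full input but produces a disjoint block of output columns, so combining them is a concatenation. Splitting by rows instead would give each device a slice of the input and produce full-size partial outputs that have to be summed, which corresponds to an all-reduce.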
