Definition

Multi-Head Attention

Multi-Head Attention is an attention mechanism in Transformers that projects the query, key, and value vectors into multiple lower-dimensional subspaces (heads) and runs attention in each one in parallel, allowing the model to attend to information from different representation subspaces simultaneously.
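
The definition above can be written compactly in the standard scaled dot-product formulation. The symbols here are notational assumptions that do not appear in the text: h is the number of heads, d_k is the per-head key dimension, and W_i^Q, W_i^K, W_i^V, W^O are learned projection matrices.

```latex
\begin{aligned}
\mathrm{Attention}(Q, K, V) &= \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V \\
\mathrm{head}_i &= \mathrm{Attention}\!\left(Q W_i^{Q},\; K W_i^{K},\; V W_i^{V}\right), \quad i = 1, \dots, h \\
\mathrm{MultiHead}(Q, K, V) &= \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^{O}
\end{aligned}
```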

Frequently Asked Questions

Why use multiple attention heads instead of one?

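A single attention head produces only one set of attention weights per token, so it has to blend every kind of relationship into a single weighted average. With multiple heads, the model can attend to different tokens and relationships at the same time, for example one head tracking grammatical structure while another resolves what a pronoun refers to.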

What is the output of the multi-head attention layer?

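The outputs of the individual attention heads are concatenated and then linearly projected back to the original embedding size, so the layer returns one vector per input token with the same dimension as its input.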
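
To make the split, attend, concatenate, and project steps concrete, here is a minimal NumPy sketch of self-attention with several heads. It is an illustration under stated assumptions rather than a reference implementation: the function name `multi_head_attention`, the random weight matrices, and the sizes (4 tokens, embedding size 8, 2 heads) are all hypothetical, and masking, biases, and batching are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over x of shape (seq_len, d_model). Hypothetical helper."""
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must divide evenly across heads"
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                  # (heads, seq, d_head)

    # Concatenate the heads, then project back to the embedding size.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o, weights

# Assumed toy sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 4, 8, 2
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

out, weights = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(out.shape)      # (4, 8): same shape as the input
print(weights.shape)  # (2, 4, 4): one attention map per head
```

Each head gets its own attention map in `weights`, which is what lets different heads focus on different tokens, while the final projection by `w_o` mixes the concatenated head outputs back into a single embedding-sized vector per token.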

Quick Facts

Category: Neural Architectures
Key Application: Transformer block operations, sequence correlation mapping, and LLM design.

Related AI Terms

Attention Mechanism, Grouped-Query Attention
