DEV Community

# transformers

zeromathai

Jun 29

How Modern Transformer Blocks Work — From RMSNorm to MoE

#ai #machinelearning #llm #transformers

5 min read

zeromathai

Jun 26

Why Positional Embeddings Matter — APE, RPE, and RoPE Explained for Developers

#ai #machinelearning #llm #transformers

5 min read

zengbao yu

Jun 26

🧠 The Direction of AI Development: Have We Hit the Ceiling?

#ai #machinelearning #transformers

1 min read

zeromathai

Jun 25

Why KV Cache Matters — How MQA, GQA, and MLA Make LLM Inference Faster

#ai #machinelearning #llm #transformers

5 min read

zeromathai

Jun 24

Why Attention Becomes the Bottleneck — And How Efficient Attention Fixes It

#ai #machinelearning #llm #transformers

3 min read

zeromathai

Jun 16

How Transformer Architecture Works — Encoder, Decoder, Tokens, and Context

#ai #machinelearning #nlp #transformers

6 min read

aj1thkr1sh

Jun 15

Attention Is All You Need, Building a Transformer for Thanglish-to-Tamil

#ai #transformers #genai #deeplearning

3 min read

Jun 11

Someone Is Taking the Transformer Apart: Memory Caching and CTM Each Took Half

#machinelearning #ai #transformers #deeplearning

3 min read

Jun 10

Flash Attention: what it does and why it matters

#llm #ai #deeplearning #transformers

8 min read

zeromathai

Jun 18

How Self-Attention Works — QKV, Softmax, and Matrix Computation

#ai #machinelearning #nlp #transformers

5 min read

zeromathai

Jun 17

How Attention Actually Works — From Next-Token Prediction to QKV Intuition

#ai #machinelearning #nlp #transformers

3 min read

May 13

MoE Architectures Keep Solving the Wrong Problem

#machinelearning #llm #transformers

3 min read

May 2

Chapter 12: Inference - Generating New Text

#csharp #machinelearning #transformers #tutorial

9 min read

Apr 30

Chapter 11: The Full GPT - Assembling the Model

#csharp #machinelearning #transformers #tutorial

10 min read

Apr 28

Chapter 9: Single-Head Attention - Tokens Looking at Each Other

#csharp #machinelearning #transformers #tutorial

9 min read
