zxpmail

Product Engineer building ReqForge · AI infra · ex-X

Joined on Jun 6, 2026 https://github.com/zxpmail/ReqForge

zxpmail

Jun 28

KV Cache Is Eating Your VRAM — Here's How to Estimate It Before You Run Out

#llm #inference #engineering #ai

6 min read

zxpmail

Jun 28

I Benchmarked Speculative Decoding — a = 3.5 Wasn't Enough

#llm #inference #engineering #ai

7 min read

zxpmail

Jun 28

Lossless, But Not Free — When Speculative Decoding Actually Pays Off (and When It Doesn't)

#ai #llm #inference #engineering

6 min read

zxpmail

Jun 28

The Fourth Layer of Agent-Native

#ai #architecture #agents #webdev

7 min read

zxpmail

Jun 28

Don't Compress, Promote

#ai #webdev #productivity #architecture

4 min read

zxpmail

Jun 28

A Design Document vs a Design Chain

#ai #designtokens #webdev #ux

5 min read

zxpmail

Jun 21

Motif Learning Protocol: Prompt Engineering for Knowledge That Actually Sticks

#ai #promptengineering #learning #opensource

3 min read

zxpmail

Jun 14

We Built a 'Grovel Index' to Measure LLM Sycophancy — Here's What We Found

#ai #llm #promptengineering #sycophancy

5 min read

zxpmail

Jun 7

Smarter Resource Allocation Beats Stronger Models

#ai #claude #coding #llm

6 min read

zxpmail

Jun 6

From Shackles to Anchors: How I Resurrected an Abandoned Open-Source Framework

#showdev #ai #devchallenge #opensource

5 min read

zxpmail

Jun 6

Less Is More: Why 3 Code Examples Beat 10 Rules for LLM Code Generation

#ai #llm #programming #softwaredevelopment

4 min read

