The Truth About qmd, as Hyped by Fake AI Bloggers in China

437 2026-03-24 19:26 2026-03-24 19:27

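According to the search results, the article does cover token savings, though it never uses the blunt phrase "saves tokens"; it frames the point as reducing token waste.

Among the search results, one article on AI Agent token costs explains in detail how QMD (Query Markup Documents) helps save tokens:

Relevant passages from the article

The article points out that AI Agents waste large numbers of tokens when they rely on traditional search tools such as grep: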

"When an agent needs information from your codebase or document library, it typically does the equivalent of a project-wide Ctrl+F. Every matching line is returned — unranked, unfiltered, and unprioritized."

"Irrelevant matches pile into the LLM's context window, forcing the model to read and process thousands of tokens it didn't actually need."

QMD's answer to this is:

"QMD (Query Markup Documents), built by Shopify founder Tobi Lütke, adds a third stage: LLM re-ranking. After BM25 and vector search each return candidates, a local language model re-reads the top results and reorders them by actual relevance to your query."

The core token-saving mechanism:

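Traditional approach	QMD approach
grep returns every matching line, with no ranking or filtering	Hybrid search (BM25 + vector) plus LLM re-ranking
Irrelevant content floods the context window	Only the most relevant results are returned
The model has to process thousands of tokens it did not need	Precise retrieval with less context bloat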
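
To make the left-hand column of the comparison concrete, here is a minimal sketch of a grep-style scan. The `FILES` dictionary, the function names, and the word-count token estimate are all assumptions made for illustration; real tokenizers count differently, and this is not how any particular agent implements search.

```python
def grep_lines(files: dict[str, str], needle: str) -> list[str]:
    """Return every line containing `needle`, in file order, with no ranking."""
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle.lower() in line.lower():
                hits.append(f"{path}:{lineno}: {line.strip()}")
    return hits


def rough_token_count(lines: list[str]) -> int:
    """Crude estimate: whitespace-separated words stand in for tokens."""
    return sum(len(line.split()) for line in lines)


# Hypothetical mini "codebase", used only for illustration.
FILES = {
    "auth.py": "def login(user):\n    # check the user token\n    return issue_token(user)\n",
    "billing.py": "# token bucket rate limiter\nTOKEN_LIMIT = 100\n",
    "README.md": "Set API_TOKEN before running.\nSee docs for details.\n",
}

hits = grep_lines(FILES, "token")
context_tokens = rough_token_count(hits)

for hit in hits:
    print(hit)
print("approximate tokens sent to the model:", context_tokens)
```

Every line that mentions the search term is returned, whether it concerns authentication, rate limiting, or setup, and all of it lands in the context window. The cost grows with the number of matches rather than with the number of lines that actually answer the question.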

The article concludes:

"The token costs that come with always-on AI agents aren't inevitable. This guide covered two areas where better tooling can cut the waste: search and memory."

"index1 and QMD solve this on a single machine by combining BM25 keyword scoring with vector search and returning only the most relevant results."

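So although the README itself never says "saves tokens" directly, the third-party technical analysis explicitly positions QMD as a tool that reduces token waste through better retrieval, which in essence means helping users cut token costs.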
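
The three stages the quoted article describes (BM25 keyword scoring, vector search, then re-ranking) can be sketched roughly as follows. This is a toy under stated assumptions, not QMD's actual code. The bag-of-words cosine stands in for a real embedding model, reciprocal rank fusion is one common way to combine two rankings and is assumed here because the source does not say how the scores are merged, and `overlap_reranker` is a hypothetical placeholder for the local language model.

```python
import math
from collections import Counter
from typing import Callable


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with the Okapi BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for t in tokenized if term in t)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def cosine_scores(query: str, docs: list[str]) -> list[float]:
    """Toy vector search: cosine similarity over word counts (assumed stand-in for embeddings)."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    scores = []
    for doc in docs:
        d = Counter(tokenize(doc))
        d_norm = math.sqrt(sum(v * v for v in d.values()))
        dot = sum(q[t] * d[t] for t in q)
        scores.append(dot / (q_norm * d_norm) if q_norm and d_norm else 0.0)
    return scores


def fuse_ranks(score_lists: list[list[float]], rrf_k: int = 60) -> list[float]:
    """Reciprocal rank fusion: add 1 / (rrf_k + rank) for each ranking a document appears in."""
    n = len(score_lists[0])
    fused = [0.0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i], reverse=True)
        for rank, idx in enumerate(order, start=1):
            fused[idx] += 1.0 / (rrf_k + rank)
    return fused


def overlap_reranker(query: str, doc: str) -> float:
    """Hypothetical stand-in for a local re-ranking model: share of query words found in the doc."""
    q = set(tokenize(query))
    return len(q & set(tokenize(doc))) / len(q) if q else 0.0


def search(query: str, docs: list[str], candidates: int = 3, top_k: int = 1,
           reranker: Callable[[str, str], float] = overlap_reranker) -> list[str]:
    """Stages 1 and 2: fuse BM25 and vector rankings. Stage 3: re-rank the shortlist, keep top_k."""
    fused = fuse_ranks([bm25_scores(query, docs), cosine_scores(query, docs)])
    shortlist = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)[:candidates]
    reranked = sorted(shortlist, key=lambda i: reranker(query, docs[i]), reverse=True)
    return [docs[i] for i in reranked[:top_k]]


# Hypothetical document set, used only for illustration.
DOCS = [
    "token bucket rate limiter for the billing service",
    "how to rotate an api token safely",
    "release notes for the spring update",
    "api gateway timeout settings",
]

result = search("rotate api token", DOCS, top_k=1)
print(result)
```

Only the `top_k` documents that survive re-ranking are passed on to the model, which is where the saving comes from: the context holds a few ranked passages instead of every line that happened to match.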
