Supporting Markdown Search For LLMs
This post shows you how to support markdown search for your blog, making it token-efficient for LLMs to search through your content. It also includes a lesson on how context windows and markdown work. If you only want the implementation steps, you can skip all that and go straight to “So What Are We Doing?”.
If there’s one thing LLMs struggle with, it’s efficiency. I’m not talking about environmental impact — I’m talking about context windows and tokens.
LLMs know a lot of things, but they don’t know everything. When they recognize they don’t know something, they may fall back to searching the web to learn more. This is great because it means they can access the latest information available to solve the problem.
But when LLMs retrieve information from the web, they need to store it somewhere they can reference while working. You can’t keep every word of every book you’ve ever read in your head — and neither can LLMs. Instead, they store retrieved information in a small amount of quickly accessible memory called a context window.
A Primer On Context Windows
An LLM’s context window can’t be infinitely large because the data we put in it needs to be stored somewhere, and it has to be accessed quickly. Having to access it quickly rules out traditional long-term storage mechanisms like databases, because they’re too slow. In effect, we have two constraints: we want to keep only the most important information, and we need to make it as compact as possible.
Before we move on, we need to go over a few facts about context windows:
- The text (or images) you put into an LLM is converted into an LLM-friendly format called tokens.
- A token is a chunk of data that makes it easier for an LLM to process information.
- Counting tokens isn’t as straightforward as you’d expect, but here are some rough guidelines: about 3-4 characters of text = 1 token, or roughly 1 word = 1.33 tokens.
- The average Harry Potter book is 135,000 words long, or about 180,000 tokens.
- The latest models at the time of this writing are Claude Opus 4.5 and GPT-5.2 — which have context windows of 200K and 400K tokens respectively.
- This is a pretty limited space, so once you’ve reached the end of your token window, LLMs will do their best to summarize the conversation. Summarization, though, can lose important details.
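Those rough ratios are easy to turn into a quick back-of-the-envelope estimator. This is a sketch using the approximations above, not a real tokenizer (actual counts vary by model):

```ts
// Rough token estimates from the guidelines above:
// ~4 characters per token, or ~1.33 tokens per word.
// These are approximations; real tokenizers vary by model.
function estimateTokensByChars(text: string): number {
  return Math.ceil(text.length / 4);
}

function estimateTokensByWords(wordCount: number): number {
  return Math.round(wordCount * 1.33);
}

// A 135,000-word book lands right at the 180,000-token ballpark cited above.
console.log(estimateTokensByWords(135_000)); // → 179550
```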
This means you can safely fit 1-2 Harry Potter books in a context window — which is far smaller than the entirety of the internet. In practice, that’s approximately 10-15 of my blog posts, or roughly 2-3 hours of video on YouTube.
And that’s why we need to be selective when we ask LLMs to search the web for information. When Claude goes off to search the web, it may retrieve far more data than its context window can hold!
And That’s Why We Need Markdown
The good news is that my blog posts are written in markdown — a lightweight markup language created by John Gruber. If you’re reading this, you’ve likely used markdown to bold or italicize text by adding _ or ** around words or sentences.
Markdown is very lightweight compared to HTML because it’s essentially plain text with minimal markup. (Hence the name markdown — which I only just realized as I was writing this sentence. 😆) Sharing the markdown version of a blog post saves precious tokens — which lets LLMs store and access more information in their context window.
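You can see the savings directly by comparing the same content in both formats. A toy comparison (the strings here are illustrative, not taken from a real page):

```ts
// The same heading and emphasis, expressed as rendered HTML vs. markdown
// source. The markdown version carries identical information in fewer
// characters -- and therefore fewer tokens.
const html =
  '<h2>A Primer On Context Windows</h2>\n' +
  '<p>Markdown is <strong>lightweight</strong> compared to <em>HTML</em>.</p>';
const markdown =
  '## A Primer On Context Windows\n\n' +
  'Markdown is **lightweight** compared to _HTML_.';

console.log(`HTML: ${html.length} chars, markdown: ${markdown.length} chars`);
console.log(`saved ${html.length - markdown.length} characters`);
```

On real pages the gap is much larger, since full HTML also carries `<head>` metadata, navigation, scripts, and styles that markdown omits entirely.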
> In the next version of Claude Code, Claude’s WebFetch tool automatically adds `Accept: "text/markdown, *"` to requests which helps docs sites provide token-efficient docs
>
> — Boris Cherny (@bcherny), November 13, 2025
This approach has become standard practice — Claude Code now automatically searches for markdown versions of websites, and other tools like Codex have followed suit. Yet despite its importance, few websites have implemented this technique. Here’s how to be one of them.
So What Are We Doing?
We’re going to do two things.
- Add support for serving markdown versions of our blog posts.
- Direct LLMs to browse the markdown version instead of our HTML-rendered blog posts.
Or more technically:
- When someone adds `.md` to a blog post URL, we’ll serve the markdown version of that post. (Try it yourself with this post: build.ms/2026/2/2/supporting-markdown-search-for-llms.md)
- We’ll build a small middleware layer that redirects LLMs to the markdown version when they request a page with an `Accept` header of `text/markdown`.
Adding Markdown Support To Your Website
Build.ms uses the Astro static site generator, but this technique works for any static site generator. To give you an idea of what we want to do, here’s the prompt I gave to Codex to build this feature:
Is it possible for us to expose the original markdown of our posts whenever the user adds a .md extension to a URL? This has become a common pattern for letting AI tools crawl websites, providing a more direct format that they can consume easily.
The result was a file called `[slug].md.ts`. This filename convention creates a route that serves the markdown version of your blog post. Now when a person (or LLM) adds `.md` to any blog post URL on build.ms, they see the markdown version of the post.
Here’s the code for `[slug].md.ts`:

```ts
import type { APIRoute } from 'astro';
import { type CollectionEntry, getCollection } from 'astro:content';
import { readFile } from 'node:fs/promises';
import path from 'node:path';

interface MarkdownProps {
  filePath: string;
  slug: string;
  pubDate: string;
}

function hasFilePath(post: CollectionEntry<'blog'>): post is CollectionEntry<'blog'> & { filePath: string } {
  return typeof post.filePath === 'string';
}

function buildParams(post: CollectionEntry<'blog'>) {
  const date = new Date(post.data.pubDate);
  return {
    year: date.getFullYear().toString(),
    month: (date.getMonth() + 1).toString(),
    day: date.getDate().toString(),
    slug: post.id,
  };
}

export async function getStaticPaths() {
  const posts = await getCollection('blog');
  return posts
    .filter(hasFilePath)
    .map((post) => ({
      params: buildParams(post),
      props: {
        filePath: post.filePath,
        slug: post.id,
        pubDate: new Date(post.data.pubDate).toISOString(),
      },
    }));
}

export const GET: APIRoute = async ({ params, props }) => {
  const { filePath, slug, pubDate } = props as MarkdownProps;
  if (!filePath || !slug || !pubDate) {
    return new Response('Not found', { status: 404 });
  }

  const parsedDate = new Date(pubDate);
  const year = Number(params.year);
  const month = Number(params.month);
  const day = Number(params.day);

  if (Number.isNaN(year) || Number.isNaN(month) || Number.isNaN(day)) {
    return new Response('Not found', { status: 404 });
  }

  if (
    parsedDate.getFullYear() !== year ||
    parsedDate.getMonth() + 1 !== month ||
    parsedDate.getDate() !== day ||
    slug !== params.slug
  ) {
    return new Response('Not found', { status: 404 });
  }

  try {
    const absolutePath = path.resolve(process.cwd(), filePath);
    const markdown = await readFile(absolutePath, 'utf-8');
    return new Response(markdown, {
      status: 200,
      headers: {
        'Content-Type': 'text/markdown; charset=utf-8',
        'Cache-Control': 'public, max-age=3600',
      },
    });
  } catch (error) {
    console.error('Failed to load markdown for', filePath, error);
    return new Response('Not found', { status: 404 });
  }
};
```
We can now start dropping links with `.md` extensions in our prompts and watch our token count drop. But it would be better if this process were automatic. To do that, I wrote another prompt that explains our goals, so Codex can solve the problem.
I just want to check something. The reason we built this was because I saw this post from the Claude Code team.
> In the next version of Claude Code, Claude’s WebFetch tool automatically adds Accept: “text/markdown, *” to requests which helps docs sites provide token-efficient docs
The reason we recently added support for .md extensions rendering markdown docs was to support this feature, but are we handling that? If not, can we add that somehow?
Here’s the code for our middleware:

```ts
import type { MiddlewareHandler } from 'astro';

const blogPathPattern = /^\/\d{4}\/\d{1,2}\/\d{1,2}\/[^/]+\/?$/;

// Redirect markdown-preferring clients to the raw .md blog route (e.g. Claude WebFetch).
export const onRequest: MiddlewareHandler = async ({ request }, next) => {
  const acceptHeader = request.headers.get('accept') ?? '';
  if (!acceptHeader.toLowerCase().includes('text/markdown')) {
    return next();
  }

  const url = new URL(request.url);
  if (!blogPathPattern.test(url.pathname) || url.pathname.endsWith('.md')) {
    return next();
  }

  const segments = url.pathname.replace(/\/$/, '').split('/').filter(Boolean);
  const [year, month, day, slug] = segments;
  if (!year || !month || !day || !slug) {
    return next();
  }

  const markdownPath = `/${Number(year)}/${Number(month)}/${Number(day)}/${slug}.md${url.search}`;
  const location = new URL(markdownPath, url);

  return new Response(null, {
    status: 307,
    headers: {
      Location: location.toString(),
      Vary: 'Accept',
    },
  });
};
```
And just like that, Codex added a middleware layer that redirects any request with an `Accept` header of `text/markdown` to the `.md` version of our blog post. Now the two-step process is complete, and our token usage is optimized automatically.
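If you want to sanity-check the routing decision without deploying anything, the core of the middleware can be factored into a pure function. This is a sketch mirroring the logic above, not the generated code itself:

```ts
// The Accept-header decision from the middleware, isolated as a pure
// function: given a request path and Accept header, return the .md
// path to redirect to, or null to let the request pass through.
const blogPathPattern = /^\/\d{4}\/\d{1,2}\/\d{1,2}\/[^/]+\/?$/;

function markdownRedirectTarget(pathname: string, accept: string): string | null {
  if (!accept.toLowerCase().includes('text/markdown')) return null;
  if (!blogPathPattern.test(pathname) || pathname.endsWith('.md')) return null;
  const [year, month, day, slug] = pathname.replace(/\/$/, '').split('/').filter(Boolean);
  if (!year || !month || !day || !slug) return null;
  // Number() strips zero-padding so the target matches the static routes.
  return `/${Number(year)}/${Number(month)}/${Number(day)}/${slug}.md`;
}

console.log(markdownRedirectTarget('/2026/02/02/supporting-markdown-search-for-llms', 'text/markdown, */*'));
// → /2026/2/2/supporting-markdown-search-for-llms.md
console.log(markdownRedirectTarget('/2026/02/02/supporting-markdown-search-for-llms', 'text/html'));
// → null
```

Note that HTML-preferring clients, non-blog paths, and URLs already ending in `.md` all fall through untouched, so regular visitors never see the redirect.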
And That’s It?
As you can see, adding support for the `text/markdown` `Accept` header is straightforward. And it’s beneficial: LLMs get more accurate information, they use fewer tokens, and your website saves on bandwidth costs.
You can implement this feature yourself, but rather than writing the code, let’s use the technique I shared in Building Software From Blog Posts. Point Codex or Claude Code to this blog post and let it handle the implementation.
Here’s a prompt you can use — with all the context you’ll need.
Please read this blog post (https://build.ms/2026/2/2/supporting-markdown-search-for-llms) that describes a process for adding support for the text/markdown Accept header — allowing LLMs to more easily and efficiently parse the contents of a statically generated blog. We want to implement the same technique for our static site generator — regardless of whether it’s Astro or not. If there are any concerns please raise them before we begin building this feature. If there are no concerns, please implement a solution that solves this problem for us in our static site.