Context Engineering for Agents: A Goal, a Map, and a Way to Know It Arrived

· 11 min read
Manny Silva
Creator of Docs as Tests and Doc Detective | Head of Docs at Skyflow

Over the last few posts, I've worked through the five doc types that configure AI agents, why they count as internal documentation, and how to write skills and score them. Each post answered a "what is this file" question. The question I keep getting in response is the next one: what actually goes inside these files, and why?

That's a context-engineering question. Andrej Karpathy coined the term on X in 2025: "Context engineering is the delicate art and science of filling the context window with just the right information for the next step... Too little or of the wrong form and the LLM doesn't have the right context for optimal performance. Too much or too irrelevant, and the LLM costs might go up, and performance might come down."

The discipline that makes that balance possible is progressive disclosure: surface only what the agent needs for the current step, and let the rest stay one link or one tool call away. Technical writers already know this move — it's how a good README opens with a summary and pushes the detail out to dedicated pages, or how a tutorial introduces the happy path before the edge cases. Context engineering applies the same idea to the agent's working memory.

This post is the practical version of that idea: the three things every agent needs to be successful, and where each of those three things lives in a repo.

How Do You Know If a Skill Is Any Good? LLM-as-Judge Scoring

· 13 min read
Manny Silva
Creator of Docs as Tests and Doc Detective | Head of Docs at Skyflow

Last time, I walked through writing skills that agents can actually execute and introduced skill-validator as a way to catch structural and content issues before an agent ever sees the skill. At the end, I mentioned that skill-validator also supports LLM-as-judge scoring across dimensions like clarity, actionability, token efficiency, and novelty—and promised to dig into that.

This is that post.

Writing Skills That Agents Can Actually Execute

· 10 min read
Manny Silva
Creator of Docs as Tests and Doc Detective | Head of Docs at Skyflow

First, I argued that agent configurations are documentation. Next, I made the case that they're specifically internal documentation and should be managed that way. Both times I covered five doc types: project descriptions, agent definitions, orchestration patterns, skills, and plans/specs.

Of those five, skills are the hardest to write well. Let's walk through how I handled writing and validating skills for Doc Detective's agent tools.

Your Agent Configs Are Internal Docs. Manage Them That Way.

· 10 min read
Manny Silva
Creator of Docs as Tests and Doc Detective | Head of Docs at Skyflow

A few months into working with AI agents on a documentation project, I'd noticed some inconsistency in agent behaviors and decided to do some digging. Turns out the AGENTS.md file in our repo — the one telling agents how to behave, where things lived, and what to escalate — had grown to over 800 lines, and a few people (or, more likely, their agents) had added rules independently, some subtly contradicting each other.

The agents weren't broken. They were following instructions that didn't serve them well.

Welcome to Instruction Manuel

· 3 min read
Manny Silva
Creator of Docs as Tests and Doc Detective | Head of Docs at Skyflow

Welcome to Instruction Manuel! I'm Manny Silva, a technical writer and engineer who spends too much time thinking about documentation, developer experience, and whatever else catches my attention in the tech world.