Large Language Models (LLMs) are powerful but can be surprisingly finicky. Tuning their prompts or instructions to reach the “production-level accuracy” you need can feel like an endless cycle of trial and error. That’s where metaprompting comes in.
Metaprompting is the practice of using a more intelligent model to iteratively improve the prompts or instructions fed to a less intelligent model. By building a structured feedback loop (Generate → Evaluate → Improve → Repeat), we can systematically refine prompts until performance stabilizes or meets our project criteria.
In this article, we’ll explore the key ideas behind metaprompting, outline the steps of the feedback loop, and highlight best practices to avoid pitfalls like overfitting or token overruns. By the end, you’ll have a clearer understanding of how to structure an iterative process to drastically improve LLM outcomes—whether you’re building Q&A systems, chatbots, or any task that needs repeated refinement.
Metaprompting involves two main models:

- The **metaprompting model** (the “optimizer”): a more intelligent model that analyzes the target model’s outputs and rewrites the prompt to fix its weaknesses.
- The **target model**: the less intelligent model that actually runs the prompt on your task.
The basic workflow:

1. **Generate:** run the current prompt on the target model over a set of test cases.
2. **Evaluate:** score the outputs against your success criteria.
3. **Improve:** give the prompt, outputs, and scores to the metaprompting model and ask it to produce a revised prompt.
4. **Repeat:** rerun the loop with the revised prompt.
This loop repeats until you’re satisfied with performance or run out of time/budget.
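To make the loop concrete, here is a minimal sketch in Python. It assumes the OpenAI Python SDK; the model names, the `score()` metric, the stopping threshold, and the toy eval set are all placeholder assumptions you would replace with your own evaluator and test cases.

```python
# Sketch of the Generate → Evaluate → Improve → Repeat loop.
# Assumes the OpenAI Python SDK. The model names, score() metric,
# stopping threshold, and toy eval set are placeholder assumptions.
from openai import OpenAI

client = OpenAI()
STRONG_MODEL = "gpt-4o"       # assumed metaprompting (optimizer) model
TARGET_MODEL = "gpt-4o-mini"  # assumed target model that runs the prompt

def run(model: str, system: str, user: str) -> str:
    """One chat completion call; returns the model's text response."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    return resp.choices[0].message.content

def score(output: str, expected: str) -> float:
    """Task-specific metric; exact match stands in for a real evaluator."""
    return 1.0 if output.strip() == expected.strip() else 0.0

prompt = "Answer the user's question concisely."  # initial prompt
eval_set = [("What is 2 + 2?", "4")]              # toy eval set

for iteration in range(5):  # hard budget cap on iterations
    # Generate: run the current prompt on the target model.
    results = [(q, exp, run(TARGET_MODEL, prompt, q)) for q, exp in eval_set]
    # Evaluate: average score across the eval set.
    avg = sum(score(out, exp) for _, exp, out in results) / len(results)
    if avg >= 0.95:  # stopping criterion: performance meets our bar
        break
    # Improve: ask the stronger model to rewrite the prompt using failures.
    failures = "\n\n".join(f"Q: {q}\nExpected: {exp}\nGot: {out}"
                           for q, exp, out in results
                           if score(out, exp) < 1.0)
    prompt = run(STRONG_MODEL,
                 "You improve prompts for a weaker model. "
                 "Return only the revised prompt text.",
                 f"Current prompt:\n{prompt}\n\nFailing cases:\n{failures}")
```

Note the hard iteration cap: it enforces the time/budget limit directly, so the loop terminates even if performance never reaches the threshold.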
Most projects begin with an LLM prompt that was adapted from human-readable text (e.g., documentation, guidelines). These human-facing materials are rarely structured in a way an LLM can easily follow, so the first step is to transform them into a more LLM-friendly format, which we’ll call a “routine.”
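As an illustration of that transformation step, the sketch below asks the stronger model to rewrite a human-facing document as a routine. The `CONVERSION_PROMPT` wording, the model name, and the `support_guidelines.txt` file are hypothetical; adapt them to your own materials.

```python
# Sketch: converting a human-facing document into an LLM "routine".
# Assumes the OpenAI Python SDK; the prompt wording, model name, and
# input file are illustrative placeholders, not a canonical template.
from openai import OpenAI

client = OpenAI()
STRONG_MODEL = "gpt-4o"  # assumed metaprompting (optimizer) model

CONVERSION_PROMPT = """\
Convert the following human-facing document into a routine: a numbered
list of explicit, unambiguous steps an LLM can follow.
- Resolve vague language into concrete instructions.
- Make branching logic explicit ("If X, do Y; otherwise do Z.").
- Preserve every requirement from the original document.

Document:
{document}
"""

# Hypothetical source file containing human-readable guidelines.
with open("support_guidelines.txt") as f:
    document = f.read()

resp = client.chat.completions.create(
    model=STRONG_MODEL,
    messages=[
        {"role": "system",
         "content": "You convert human-facing documents into "
                    "step-by-step routines for LLMs."},
        {"role": "user",
         "content": CONVERSION_PROMPT.format(document=document)},
    ],
)
print(resp.choices[0].message.content)
```

The resulting routine then becomes the initial prompt that the feedback loop above refines.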