Overview
Automatic prompt optimization (APO) is a family of techniques for automatically refining the prompts given to AI language models in order to improve task performance. Dozens of APO algorithms exist, but most follow the same five general steps (Ramnath et al., 2025):
- Seed Prompt Initialization - Starting from manually created prompts or instructions induced by an LLM
- Candidate Prompt Generation - Generating new instruction prompts based on the previous generation of prompts
- Inference Evaluation & Feedback - Evaluating the performance of the new prompts on a validation set and providing feedback to the APO algorithm
- Filter & Retain Promising Prompts - Selecting prompts to seed the next generation
- Repeat steps 2-4 until the exit criterion is met
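The generic loop above can be sketched as follows. This is a minimal illustration, not any particular algorithm: the `score` and `mutate` functions here are toy stand-ins for what a real APO system would do with an LLM (evaluate on a validation set, and prompt an LLM to rewrite candidates).

```python
import random

def apo_loop(seed_prompts, score, mutate, val_set, iterations=3, keep=2):
    """Generic APO skeleton: seed, generate, evaluate, filter, repeat."""
    population = list(seed_prompts)                      # step 1: seed prompts
    for _ in range(iterations):                          # step 5: repeat until exit
        candidates = [mutate(p) for p in population]     # step 2: candidate generation
        scored = [(score(p, val_set), p)                 # step 3: inference evaluation
                  for p in population + candidates]
        scored.sort(reverse=True)                        # step 4: filter & retain
        population = [p for _, p in scored[:keep]]
    return population[0]

# Toy stand-ins: longer prompts "score" higher, and mutation appends a suffix.
random.seed(0)
score = lambda p, val: len(p)
mutate = lambda p: p + random.choice([" please", " step by step", ""])
best = apo_loop(["Answer the question."], score, mutate, val_set=None)
```

In a real system, `score` would run the model on validation examples and `mutate` would be an LLM call; the exit criterion here is simply a fixed iteration count.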
Each algorithm implements these five steps differently. For example, while most algorithms start from a user-provided seed prompt, APE generates seed prompts by inferring instructions from task input-output pairs. This makes it useful when the task is unknown or hard to describe, but it struggles when the desired output is subject to non-obvious conditions or constraints.
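APE's seed generation works by wrapping input-output demonstrations in a meta-prompt that asks an LLM to guess the instruction. A rough sketch of constructing such a meta-prompt is below; the exact wording is illustrative (paraphrasing the style of template APE uses), and the LLM call itself is omitted.

```python
def induction_prompt(pairs):
    """Build an APE-style instruction-induction meta-prompt from
    input-output demonstrations (template wording is illustrative)."""
    demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in pairs)
    return (
        "I gave a friend an instruction. Based on the instruction they "
        "produced the following input-output pairs:\n\n"
        f"{demos}\n\n"
        "The instruction was:"
    )

# Demonstrations for a string-reversal task; an LLM completing this
# meta-prompt would ideally produce something like "Reverse the input".
pairs = [("cat", "tac"), ("bird", "drib")]
meta = induction_prompt(pairs)
```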
As another example, while algorithms like APE and OPRO use random input-output pairs from the validation set when generating new prompts, PromptAgent and ProTeGi sample from the input-output pairs the current prompt failed on. Prompts produced by these error-driven algorithms should therefore improve with each iteration, since each generation learns from the previous one's mistakes.
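The error-driven sampling step can be sketched as a simple filter over the validation set. Here `run_model` is a placeholder for a real LLM call, and the toy model and data are purely illustrative:

```python
def failed_examples(prompt, val_set, run_model):
    """Collect the validation pairs the current prompt gets wrong, in the
    spirit of ProTeGi/PromptAgent-style feedback (run_model is a placeholder
    for calling an LLM with the prompt and the example input)."""
    return [(x, y) for x, y in val_set if run_model(prompt, x) != y]

# Toy model that just echoes its input, so any pair whose expected output
# differs from the input counts as a failure.
run_model = lambda prompt, x: x
val_set = [("2+2", "4"), ("ok", "ok")]
errors = failed_examples("solve:", val_set, run_model)
```

The failed pairs would then be fed into candidate generation, e.g. by asking an LLM to critique the prompt in light of the errors and propose a revision.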
The next section covers these differences in more detail for the algorithms implemented in this package.

Comparison of Algorithms
| Step | APE | OPRO | ProTeGi | PromptAgent |
|---|---|---|---|---|
| Seed Prompt Initialization | | | | |
| Candidate Prompt Generation | | | | |
| Inference Evaluation & Feedback | | | | |
| Filter & Retain Promising Prompts | | | | |
| Exit Criteria | | | | |