# PromptAgent

## About

PromptAgent optimizes a prompt by scoring it, generating feedback, and generating new prompts based on that feedback.
It starts from a seed prompt with known errors on the training data.
At each step, a scored prompt and a sample of its errors are passed to a language model to produce a feedback "action".
The prompt, its errors, the search trajectory, and the feedback action are then passed to a language model to produce new prompts.
These new prompts are scored, and either the best prompt from each branch (`search_mode="beam"`) or all prompts (`search_mode="greedy"`) are retained for the next step.
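The selection step can be illustrated with a minimal sketch. Note that `Candidate` and `select_candidates` here are illustrative stand-ins, not the library's API:

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    content: str
    branch: int   # index of the parent prompt this candidate was expanded from
    score: float


def select_candidates(candidates: list[Candidate], search_mode: str = "beam") -> list[Candidate]:
    """Retain candidates for the next step according to the search mode."""
    if search_mode == "greedy":
        return list(candidates)  # keep every scored candidate
    # "beam": keep only the top-scoring candidate within each branch
    best: dict[int, Candidate] = {}
    for c in candidates:
        if c.branch not in best or c.score > best[c.branch].score:
            best[c.branch] = c
    return list(best.values())


candidates = [
    Candidate("prompt A1", branch=0, score=0.6),
    Candidate("prompt A2", branch=0, score=0.8),
    Candidate("prompt B1", branch=1, score=0.5),
]
retained = select_candidates(candidates, search_mode="beam")
# "beam" keeps prompt A2 (best of branch 0) and prompt B1 (only one in branch 1)
```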
## Usage
The PromptAgentOptimizer requires a description of the failures after each step.
You provide this feedback in your evaluator by capturing errors and saving them to the prompt object's `errors` attribute.
> [!IMPORTANT]
> Your evaluator function **MUST** save any errors to the prompt object's `errors` attribute. Otherwise the optimization will fail.
```python
from langchain_openai import ChatOpenAI

from prompt_optimizer import PredictionError, Prompt
from prompt_optimizer.optimizers import PromptAgentOptimizer

# Simple QA validation set
validation_set = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What is the largest planet in our solar system?", "answer": "Jupiter"},
    {"question": "What is the smallest planet in our solar system?", "answer": "Mercury"},
    {"question": "What is the longest river in the world?", "answer": "Nile"},
    {"question": "What is the shortest river in the world?", "answer": "Reprua River"},
]

# A LangChain chat model for generating feedback and new prompts
client = ChatOpenAI(model="gpt-5", temperature=0.7)


# Evaluator function
def evaluator(prompt: Prompt, validation_set: list[dict]) -> float:
    """Score the prompt and record its errors."""
    num_correct = 0
    agent = get_agent()  # get_agent() returns your AI system under evaluation (defined elsewhere)
    for row in validation_set:
        # Run the prompt through the AI system
        question = row["question"]
        messages = [
            {"role": "system", "content": prompt.content},
            {"role": "user", "content": question},
        ]
        response = agent.invoke(messages)
        prediction = response.content.strip()

        # Reward exact matches and collect errors
        actual = row["answer"]
        if actual == prediction:
            num_correct += 1
        else:
            # Save the prediction error - required for PromptAgentOptimizer
            error = PredictionError(
                input=question, prediction=prediction, actual=actual, feedback=None
            )
            prompt.errors.append(error)

    # Compute the score
    return num_correct / len(validation_set)


# Initialize the optimizer
baseline_prompt = "Answer the user's questions to the best of your ability."
optimizer = PromptAgentOptimizer(
    client=client,
    seed_prompts=[baseline_prompt],
    validation_set=validation_set,
    max_depth=3,
    evaluator=evaluator,
)

# Run the optimization
optimized_prompt = optimizer.run()
```
## Citation

```bibtex
@misc{wang2023promptagentstrategicplanninglanguage,
  title={PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization},
  author={Xinyuan Wang and Chenxi Li and Zhen Wang and Fan Bai and Haotian Luo and Jiayou Zhang and Nebojsa Jojic and Eric P. Xing and Zhiting Hu},
  year={2023},
  eprint={2310.16427},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2310.16427},
}
```
## Source
### PromptAgentOptimizer
Bases: BaseOptimizer
PromptAgent Optimizer.
Based on PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization.
Source code in src/prompt_optimizer/optimizers/promptagent.py
```python
__init__(*, client, seed_prompts, validation_set, max_depth, evaluator, output_path=None, batch_size=5, expand_width=3, num_samples=2, search_mode='beam', score_threshold=None, **kwargs)
```
Initialize the PromptAgent Optimizer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `client` | `ClientType` | Language model client to use for prompt generation and feedback. | *required* |
| `seed_prompts` | `list[Prompt]` | List of prompts to seed generation. | *required* |
| `validation_set` | `ValidationSetType` | Set of examples to evaluate the prompt on. | *required* |
| `max_depth` | `int` | Maximum iteration depth for prompt generation. | *required* |
| `evaluator` | `Callable[[Prompt, ValidationSetType], ScoreType]` | Function that takes a prompt and the validation data and returns a score. | *required* |
| `output_path` | `Union[str, Path]` | Path to store run results. Should be a `.jsonl` file path. If `None`, no outputs will be written to disk. Defaults to `None`. | `None` |
| `batch_size` | `int` | Number of errors to sample for each action / new prompt generation. Defaults to 5. | `5` |
| `expand_width` | `int` | Number of feedback actions to generate per prompt. Defaults to 3. | `3` |
| `num_samples` | `int` | Number of new prompts to generate per feedback action. Defaults to 2. | `2` |
| `search_mode` | `Literal['beam', 'greedy']` | Mode for filtering prompt candidates after each step. `"greedy"` keeps all prompts from the previous step; `"beam"` keeps only the highest-scoring prompt from each branch of the previous step. Defaults to `"beam"`. | `'beam'` |
| `score_threshold` | `float` | Threshold for early convergence. If a prompt exceeds this score after any iteration, the optimization loop immediately ends. If set to `None`, the optimization loop will not terminate early. Defaults to `None`. | `None` |
| `kwargs` | | Additional keyword arguments. | `{}` |
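As a rough cost sketch, assuming each retained prompt is expanded as described above and that `"greedy"` retains every candidate, the defaults imply the following branching (illustrative arithmetic only, not library code):

```python
expand_width = 3  # feedback actions generated per prompt (default)
num_samples = 2   # new prompts generated per feedback action (default)
max_depth = 3

# Each retained prompt spawns expand_width * num_samples candidates per step
branching = expand_width * num_samples  # 6 candidates per retained prompt

# Under search_mode="greedy" every candidate is retained, so starting from a
# single seed the candidate pool at the final depth is branching ** max_depth
greedy_pool = branching ** max_depth  # 216 prompts at depth 3
```

This is why `"beam"` is the default: it prunes each branch to its best prompt, keeping the evaluation cost per step roughly constant.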
Source code in src/prompt_optimizer/optimizers/promptagent.py
`check_early_convergence(*, all_prompts)`

Check if the early convergence criterion is met.
Source code in src/prompt_optimizer/optimizers/promptagent.py
`generate_prompt_candidates(*, prompts, **kwargs)`
Generate prompt candidates using gradients.
Source code in src/prompt_optimizer/optimizers/promptagent.py
`get_all_prompts(include_candidates=False)`
Get all the prompts from the latest training run.
The default behavior returns a list of lists, where each internal list contains the retained candidates after one iteration step. Setting include_candidates to True will also include all generated candidate prompts.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `include_candidates` | `bool` | Whether to include all the candidate prompts in the output. If `True`, candidate prompts from each iteration will be included. Defaults to `False`. | `False` |

Returns:

| Type | Description |
|---|---|
| `list[list[Prompt]]` | List of lists where each inner list contains the prompts from one iteration: `list[0]` holds prompts from the first iteration, `list[1]` the second, and so on. If `include_candidates` is `False`, each inner list contains only the retained prompts at that iteration. If `True`, each inner list contains all candidate prompts at that iteration, including those that were discarded. |
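For illustration, the nested structure described above can be walked like this. Note that the `Prompt` class here is a stub with only the two attributes this sketch uses; the real class has more:

```python
from dataclasses import dataclass


@dataclass
class Prompt:
    content: str
    score: float


# Shape of get_all_prompts(): one inner list of retained prompts per iteration
all_prompts = [
    [Prompt("seed", 0.40)],                                  # iteration 0
    [Prompt("variant 1", 0.60), Prompt("variant 2", 0.75)],  # iteration 1
]

# Pick the top-scoring prompt from each iteration
best_per_iteration = [max(step, key=lambda p: p.score) for step in all_prompts]
```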
Source code in src/prompt_optimizer/optimizers/base.py
`run()`
Run the optimization pipeline.
Source code in src/prompt_optimizer/optimizers/base.py
`save_prompts(output_path)`
Save prompts in jsonl format.
Source code in src/prompt_optimizer/optimizers/base.py
`select_best_prompt(*, all_prompts)`
Select the top scoring prompt.
Source code in src/prompt_optimizer/optimizers/promptagent.py
`select_prompt_candidates(*, prompts, validation_set)`
Select prompt candidates according to the search mode.