Step-By-Step Instruction Following
Sometimes we need an agent to follow a very strict script. For example, when troubleshooting a router, the agent must ask the customer to unplug it, wait 60 seconds, and then plug it back in. If the agent skips the wait, the fix won’t work. LLMs (Large Language Models) can get distracted or try to be “helpful” by skipping steps. To prevent this, we use a State Machine.
What is a State Machine?
Think of a State Machine like a digital bookmark in a book.
- The Book is your script (the instructions).
- The Bookmark is the “State” (where you are right now).
How to Build One
To make an agent follow a script, we need three things:
- The Script (Flow Definitions): A list of steps the agent must take.
- The Tracker (State): A variable that remembers the current step.
- The Tools: Functions the agent calls to “turn the page”.
1. The Script
We define our script as a simple list of steps. Each step has an ID and instructions.
2. The Tracker
We need a way to track which step is active.
3. The Tools
Instead of giving the agent the whole script at once, we give it tools to ask “What do I do next?” and “I’m done”.
Tool A: get_current_step
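The tools in this section operate on the script and the tracker described in the two sections above. A minimal sketch of both, assuming plain Python data structures: the names FLOW and state, the third step ID step_3_plug_in, and its wording are illustrative assumptions rather than part of the original script.

```python
# The Script: an ordered list of steps, each with an ID and an instruction.
# step_1_unplug and step_2_wait come from the example conversation;
# step_3_plug_in is a hypothetical third step added for illustration.
FLOW = [
    {"id": "step_1_unplug", "instruction": "Ask customer to unplug."},
    {"id": "step_2_wait", "instruction": "Wait 60 seconds."},
    {"id": "step_3_plug_in", "instruction": "Ask customer to plug the router back in."},
]

# The Tracker: the "bookmark". It records the position in FLOW and
# whether the active step has been started yet (assumed status values).
state = {"index": 0, "status": "pending"}

# Looking up the active step is a plain list index.
active_step = FLOW[state["index"]]
print(active_step["id"], "-", active_step["instruction"])
```

Keeping the script as data, separate from the tracker, means the same tools could drive a different script by swapping out the list.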
The agent calls this to see what it should do right now.
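One way get_current_step might look, assuming the script is a list of dictionaries and the tracker is a small dictionary (both are repeated here so the snippet runs on its own). The return shape is an assumption; the point is that only the active step is exposed, never the whole script.

```python
# Assumed script and tracker, repeated so this snippet is self-contained.
FLOW = [
    {"id": "step_1_unplug", "instruction": "Ask customer to unplug."},
    {"id": "step_2_wait", "instruction": "Wait 60 seconds."},
]
state = {"index": 0, "status": "pending"}


def get_current_step() -> dict:
    """Return the active step only, so the agent cannot read ahead."""
    if state["index"] >= len(FLOW):
        return {"done": True, "message": "All steps are complete."}
    step = FLOW[state["index"]]
    return {
        "done": False,
        "step_id": step["id"],
        "instruction": step["instruction"],
        "status": state["status"],
    }


print(get_current_step())
```

The function reads the tracker but never changes it, so the agent can call it as often as it likes without moving the bookmark.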
Tool B: start_step
The agent calls this to say “Okay, I am doing this step now.”
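A possible sketch of start_step, under the same assumptions (hypothetical FLOW and state names, repeated so the snippet runs alone). It assumes the agent passes the ID of the step it believes it is on, and that the tracker stores a status field; neither detail is specified in the text above.

```python
# Assumed script and tracker, repeated so this snippet is self-contained.
FLOW = [
    {"id": "step_1_unplug", "instruction": "Ask customer to unplug."},
    {"id": "step_2_wait", "instruction": "Wait 60 seconds."},
]
state = {"index": 0, "status": "pending"}


def start_step(step_id: str) -> dict:
    """Mark the active step as in progress, but only if the ID matches."""
    if state["index"] >= len(FLOW):
        return {"ok": False, "error": "The script is already finished."}
    current_id = FLOW[state["index"]]["id"]
    if step_id != current_id:
        # The agent is trying to start a step it is not on yet.
        return {"ok": False, "error": f"Current step is {current_id}, not {step_id}."}
    state["status"] = "in_progress"
    return {"ok": True, "step_id": step_id, "status": "in_progress"}


print(start_step("step_2_wait"))    # refused: not the active step
print(start_step("step_1_unplug"))  # accepted
```

Starting a step changes its status but leaves the bookmark where it is.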
Tool C: complete_step
The agent calls this when the customer has finished the action. This is the key tool: it moves the bookmark forward to the next step.
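A sketch of how complete_step could advance the bookmark, again with hypothetical FLOW and state names repeated for self-containment. The ID check is an assumption about how skipping is prevented: a completion request for any step other than the active one is refused, and the response wording mirrors the example conversation.

```python
# Assumed script and tracker, repeated so this snippet is self-contained.
FLOW = [
    {"id": "step_1_unplug", "instruction": "Ask customer to unplug."},
    {"id": "step_2_wait", "instruction": "Wait 60 seconds."},
]
state = {"index": 0, "status": "pending"}


def complete_step(step_id: str) -> dict:
    """Finish the active step and move the bookmark to the next one."""
    if state["index"] >= len(FLOW):
        return {"ok": False, "error": "The script is already finished."}
    current_id = FLOW[state["index"]]["id"]
    if step_id != current_id:
        # Refusing out-of-order completion is what stops skipped steps.
        return {"ok": False, "error": f"Current step is {current_id}, not {step_id}."}

    state["index"] += 1          # move the bookmark forward
    state["status"] = "pending"  # the next step has not been started yet

    if state["index"] >= len(FLOW):
        return {"ok": True, "message": "Marked as done. The script is finished."}
    nxt = FLOW[state["index"]]
    return {
        "ok": True,
        "message": f"Marked as done. Moving bookmark to {nxt['id']}. "
                   f"Instruction: {nxt['instruction']}",
    }


print(complete_step("step_1_unplug")["message"])
```

Returning the next instruction in the same response saves the agent a separate get_current_step call, which matches the conversation flow in the next section.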
Putting It All Together
When the agent runs, the conversation looks like this:
- Agent: “System, what is my current step?” (Calls get_current_step)
- System: “Your step is step_1_unplug. Instruction: Ask customer to unplug.”
- Agent: “Hello customer, please unplug your router.”
- Customer: “Okay, it’s unplugged.”
- Agent: “System, I finished step_1_unplug.” (Calls complete_step)
- System: “Marked as done. Moving bookmark to step_2_wait. Instruction: Wait 60 seconds.”
- Agent: “Okay customer, now we must wait 60 seconds…”
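The system side of this exchange can be simulated without an LLM. The sketch below bundles the script, the bookmark, and two of the tools into a hypothetical ScriptRunner class (the class name and message wording are assumptions) and replays the tool calls from the conversation, plus one attempt to skip ahead.

```python
class ScriptRunner:
    """Holds the script and the bookmark, and exposes the tools."""

    def __init__(self, flow):
        self.flow = flow
        self.index = 0  # the bookmark

    def get_current_step(self) -> str:
        if self.index >= len(self.flow):
            return "All steps are complete."
        step = self.flow[self.index]
        return f"Your step is {step['id']}. Instruction: {step['instruction']}"

    def complete_step(self, step_id: str) -> str:
        if self.index >= len(self.flow):
            return "Error: the script is already finished."
        current_id = self.flow[self.index]["id"]
        if step_id != current_id:
            return f"Error: the current step is {current_id}, not {step_id}."
        self.index += 1
        if self.index >= len(self.flow):
            return "Marked as done. The script is finished."
        nxt = self.flow[self.index]
        return (f"Marked as done. Moving bookmark to {nxt['id']}. "
                f"Instruction: {nxt['instruction']}")


FLOW = [
    {"id": "step_1_unplug", "instruction": "Ask customer to unplug."},
    {"id": "step_2_wait", "instruction": "Wait 60 seconds."},
]

runner = ScriptRunner(FLOW)
transcript = [
    runner.get_current_step(),
    runner.complete_step("step_2_wait"),    # agent tries to skip ahead
    runner.complete_step("step_1_unplug"),  # agent finishes the real step
    runner.get_current_step(),
]
for line in transcript:
    print("System:", line)
```

In a real deployment the two methods would be registered as tools with whichever agent framework is in use; that wiring is framework-specific and is left out here.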