LLM APIs aren't typically used in a "one-shot" manner (one prompt in, one response out, done). Instead, they work the same way ChatGPT works: as a conversation. The conversation has a history, and if we keep track of that history, then with each new prompt the model can see the entire conversation and respond within that larger context.
Importantly, each message in the conversation has a "role." In the context of a chat app like ChatGPT, a conversation might look something like this:
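- user: "Why is the sky blue?"
- model: "Because shorter (blue) wavelengths of sunlight scatter more in the atmosphere."
- user: "Why do shorter wavelengths scatter more?"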
So, while our program will still be "one-shot" for now, let's update our code to store a list of messages in the conversation, and pass in the "role" appropriately.
```python
from google.genai import types

# Store the conversation as a list of messages, each tagged with a role
messages = [
    types.Content(role="user", parts=[types.Part(text=args.user_prompt)]),
]

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=messages,
)
```
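Note that `contents` now receives a list of `types.Content` objects instead of a bare string. The SDK accepts either; when you pass a plain string, it wraps it in a single user-role message for you. Building the list ourselves gives us a place to keep the history.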
In the future, we'll add more messages to the list as the agent does its tasks in a loop.
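As a rough sketch of where this is headed (you don't need this yet, and the follow-up prompt text is just a placeholder), a loop iteration might append the model's reply and the next user message before calling the API again:

```python
# Append the model's reply so the next call sees the full conversation
messages.append(
    types.Content(role="model", parts=[types.Part(text=response.text)])
)

# Append the next user message (placeholder text for illustration)
messages.append(
    types.Content(role="user", parts=[types.Part(text="next step prompt")])
)

# Call the model again with the full history
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=messages,
)
```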
If everything is still working normally, run and submit the CLI tests.