Time for the coup de grâce!
Let's test our agent's ability to actually fix a bug all on its own.
calculator/pkg/calculator.py
precedence
+
uv run calculator/main.py "3 + 7 * 2"
17
20
3 + 7 * 2
After your agent fixes the calculator, run and submit the CLI tests.