Skip to main content
One way of doing this is to calculate convergence:
  1. Run your agent on a set of similar queries
  2. Record the number of steps taken for each
  3. Calculate the convergence score: avg(minimum steps taken / steps taken for this run)
This will give a convergence score of 0-1, with 1 being a perfect score.
# Assume you have an output which has a list of messages, which is the path taken
all_outputs = [
]

optimal_path_length = min(all_outputs, key = lambda output: len(output))
ratios_sum = 0

for output in all_outputs:
    run_length = len(output)
    ratio = optimal_path_length / run_length
    ratios_sum += ratio

# Calculate the average ratio
if len(all_outputs) > 0:
    convergence = ratios_sum / len(all_outputs)
else:
    convergence = 0

print(f"The optimal path length is {optimal_path_length}")
print(f"The convergence is {convergence}")
https://mintcdn.com/arizeai-433a7140/0x3JhHH4Of-bLwrx/images/image-10.png?fit=max&auto=format&n=0x3JhHH4Of-bLwrx&q=85&s=1961b8a14332f1359516bc2a55ec250b

Google Colab

colab.research.google.com