Dataframe
The table below shows only the relevant subset of the dataframe. The retrieved_document_ids should match the ids in the corpus data. Within each row, the list in the relevance_scores column must have the same length as the list in the retrieved_document_ids column. The lists do not need to have the same length across rows.
| query | embedding | retrieved_document_ids | relevance_scores |
|---|---|---|---|
| who was the first person that walked on the moon | [-0.0126, 0.0039, 0.0217, … | [7395, 567965, 323794, … | [11.30, 7.67, 5.85, … |
| who was the 15th prime minister of australia | [0.0351, 0.0632, -0.0609, … | [38906, 38909, 38912, … | [11.28, 9.10, 8.39, … |
| why is amino group in aniline an ortho para di… | [-0.0431, -0.0407, -0.0597, … | [779579, 563725, 309367, … | [-10.89, -10.90, -10.94, … |
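The per-row length requirement can be checked before the data is used. The following is a minimal sketch, not part of any library: the helper name find_length_mismatches is hypothetical, the sample rows are made-up values, and the rows are assumed to be plain dictionaries keyed by the column names shown above (for a pandas dataframe, `df.to_dict("records")` produces rows in this shape).

```python
from typing import Any, Iterable, List, Mapping


def find_length_mismatches(
    rows: Iterable[Mapping[str, Any]],
    ids_column: str = "retrieved_document_ids",
    scores_column: str = "relevance_scores",
) -> List[int]:
    """Return the positions of rows whose ID and score lists differ in length."""
    mismatched = []
    for position, row in enumerate(rows):
        if len(row[ids_column]) != len(row[scores_column]):
            mismatched.append(position)
    return mismatched


# Illustrative rows only (hypothetical values, not taken from the table).
rows = [
    {"query": "example query a", "retrieved_document_ids": [1, 2, 3], "relevance_scores": [0.9, 0.5, 0.1]},
    {"query": "example query b", "retrieved_document_ids": [4, 5], "relevance_scores": [0.8, 0.2]},
    {"query": "example query c", "retrieved_document_ids": [6, 7], "relevance_scores": [0.7]},
]

# The first two rows are valid even though their lists differ in length
# from each other; only the third row has mismatched ID and score lists.
print(find_length_mismatches(rows))  # [2]
```

An empty result means every row pairs each retrieved document ID with exactly one score.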
Schema
The retrieved document ids and their relevance scores are both grouped under prompt_column_names, along with the embedding of the query.

