Benchmark Inference Pipeline #1223

@CarsonDavis

Description

We want to benchmark the inference pipeline to see how many documents it can successfully process on staging.

To do this, we would first create a benchmarking script that can run in any environment (local, staging, etc.), feeds full texts into the classifier API, and tracks the results. After confirming that it works locally, we can then benchmark the staging server.

Implementation Considerations

Make a script that loops through the available full texts within COSMOS, up to a maximum of 5000 full texts, and sends them one by one to the classifier API.

The script should record each job_id and check back in with the classifier to see how many classifications succeeded and how long they took.

You will need to reference the API documentation located here: https://github.com/NASA-IMPACT/llm-app-classifier-pipeline.
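
As a starting point, the submit-then-poll loop described above might look like the sketch below. The classifier API's real endpoints and response fields are not given in this issue, so the network calls are passed in as two hypothetical callables, `submit` (sends one full text, returns a job_id) and `get_status` (returns a status string for a job_id); both would need to be written against the API documentation. The status names come from the deliverable list, while the polling interval and timeout values are assumptions.

```python
import time
from itertools import islice
from typing import Callable, Dict, Iterable

MAX_DOCUMENTS = 5000  # cap stated in this issue


def run_benchmark(
    full_texts: Iterable[str],
    submit: Callable[[str], str],      # hypothetical: sends one text, returns its job_id
    get_status: Callable[[str], str],  # hypothetical: returns the status for a job_id
    max_documents: int = MAX_DOCUMENTS,
    poll_interval: float = 5.0,        # assumed value, tune per environment
    timeout: float = 3600.0,           # assumed value, tune per environment
) -> Dict[str, dict]:
    """Send texts one by one, then poll until every job finishes or we time out."""
    jobs: Dict[str, dict] = {}

    # Submit phase: one request per full text, up to the cap.
    for text in islice(full_texts, max_documents):
        job_id = submit(text)
        jobs[job_id] = {
            "length": len(text),
            "submitted_at": time.monotonic(),
            "status": "unknown",  # stays "unknown" if the job never resolves
            "elapsed": None,
        }

    # Poll phase: check back in on every job that has not resolved yet.
    pending = set(jobs)
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        for job_id in list(pending):
            status = get_status(job_id)
            if status in ("success", "failed"):
                job = jobs[job_id]
                job["status"] = status
                # Approximate duration: measured when we observe the result,
                # so it is accurate only to within one poll interval.
                job["elapsed"] = time.monotonic() - job["submitted_at"]
                pending.discard(job_id)
        if pending:
            time.sleep(poll_interval)

    return jobs
```

Keeping the HTTP calls behind `submit` and `get_status` lets the same loop run against local or staging by swapping the callables, and makes it possible to test the loop with fakes before pointing it at a real server.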

Deliverable

  • stats on classification completion rates
    • number of documents sent
    • number of documents in each status (failed, unknown, success)
  • stats on classification throughput
    • length (len()) of the documents sent
    • time taken to classify all the documents
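
One possible way to compute these stats is sketched below. It assumes each job was recorded as a dict with a `status` key (one of the three statuses listed above) and a `length` key holding `len()` of the text sent; that record shape, the `summarize` name, and the output field names are all hypothetical. Treating "length" as a character count is also an assumption.

```python
from collections import Counter
from typing import Dict

STATUSES = ("success", "failed", "unknown")  # statuses named in this issue


def summarize(jobs: Dict[str, dict], total_seconds: float) -> dict:
    """Build completion-rate and throughput stats from per-job records."""
    counts = Counter(job["status"] for job in jobs.values())
    sent = len(jobs)
    succeeded = counts.get("success", 0)
    return {
        "documents_sent": sent,
        # Always report all three statuses, even when a count is zero.
        "status_counts": {status: counts.get(status, 0) for status in STATUSES},
        "completion_rate": succeeded / sent if sent else 0.0,
        # Assumption: document length is measured in characters via len().
        "total_characters_sent": sum(job["length"] for job in jobs.values()),
        "total_seconds": total_seconds,
        "successes_per_second": succeeded / total_seconds if total_seconds > 0 else 0.0,
    }
```

Here `total_seconds` is the wall-clock time for the whole run, measured by the caller (for example with `time.monotonic()` before the first submission and after the last job resolves).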
