feat(blueprints): create blueprints/users/Varunrnair/sakhi-expert-maternal-health-benchmark.yml #28
…ernal-health-benchmark.yml on new branch
❌ Blueprint validation failed
Hi @Varunrnair, thanks for this contribution. Also, great meeting you today! The automated evaluation failed with […]
Suggested fix: split the blueprint by language, so each file stays well under 1 MB.
Each file keeps its own config header (you can adjust […]).
We're also planning a fix on the app side so large blueprints are handled more gracefully in the future, but the split above will unblock this PR right away. Thanks again! 🙏
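For reference, the split by language suggested above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it presumes the prompts are already loaded into a list of dicts, that each prompt carries a `language` field, and that the limit is exactly 1,000,000 bytes. The field name, the helper names `split_by_language` and `oversized`, and the threshold are hypothetical and are not taken from the blueprint schema or the app.

```python
from collections import defaultdict

# Assumed limit, based only on the "well under 1 MB" advice in this thread.
MAX_BYTES = 1_000_000


def split_by_language(prompts, key="language"):
    """Group prompt dicts by a language field (hypothetical field name)."""
    groups = defaultdict(list)
    for prompt in prompts:
        # Prompts without the field are collected under "unknown".
        groups[prompt.get(key, "unknown")].append(prompt)
    return dict(groups)


def oversized(files, limit=MAX_BYTES):
    """Return the names of serialized files whose UTF-8 size reaches the limit."""
    return [
        name
        for name, text in files.items()
        if len(text.encode("utf-8")) >= limit
    ]


# Example with made-up prompt records.
prompts = [
    {"id": "q1", "language": "en"},
    {"id": "q2", "language": "hi"},
    {"id": "q3", "language": "mr"},
    {"id": "q4", "language": "en"},
]
groups = split_by_language(prompts)
```

Each group would then be written to its own file together with a copy of the config header, and `oversized` can be run on the serialized text of each file to confirm that none of them is still too large.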
Blueprint Contribution
Blueprint Details
What This Blueprint Tests
Checklist
- blueprints/users/<my-github-username>/ directory
- id (e.g., france-capital-test, not p1 or auto-generated)
- should assertions with specific criteria
- $not_* functions instead of should_not blocks where applicable
- pnpm cli run <path-to-blueprint>
Notes
This blueprint expands CivicEval's public-health coverage by evaluating maternal health questions curated and validated by clinical experts. The benchmark focuses on evidence-based maternal health guidance across English, Hindi, and Marathi, enabling assessment of both clinical accuracy and multilingual consistency on high-impact public-health topics.
Automated Evaluation: This PR will trigger an automated evaluation with cost-controlled limits (max 10 prompts, CORE models only). Full evaluation runs automatically after merge.