Tarka (Sanskrit: logic, reasoning, structured argument)
An on-call-oriented agent that turns Prometheus/Alertmanager alerts into actionable triage reports.
Designed for small teams to reduce "tribal knowledge" during incidents by producing consistent, honest, copy/paste-friendly investigation narratives.
- Quickstart Guide - First investigation in 5 minutes
- One Pager - Leadership overview and motivation
- Full Documentation - Complete documentation hub
Converts Prometheus/Alertmanager alerts into triage reports with:
- Deterministic base triage: Explicit about what's known and unknown (no guessing)
- Alert-specific playbooks: CPU throttling, OOM, HTTP 5xx, pod health, etc.
- Multi-source evidence: Prometheus metrics + Kubernetes context + logs (all best-effort)
- Read-only operations: Safe investigation, no cluster mutations
- Flexible deployment: Run as CLI or in-cluster webhook service
See Triage Methodology for the philosophy.
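For orientation, the input is a standard Alertmanager webhook payload. The sketch below shows a minimal payload and the label fields a triage pass typically starts from; the alert values are illustrative, and `triage_keys` is a hypothetical helper, not code from this project:

```python
# Minimal Alertmanager webhook payload (illustrative values).
payload = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "KubePodCrashLooping",
                "namespace": "payments",
                "pod": "checkout-7d9f",
                "severity": "warning",
            },
            "annotations": {
                "summary": "Pod payments/checkout-7d9f is crash looping"
            },
            "startsAt": "2024-01-01T00:00:00Z",
        }
    ],
}

def triage_keys(alert: dict) -> dict:
    """Extract the labels that typically decide which playbook applies."""
    labels = alert.get("labels", {})
    return {
        "alertname": labels.get("alertname"),
        "namespace": labels.get("namespace"),
        "severity": labels.get("severity", "unknown"),
    }

for alert in payload["alerts"]:
    print(triage_keys(alert))
```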
- Case Inbox -- All active alerts in one place, scored and classified automatically.
- Leadership Dashboard -- ROI metrics, signal quality, incident trends, and engineer hours saved.
- Triage Report -- Structured evidence, verdict, and copy-paste-ready next steps for every alert.
- Case Chat -- Ask follow-up questions about a specific case. The agent has full context of the investigation.
- Global Chat -- Query across all cases with tool-using AI (PromQL, kubectl, log search, and more).
Full example inputs and reports are available in the examples/ directory:
- examples/reports/pod-crashloop/report.md -- rendered triage report
- examples/reports/pod-crashloop/investigation.json -- structured JSON analysis
```bash
# Install dependencies (base + LLM provider)
poetry install                  # Base installation (no LLM)
poetry install -E vertex        # Base + Vertex AI (Gemini)
poetry install -E anthropic     # Base + Anthropic (Claude)
poetry install -E all-providers # Base + all LLM providers
```
```bash
# List active alerts
poetry run python main.py --list-alerts

# Investigate a specific alert
poetry run python main.py --alert 0

# Investigate with LLM enrichment (optional)
poetry run python main.py --alert 0 --llm
```

LLM Provider Configuration:
- Vertex AI: set LLM_PROVIDER=vertexai, GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION
- Anthropic: set LLM_PROVIDER=anthropic, ANTHROPIC_API_KEY
- See the Multi-Provider LLM Guide for details
For detailed setup, see Quickstart Guide.
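The provider selection above can be sketched as a small check against the documented environment variables. This is an illustrative sketch, not the project's actual validation code; only the variable names come from the docs:

```python
# Required settings per provider, taken from the documented env vars.
REQUIRED = {
    "vertexai": ["GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION"],
    "anthropic": ["ANTHROPIC_API_KEY"],
}

def check_llm_config(env: dict) -> list:
    """Return the variables still missing for the configured provider."""
    provider = env.get("LLM_PROVIDER")
    if provider is None:
        return []  # LLM enrichment is optional; no provider means base triage only
    return [var for var in REQUIRED.get(provider, []) if var not in env]

# Example: Anthropic selected but the key is absent
print(check_llm_config({"LLM_PROVIDER": "anthropic"}))  # ['ANTHROPIC_API_KEY']
```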
Start a complete local development environment with one command:
```bash
# Copy environment template
cp .env.example .env

# Start PostgreSQL, NATS, and mock monitoring services
make dev-up

# Start webhook server (Terminal 2)
make dev-serve

# Start UI dev server (Terminal 3)
make dev-ui
```

Access the UI: http://localhost:5173
- Username: admin
- Password: admin123 (or from .env)
Mock Services: The local environment includes mock Prometheus/Alertmanager/Logs that return empty data, allowing you to test the full pipeline without real infrastructure. To use real services, port-forward and update .env.
Stop services:

```bash
make dev-down
```

See the Local Development Guide for detailed instructions, troubleshooting, and advanced usage.
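With the webhook server running, you can hand-feed it a test alert. The sketch below only builds the request, so it runs offline; the port and the /webhook path are assumptions, so check your server's actual route before sending:

```python
import json
import urllib.request

# A minimal firing alert in Alertmanager webhook form (illustrative values).
payload = {
    "version": "4",
    "status": "firing",
    "alerts": [{"labels": {"alertname": "TestAlert", "severity": "info"}}],
}

# NOTE: http://localhost:8000/webhook is an assumed address, not the documented one.
req = urllib.request.Request(
    "http://localhost:8000/webhook",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Actually sending is left commented out so the sketch is safe to run offline:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
print(req.get_method(), req.full_url)
```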
- Getting Started: Quickstart • Local Development • Authentication • Environment Variables
- Operating: Deployment • Operations • Testing
- Architecture: Overview • Investigation Pipeline • Diagnostics • Playbooks
- Integrations: Slack App Setup • GitHub App Setup
- Extending: Adding Playbooks • Triage Methodology
- Features: Chat • Actions • Memory • Multi-Provider LLM
- Roadmap: Completed • Planned
- Prometheus-compatible API (required) - For metrics and scope calculation
- Kubernetes API (optional but recommended) - For pod context and events
- VictoriaLogs (optional) - For log evidence (agent remains useful without logs)
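As a sanity check that a Prometheus-compatible API is wired up, the standard instant-query endpoint can be exercised. This sketch only builds a query URL and parses a sample response shape, so it runs without a live server; the localhost address is an assumption:

```python
import json
from urllib.parse import urlencode

# Build an instant-query URL against the standard Prometheus HTTP API.
base = "http://localhost:9090"  # assumed address; any Prometheus-compatible API works
query = 'sum(rate(container_cpu_usage_seconds_total{namespace="default"}[5m]))'
url = f"{base}/api/v1/query?{urlencode({'query': query})}"

# Shape of a successful /api/v1/query response (vector result).
sample = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"namespace": "default"}, "value": [1700000000, "0.42"]}
    ]
  }
}
""")

def first_value(resp: dict):
    """Pull the first sample value out of a vector response, or None if empty."""
    result = resp.get("data", {}).get("result", [])
    return float(result[0]["value"][1]) if result else None

print(first_value(sample))  # 0.42
```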
We welcome contributions! See CONTRIBUTING.md for development setup, coding standards, and PR guidelines.
Licensed under the Apache License 2.0.
© 2026 Dinesh Auti and the Tarka Contributors.