Note: This project builds on an open-source project I came across on GitHub a while ago (linked in the Reference section at the bottom). I am implementing and integrating the features below on top of it.
My main motivation behind the project is to improve my skills in building a full-stack application that is ready to be used by real people and, more importantly, to enhance my understanding of how various components in a full-stack application work together, rather than just building a simple chatbot.
- Multi-agent system (Done)
- Stateful graph based orchestration with conversational memory (Done)
- Conditional routing (Done)
- Multi-step reasoning (In Progress)
- Prompt caching (Not Started)
- Manual high-quality and diverse data collection to evaluate the system (In Progress)
- Online and offline evaluation system with LangSmith (In Progress)
- Tracking the evaluation metrics in a dashboard (In Progress)
- Building a separate Docker container for tracking online evaluation metrics (Not Started)
- Sign-up and log-in mechanisms integration to the sidebar (Done)
- Chatbot integration to the sidebar (Done)
- Photorealistic 3D map (Done)
- API design and development (Done)
- API rate limiting (Done)
- Password hashing with Argon2id (Done)
- JWT authentication (Done)
- User session (Done)
- Caching extracted data and data schema with Redis (Done)
- Data validation with Pydantic (Done)
- AWS-hosted PostgreSQL integration to store user information (In Progress)
- AWS S3 bucket integration to store the uploaded files (In Progress)
- Encrypted communication (In Progress)
- Cookie security: HttpOnly + SameSite (Done)
- Multi-service Docker orchestration (Done)
- Reverse proxy integration (In Progress)
- Deployment in AWS EC2 (Done)
SPA Load, Cookie-Based Login, Encrypted Communication, WebSocket, and TLS Termination at Reverse Proxy
- To evaluate the system from diverse perspectives, I prepared the evaluation dataset as several groups of queries.
- Queries that require information available in the uploaded file
  - Questions that specifically ask which message types and columns should be extracted (category: data_extraction)
  - Queries that require extracting and returning specific information from the uploaded file (category: extractive)
  - Queries that require multi-step reasoning (category: multi_step_reasoning_single_file)
  - Queries that require multi-step reasoning across multiple files (category: multi_step_reasoning_multiple_files)
  - Queries that require relevant information from external web pages (listed below) to be used when generating the answer (category: external_knowledge_usage)
  - Prompts that request multiple tasks to be completed (category: multi_task)
- Queries that require information not available in the uploaded file
  - Queries that measure the system's awareness of external knowledge related to the uploaded file (category: external_knowledge_awareness)
  - Daily-life queries that are not related to this topic at all (category: out_of_scope)
  - Queries that are relevant and technical but cannot be answered using the information available in the uploaded file (category: not_found)
- The list of web pages that have the technical information that might be beneficial for the agents:
  - ArduCopter onboard log messages: https://ardupilot.org/copter/docs/logmessages.html
  - Standard MAVLink common messages: https://mavlink.io/en/messages/common.html
  - ArduPilot MAVLink dialect messages: https://mavlink.io/en/messages/ardupilotmega.html
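The query groups above can be captured in a small schema so that every evaluation example carries its category and results can later be sliced per group. The sketch below is only an illustration: the category labels come from the list above, but the class and helper names (`Category`, `EvalExample`, `answer_in_file`, `count_by_category`) are hypothetical and not taken from the project.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Category(str, Enum):
    # Groups where the needed information is available in the uploaded file
    DATA_EXTRACTION = "data_extraction"
    EXTRACTIVE = "extractive"
    MULTI_STEP_SINGLE_FILE = "multi_step_reasoning_single_file"
    MULTI_STEP_MULTIPLE_FILES = "multi_step_reasoning_multiple_files"
    EXTERNAL_KNOWLEDGE_USAGE = "external_knowledge_usage"
    MULTI_TASK = "multi_task"
    # Groups where the needed information is not available in the uploaded file
    EXTERNAL_KNOWLEDGE_AWARENESS = "external_knowledge_awareness"
    OUT_OF_SCOPE = "out_of_scope"
    NOT_FOUND = "not_found"


# Categories whose answers cannot come from the uploaded file.
NOT_IN_FILE = {
    Category.EXTERNAL_KNOWLEDGE_AWARENESS,
    Category.OUT_OF_SCOPE,
    Category.NOT_FOUND,
}


@dataclass(frozen=True)
class EvalExample:
    query: str
    category: Category
    ground_truth: str

    @property
    def answer_in_file(self) -> bool:
        """True when the example belongs to a group answerable from the file."""
        return self.category not in NOT_IN_FILE


def count_by_category(examples):
    """Count examples per category label to spot under-represented groups."""
    return Counter(ex.category.value for ex in examples)
```

Counting examples per category before running an evaluation is a cheap way to check that no group is missing from the dataset.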
- Context score (whether all the required data and information is available in the context)
- Correctness score with LLM as a judge (whether the answer semantically matches the ground truth)
- Exact match score (for questions that require extracting specific data from the uploaded file)
- Node selection (whether the right nodes are chosen for execution)
- Tool selection (whether the right tools are chosen for execution)
- C-DNF ("Correct data not found") score: sometimes the data needed to answer a question does not exist in the uploaded file. The system should detect this correctly and answer that the required data was not found in the uploaded file instead of making assumptions.
- Average task completion rate (out of all the user requests in a prompt, how many are completed successfully?)
- Conciseness
- P50/P90/P99 latency
- Total token usage
- Total cost
- Node failure rate
- Tool failure rate
- Cache hit rate
- Ratio of failed answers
- User-reported feedback
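Several of the metrics above reduce to simple arithmetic over per-run records. The sketch below shows one possible way to compute exact match, the C-DNF score, the average task completion rate, and latency percentiles with the standard library. The function names, the input shapes, and the nearest-rank percentile definition are assumptions made for illustration; this is not the project's evaluator code.

```python
import math


def exact_match(prediction: str, ground_truth: str) -> bool:
    """True when the answers are identical after trimming and lowercasing."""
    return prediction.strip().lower() == ground_truth.strip().lower()


def c_dnf_score(results):
    """Share of 'data is missing' cases that the system reported as not found.

    `results` is a list of (data_exists, answered_not_found) boolean pairs.
    Only cases where the data really is missing count toward the score.
    """
    missing = [answered for exists, answered in results if not exists]
    return sum(missing) / len(missing) if missing else None


def task_completion_rate(prompts):
    """Average, over prompts, of completed tasks divided by requested tasks.

    `prompts` is a list of (completed, requested) integer pairs.
    """
    rates = [done / asked for done, asked in prompts if asked > 0]
    return sum(rates) / len(rates) if rates else None


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]
```

For latency, calling `percentile(latencies, 50)`, `percentile(latencies, 90)`, and `percentile(latencies, 99)` on the same list of per-request durations gives the P50/P90/P99 values.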
There were several options for tracking these metrics, such as:
- LangSmith
- OpenAI evaluation platform
- Anthropic evaluation platform
- Manual evaluation with custom Python code and Weights & Biases
Considering that I had already used LangChain and LangGraph during the process, and that LangSmith already provides many features that make it easy to evaluate the system and build dashboards, I decided to use LangSmith.
To be announced
1) Create EC2 Instance
- AMI: Ubuntu 24.04 LTS
- Instance type: m7i-flex.large
- Storage: 20–30 GB
- Number of instances: 1
- Security group rules: Allow ports 22 (SSH from your IP), 80, and 443.
Note: By default, EC2 blocks all inbound traffic. The security group acts as a firewall for the EC2 instance and determines which sources and ports are allowed to access the machine.
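To make the note concrete, the inbound rules from step 1 can be modeled as a small allow-list that is checked for every incoming connection. This is only a conceptual sketch of how a security group decides, not AWS code: the `Rule` class, the `is_allowed` helper, and the address `203.0.113.10/32` (a documentation-range IP standing in for "your IP") are assumptions.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    port: int
    source: str  # CIDR block allowed to reach this port


# Inbound rules mirroring step 1: SSH only from one address, HTTP/HTTPS from anywhere.
INBOUND_RULES = [
    Rule(22, "203.0.113.10/32"),  # placeholder for "your IP"
    Rule(80, "0.0.0.0/0"),
    Rule(443, "0.0.0.0/0"),
]


def is_allowed(port: int, source_ip: str, rules=INBOUND_RULES) -> bool:
    """Allow traffic only if some rule matches; everything else is denied by default."""
    addr = ip_address(source_ip)
    return any(r.port == port and addr in ip_network(r.source) for r in rules)
```

With these rules, HTTPS from any address is accepted, SSH is accepted only from the single listed address, and a port such as 8001 is not reachable from outside at all.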
2) Connect and Prepare the Machine
Run the command below in your computer’s terminal to connect to the EC2 instance.
ssh -i your-key.pem ubuntu@your-public-ip
Next, run the commands below to install Git, Docker, and Docker Compose on the instance, start Docker, and add the ubuntu user to the docker group so you can run Docker without sudo (the group change takes effect after you log out and back in).
sudo apt update && sudo apt upgrade -y
sudo apt install -y docker.io git
sudo systemctl enable --now docker
sudo usermod -aG docker ubuntu
sudo apt install -y docker-compose-plugin
3) Deploy Code
Clone the project repository from GitHub.
git clone https://github.com/ozyurtf/agentic-data-assistant.git
cd agentic-data-assistant
Create a files folder inside the api folder.
mkdir -p api/files
Copy the variables below.
# Cesium
VUE_APP_CESIUM_TOKEN=<your_cesium_ion_token> # Get from https://ion.cesium.com/signin
VUE_APP_CESIUM_RESOURCE_ID=3
# Google Maps Platform
VUE_APP_GOOGLE_MAPS_KEY=<your_google_maps_key>
# MapTiler
VUE_APP_MAPTILER_KEY=<your_maptiler_key>
# LLM provider and API keys
LLM_PROVIDER=anthropic
OPENAI_API_KEY=<your_openai_api_key>
ANTHROPIC_API_KEY=<your_anthropic_api_key>
# Firecrawl
FIRECRAWL_API_KEY=<your_firecrawl_api_key> # Get from https://www.firecrawl.dev
# Chatbot
CHAINLIT_AUTH_SECRET=<your_chainlit_secret> # Get from https://docs.chainlit.io/authentication/overview
# Set the maximum file size allowed for uploading
MAX_FILE_SIZE_MB=100
# Set how long cached data should stay in Redis (in seconds)
CACHE_TTL_SECONDS=3600
# Set the number of data types that can be extracted from the file in a single request.
MAX_MESSAGE_TYPES=3
# App settings
USER_AGENT=drone-chatbot
# Ports and hosts
REDIS_PORT=6379
CHATBOT_PORT=8000
API_PORT=8001
# Redis password
REDIS_PASSWORD=<enter_a_password_for_redis>
# Auth
JWT_SECRET=<a_long_random_string> # Generate with: python3 -c "import secrets; print(secrets.token_urlsafe(48))"
JWT_TTL_SECONDS=604800 # JWT validity window, in seconds (default: 7 days)
AUTH_COOKIE_SECURE=true # Set to true in production (requires HTTPS)
AUTH_COOKIE_SAMESITE=lax # lax (default) for same-origin dev; none + secure=true for cross-site iframes
Create an empty .env file, set the values of the copied variables inside it, and save it.
touch .env
nano .env
4) Register a Domain Name
Buy a domain (let's call it agenticdas.com) from a registrar (e.g., Namecheap, Cloudflare, or GoDaddy). A .com domain typically costs about $10/year.
5) Point the Domain at the EC2 Instance
In the DNS panel, create two A records that map the domain to the public IP address of the EC2 instance:
- agenticdas.com: <your-ec2-public-ip>
- www.agenticdas.com: <your-ec2-public-ip>
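Both records should resolve to the same address. A scripted check of that might look like the sketch below. `records_point_to` is a hypothetical helper, and the resolver is passed in as a parameter (defaulting to `socket.gethostbyname`) so the logic can be tried without touching real DNS.

```python
import socket


def records_point_to(expected_ip, hostnames, resolve=socket.gethostbyname):
    """Map each hostname to True if it resolves to expected_ip, else False."""
    status = {}
    for name in hostnames:
        try:
            status[name] = resolve(name) == expected_ip
        except OSError:
            # socket.gaierror (a subclass of OSError) is raised when the lookup fails.
            status[name] = False
    return status
```

Calling `records_point_to("<your-ec2-public-ip>", ["agenticdas.com", "www.agenticdas.com"])` with the real address returns a dictionary with one entry per hostname. DNS changes can take a while to propagate, so a `False` right after creating the records is not necessarily an error.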
6) Verify the Mapping Locally
Run the command below in your terminal to verify that the domain points to the EC2 instance created in step 1. The output should be the instance's public IP address.
dig +short agenticdas.com
7) Obtain a Certificate from a Certificate Authority
Install Certbot on the EC2 instance and request a certificate. Standalone mode needs port 80 to be free, so run this before launching the services.
sudo apt install -y certbot
sudo certbot certonly --standalone -d agenticdas.com -d www.agenticdas.com
After this, the certificate (fullchain.pem) and the private key (privkey.pem) are saved on the instance's EBS volume under /etc/letsencrypt/live/agenticdas.com/.
Note: The nginx.conf.template file references these files:
ssl_certificate /etc/letsencrypt/live/agenticdas.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/agenticdas.com/privkey.pem;
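Because nginx cannot load its TLS configuration without these two files, it can help to confirm they exist before launching the services. A minimal sketch, assuming the Let's Encrypt directory layout shown above; `missing_cert_files` is a hypothetical helper, and the `live_dir` parameter exists only so the check can be pointed at another directory.

```python
from pathlib import Path


def missing_cert_files(domain, live_dir="/etc/letsencrypt/live"):
    """Return the expected certificate files that are absent for the domain."""
    expected = [
        Path(live_dir) / domain / name
        for name in ("fullchain.pem", "privkey.pem")
    ]
    return [str(path) for path in expected if not path.is_file()]
```

An empty list from `missing_cert_files("agenticdas.com")` means both files are in place. Depending on the directory permissions, the check may need to be run with sudo.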
8) Enable Secure Cookies
Make sure that AUTH_COOKIE_SECURE is defined as true in the .env file.
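A quick scripted check of the .env file can catch a missing or mistyped value before the services are launched. The sketch below is an illustration under assumptions: the minimal parser handles only the `KEY=value # comment` form used in the variable list, `parse_env` and `check_env` are hypothetical helpers that are not part of the project, and the default set of required keys is my own selection.

```python
def parse_env(text):
    """Parse KEY=value lines, skipping blanks and comments and dropping a trailing ' # ...'."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def check_env(values, required=("JWT_SECRET", "REDIS_PASSWORD", "CHAINLIT_AUTH_SECRET")):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if values.get("AUTH_COOKIE_SECURE", "").lower() != "true":
        problems.append("AUTH_COOKIE_SECURE must be true when serving over HTTPS")
    for key in required:
        value = values.get(key, "")
        # Placeholders in the variable list look like <your_value>.
        if not value or (value.startswith("<") and value.endswith(">")):
            problems.append(f"{key} is missing or still a placeholder")
    return problems
```

On the instance, the check could be run as `check_env(parse_env(open(".env").read()))` from the project root.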
9) Launch Services in EC2
docker compose up -d --build
10) Access
- UI at https://www.agenticdas.com/
- Sign up: admin/password
- Log in: admin/password
- UAV Log Viewer: https://github.com/ArduPilot/UAVLogViewer







