Refactor analysis views and enhance task handling by doomedraven · Pull Request #3007 · kevoreilly/CAPEv2

doomedraven · 2026-05-08T09:11:32Z

Performance Optimization & Stability Summary

This update addresses critical performance bottlenecks and infrastructure hangs in the CAPEv2 web interface, specifically targeting uWSGI HARAKIRI timeouts and database N+1 query problems.

1. Core Performance Improvements

Resolved N+1 Query Patterns:
- Analysis Index: Updated all task listing logic to use SQL JOINs (include_hashes=True), fetching task and sample data in a single round-trip instead of 100+ individual queries.
- Dashboard: Replaced multiple individual task counts with a single efficient SQL GROUP BY query.
- APIv2: Applied similar pre-fetching optimizations to tasks_list and tasks_search endpoints.
Bulk MongoDB Fetching: Implemented a bulk metadata fetch on the main analysis page, consolidating up to 100 individual MongoDB lookups into a single query.
High-Speed Report Loading:
- Optimized the individual report view to use an extremely narrow MongoDB projection for the initial load, deferring heavy data (Sigma, Sysmon, Suricata) to AJAX-loaded tabs.
- Implemented a 10-second hard timeout (max_time_ms) for MongoDB queries to prevent workers from hanging on slow lookups.
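The N+1 fix and the dashboard GROUP BY described above can be sketched in a few lines. This is a minimal illustration using the standard-library `sqlite3` module; the two-table schema, column names, and helper functions are assumptions made up for the example and are not CAPEv2's actual models or its `db.list_tasks` implementation.

```python
import sqlite3


def make_db():
    """Build an in-memory database with a hypothetical tasks/samples schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE samples (id INTEGER PRIMARY KEY, sha256 TEXT);
        CREATE TABLE tasks (
            id INTEGER PRIMARY KEY,
            status TEXT,
            sample_id INTEGER REFERENCES samples(id)
        );
        """
    )
    conn.executemany("INSERT INTO samples VALUES (?, ?)", [(1, "aa"), (2, "bb")])
    conn.executemany(
        "INSERT INTO tasks VALUES (?, ?, ?)",
        [(1, "reported", 1), (2, "reported", 2), (3, "pending", 1)],
    )
    return conn


def list_tasks_n_plus_one(conn):
    """Anti-pattern: one query for the tasks, then one more query per task."""
    rows = conn.execute("SELECT id, sample_id FROM tasks ORDER BY id").fetchall()
    out = []
    for task_id, sample_id in rows:
        sha = conn.execute(
            "SELECT sha256 FROM samples WHERE id = ?", (sample_id,)
        ).fetchone()[0]
        out.append((task_id, sha))
    return out


def list_tasks_joined(conn):
    """Single round-trip: task and sample columns come back together."""
    return conn.execute(
        """
        SELECT t.id, s.sha256
        FROM tasks AS t
        JOIN samples AS s ON s.id = t.sample_id
        ORDER BY t.id
        """
    ).fetchall()


def count_by_status(conn):
    """Dashboard-style counts in one GROUP BY instead of one COUNT per status."""
    return dict(conn.execute("SELECT status, COUNT(*) FROM tasks GROUP BY status"))
```

Both listing functions return the same rows; the joined version issues one statement regardless of how many tasks are on the page, while the first issues one statement per task plus one.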

2. Stability & Memory Safety

Fixed Critical Infinite Loop: Resolved a generator bug in the denormalize_files MongoDB hook that caused uWSGI workers to loop indefinitely when loading certain reports.
Resource Capping:
- Noisy Signatures: Capped signature processing at 1,000 matches per signature to prevent CPU exhaustion.
- Massive Logs: Implemented a 1MB safety cap on process.log reading to prevent memory-related crashes.
- Network & BinGraphs: Capped DNS/domain lookups at 1,000 entries and binary graphs at 10 files (max 512KB each).
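The caps listed above follow a common pattern: bound the amount of data read or iterated before any processing starts. A minimal sketch, assuming hypothetical helper names (`read_capped`, `cap_matches`) that are not taken from the PR:

```python
import itertools

MAX_LOG_BYTES = 1024 * 1024  # 1MB cap, matching the figure quoted above
MAX_MATCHES = 1000           # per-signature cap, matching the figure quoted above


def read_capped(path, limit=MAX_LOG_BYTES):
    """Read at most `limit` bytes so a huge log cannot exhaust memory."""
    with open(path, "rb") as fh:
        return fh.read(limit)


def cap_matches(matches, limit=MAX_MATCHES):
    """Take at most `limit` items from any iterable, including generators."""
    return list(itertools.islice(matches, limit))
```

Because `itertools.islice` stops pulling from its source once the limit is reached, `cap_matches` also terminates on an unbounded generator.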
Hardened Database Layer:
- Implemented Lazy Initialization for MongoDB connections to ensure fork-safety within uWSGI.
- Added strict connection (5s) and socket (30s) timeouts to the MongoDB client.
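Lazy initialization for fork-safety can be sketched as a small wrapper that builds the client on first use and rebuilds it when the process ID changes. The `LazyClient` class below is a hypothetical illustration, not the code in this PR, and it assumes each worker calls `get()` from a single thread.

```python
import os


class LazyClient:
    """Create the underlying client on first use, once per process."""

    def __init__(self, factory):
        self._factory = factory  # callable that builds the real client
        self._client = None
        self._pid = None

    def get(self):
        pid = os.getpid()
        # Build on first call, and rebuild if we are now in a forked child,
        # so a connection opened in the parent is never reused after fork.
        if self._client is None or self._pid != pid:
            self._client = self._factory()
            self._pid = pid
        return self._client


# Possible wiring with PyMongo (assumed, not taken from the PR):
#   from pymongo import MongoClient
#   mongo = LazyClient(lambda: MongoClient(
#       "mongodb://localhost:27017",
#       connectTimeoutMS=5000,   # 5s connection timeout
#       socketTimeoutMS=30000,   # 30s socket timeout
#   ))
```

Nothing connects at import time, so the uWSGI master can load the application and fork workers without sharing an open MongoDB connection between them.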

3. Infrastructure Enhancements

SQL Lock Prevention: Removed/Optimized DBTransactionMiddleware interactions to ensure that slow MongoDB queries do not leave "zombie" SQL locks that hang the entire web UI.
Improved Logging: Hardened the internal logging mechanisms to ensure critical errors are reliably captured in uWSGI logs via sys.stderr.
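One way to make sure errors reach the uWSGI log is to attach an explicit handler that writes to `sys.stderr`. The helper below is a sketch under that assumption; the function name and log format are hypothetical and the PR's actual logging changes may differ.

```python
import logging
import sys


def make_stderr_logger(name, stream=None):
    """Return a logger that writes ERROR and above to stderr (or `stream`)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:
        target = stream if stream is not None else sys.stderr
        handler = logging.StreamHandler(target)
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(name)s] %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```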

Impact

Database Round-trips: Reduced by ~90% on the main index and report pages.
Page Load Speed: Significantly faster initial "Time to First Byte" (TTFB) for reports.
System Reliability: Eliminates the primary causes of signal 9 kills and 502 Bad Gateway errors in large-scale installations.

### Summary of Changes

1. **SQL JOIN Optimization**: Updated the `index` view to use `include_hashes=True` in all `db.list_tasks` calls. This ensures that task and sample data are fetched in a single SQL query per category, eliminating up to 100+ separate database round-trips for sample information.
2. **Bulk MongoDB Fetching**: Implemented a bulk MongoDB query at the start of the `index` view. It now fetches all report metadata (scores, detections, VirusTotal summaries, etc.) for all tasks on the page in **one single MongoDB query**, instead of 100 individual queries.
3. **Redundant Logic Removal**:
   * Removed the `get_tags_tasks` function and its associated database query.
   * Updated `get_analysis_info` to use pre-fetched task and MongoDB data, completely avoiding redundant SQL and NoSQL calls within the task loops.
4. **Optimized Data Access**: `get_analysis_info` now checks if the sample object is already attached to the task before attempting a database lookup.

These changes reduce the number of database queries for a standard 100-task index page from **over 300** down to **fewer than 10**.

### Verification

- The code is now structured to use the "Pre-fetch -> Process" pattern instead of the "Loop -> Fetch" (N+1) pattern.
- Fallbacks are maintained in `get_analysis_info` for use cases where it might be called for a single task (e.g., in the detailed report view).
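The "Pre-fetch -> Process" pattern with a single-task fallback can be sketched as follows. `FakeCollection` is an in-memory stand-in so the example runs without MongoDB, and `bulk_fetch`, `get_info`, and the `info.id` field name are assumptions for illustration rather than the actual `get_analysis_info` code.

```python
class FakeCollection:
    """In-memory stand-in for a MongoDB collection that counts queries."""

    def __init__(self, docs):
        self.docs = docs
        self.queries = 0

    def find(self, query):
        self.queries += 1
        wanted = set(query["info.id"]["$in"])
        return [d for d in self.docs if d["info"]["id"] in wanted]

    def find_one(self, query):
        self.queries += 1
        target = query["info.id"]
        return next((d for d in self.docs if d["info"]["id"] == target), None)


def bulk_fetch(find, task_ids):
    """Pre-fetch: one $in query for all ids, indexed by task id."""
    query = {"info.id": {"$in": list(task_ids)}}
    return {doc["info"]["id"]: doc for doc in find(query)}


def get_info(task_id, prefetched=None, find_one=None):
    """Process: use pre-fetched data when present, else do a single lookup."""
    if prefetched is not None and task_id in prefetched:
        return prefetched[task_id]
    if find_one is not None:
        return find_one({"info.id": task_id})
    return None
```

With 100 tasks on a page, `bulk_fetch` runs once and every `get_info` call inside the loop is a dictionary lookup; the `find_one` path only runs when no pre-fetched data is supplied, as in a single-report view.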

gemini-code-assist

Code Review

This pull request optimizes the index view by batch-fetching MongoDB analysis data for multiple tasks and refactoring get_analysis_info to utilize pre-fetched data and existing task relationships. It also adds a safety check in filtered_chunk to handle missing call records. Feedback identifies a redundant hasattr check in the sample retrieval logic and a potential IndexError in the Elasticsearch query handling that should be addressed to prevent runtime crashes.
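The review summary does not show the offending line. One common source of `IndexError` in this kind of code is taking the first element of an empty search-hit list, and a guard for that case might look like the sketch below. The function name is hypothetical; the `hits.hits` and `_source` keys follow the standard Elasticsearch search response shape.

```python
def first_hit_source(response):
    """Return the first hit's _source, or None when the search matched nothing."""
    hits = response.get("hits", {}).get("hits", [])
    if not hits:
        return None  # avoids hits[0] raising IndexError on an empty result
    return hits[0].get("_source")
```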

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

gemini-code-assist Bot reviewed May 8, 2026

Comment thread web/analysis/views.py

Comment thread web/analysis/views.py

doomedraven and others added 10 commits May 8, 2026 18:15

Update web/analysis/views.py

bbc178f

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

fixes

aebc0c8

fix

cce744b

fixes

904b32b

fixes

349e7a8

fixes

39e81cf

fixes

952e394

fixes

5b17ade

fixes

62509e0

fixes

1bec994

doomedraven wants to merge 11 commits into master from doomedraven-patch-3
