Refactor analysis views and enhance task handling #3007
Open
doomedraven wants to merge 11 commits into
### Summary of Changes
1. **SQL JOIN Optimization**: Updated the `index` view to pass `include_hashes=True` in all `db.list_tasks` calls. Task and sample data are now fetched in a single SQL query per category, eliminating the 100+ separate database round-trips previously needed for sample information.
2. **Bulk MongoDB Fetching**: Implemented a bulk MongoDB query at the start of the `index` view. It fetches all report metadata (scores, detections, VirusTotal summaries, etc.) for every task on the page in **a single MongoDB query** instead of 100 individual queries.
3. **Redundant Logic Removal**:
* Removed the `get_tags_tasks` function and its associated database query.
* Updated `get_analysis_info` to use pre-fetched task and MongoDB data, completely avoiding redundant SQL and NoSQL calls within the task loops.
4. **Optimized Data Access**: `get_analysis_info` now checks if the sample object is already attached to the task before attempting a database lookup.
These changes reduce the number of database queries for a standard 100-task index page from **over 300** down to **fewer than 10**.
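As an illustration of the bulk fetch described in item 2, here is a minimal sketch, not the code from this PR. The helper name `bulk_fetch_reports`, the field names (`info.id`, `malscore`, `detections`), and the `FakeCollection` stand-in are assumptions made for the example. Only the `find(filter, projection)` call shape and the `$in` operator are standard MongoDB/PyMongo usage.

```python
from typing import Any, Dict, Iterable, List, Optional


class FakeCollection:
    """In-memory stand-in for a pymongo collection (illustration only)."""

    def __init__(self, docs: List[Dict[str, Any]]):
        self.docs = docs
        self.find_calls = 0  # counts simulated round-trips

    def find(self, query: Dict[str, Any], projection: Optional[Dict[str, int]] = None):
        # Only understands {"info.id": {"$in": [...]}}; the projection is ignored.
        self.find_calls += 1
        wanted = set(query["info.id"]["$in"])
        return [d for d in self.docs if d["info"]["id"] in wanted]


def bulk_fetch_reports(collection, task_ids: Iterable[int]) -> Dict[int, Dict[str, Any]]:
    """Fetch report metadata for all task IDs in one query, keyed by task ID."""
    ids = list(task_ids)
    if not ids:
        return {}  # nothing to look up, so skip the round-trip entirely
    cursor = collection.find(
        {"info.id": {"$in": ids}},  # one query for the whole page
        {"info.id": 1, "malscore": 1, "detections": 1, "_id": 0},  # assumed fields
    )
    return {doc["info"]["id"]: doc for doc in cursor}


collection = FakeCollection([
    {"info": {"id": 1}, "malscore": 7.5},
    {"info": {"id": 2}, "malscore": 1.0},
    {"info": {"id": 3}, "malscore": 9.0},
])

# Task 42 has no report yet; it is simply absent from the result map.
reports = bulk_fetch_reports(collection, [1, 3, 42])
```

The per-task loop can then call `reports.get(task_id)` instead of issuing a query, which is what turns N lookups into one.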
### Verification
- The code is now structured to use the "Pre-fetch -> Process" pattern instead of the "Loop -> Fetch" (N+1) pattern.
- Fallbacks are maintained in `get_analysis_info` for use cases where it might be called for a single task (e.g., in the detailed report view).
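The "use pre-fetched data, fall back to a lookup" behaviour can be sketched roughly as follows. This is a hypothetical simplification, not the actual `get_analysis_info`: the function name `analysis_info_sketch`, its parameters, the loader callables, and the returned fields are all assumptions chosen for illustration.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, Optional


def analysis_info_sketch(
    task: Any,
    prefetched_reports: Optional[Dict[int, Dict[str, Any]]] = None,
    load_sample: Optional[Callable[[int], Any]] = None,
    load_report: Optional[Callable[[int], Optional[Dict[str, Any]]]] = None,
) -> Dict[str, Any]:
    # 1. Prefer the sample already attached to the task (e.g. by a JOIN).
    sample = getattr(task, "sample", None)
    if sample is None and load_sample is not None and task.sample_id is not None:
        sample = load_sample(task.sample_id)  # fallback: one extra SQL lookup

    # 2. Prefer the pre-fetched report map; only query when none was supplied.
    if prefetched_reports is not None:
        report = prefetched_reports.get(task.id)
    elif load_report is not None:
        report = load_report(task.id)  # fallback: one extra MongoDB lookup
    else:
        report = None

    return {
        "id": task.id,
        "sha256": getattr(sample, "sha256", None),
        "malscore": (report or {}).get("malscore"),
    }


# Index-page style call: everything is already in memory, no loaders needed.
listed = SimpleNamespace(id=1, sample_id=10, sample=SimpleNamespace(sha256="aa11"))
from_index = analysis_info_sketch(listed, prefetched_reports={1: {"malscore": 7.5}})

# Single-task style call: nothing pre-fetched, so the loaders are used.
lookups = []


def fake_load_sample(sample_id):
    lookups.append(("sample", sample_id))
    return SimpleNamespace(sha256="bb22")


def fake_load_report(task_id):
    lookups.append(("report", task_id))
    return {"malscore": 3.0}


single = SimpleNamespace(id=2, sample_id=20, sample=None)
from_report = analysis_info_sketch(
    single, load_sample=fake_load_sample, load_report=fake_load_report
)
```

In this sketch, a task missing from a supplied `prefetched_reports` map is treated as having no report rather than triggering a new query, so the list view cannot silently slip back into the N+1 pattern.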
Code Review
This pull request optimizes the `index` view by batch-fetching MongoDB analysis data for multiple tasks and refactoring `get_analysis_info` to use pre-fetched data and existing task relationships. It also adds a safety check in `filtered_chunk` to handle missing call records. Feedback identifies a redundant `hasattr` check in the sample retrieval logic and a potential `IndexError` in the Elasticsearch query handling that should be addressed to prevent runtime crashes.
### Performance Optimization & Stability Summary
This update addresses critical performance bottlenecks and infrastructure hangs in the CAPEv2 web interface, specifically targeting uWSGI `HARAKIRI` timeouts and database N+1 query problems.
#### 1. Core Performance Improvements
* […] JOINs (`include_hashes=True`), fetching task and sample data in a single round-trip instead of 100+ individual queries.
* […] `GROUP BY` query.
* […] `tasks_list` and `tasks_search` endpoints.
* […] (`max_time_ms`) for MongoDB queries to prevent workers from hanging on slow lookups.
#### 2. Stability & Memory Safety
* […] `denormalize_files` MongoDB hook that caused uWSGI workers to loop indefinitely when loading certain reports.
* […] `process.log` reading to prevent memory-related crashes.
#### 3. Infrastructure Enhancements
* […] `DBTransactionMiddleware` interactions to ensure that slow MongoDB queries do not leave "zombie" SQL locks that hang the entire web UI.
* […] `sys.stderr`.
#### Impact
* […] `signal 9` kills and 502 Bad Gateway errors in large-scale installations.
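To illustrate the `max_time_ms` query timeout mentioned above, the sketch below shows one way a caller might bound a MongoDB lookup and degrade gracefully. It is an assumption-laden example, not the code from this PR: `find_with_deadline`, the 5000 ms default, and `SlowFakeCollection` are hypothetical. PyMongo's `find()` does accept a `max_time_ms` keyword and raises `pymongo.errors.ExecutionTimeout` when the server-side limit is exceeded.

```python
import logging
from typing import Any, Dict, List

try:
    from pymongo.errors import ExecutionTimeout
except ImportError:
    class ExecutionTimeout(Exception):
        """Stand-in so the sketch runs where pymongo is not installed."""

log = logging.getLogger(__name__)


def find_with_deadline(collection, query: Dict[str, Any], max_time_ms: int = 5000) -> List[Dict[str, Any]]:
    """Run a find() with a server-side time limit; return [] if it is exceeded."""
    try:
        # Real cursors are lazy, so the timeout surfaces while iterating;
        # list() is therefore kept inside the try block.
        return list(collection.find(query, max_time_ms=max_time_ms))
    except ExecutionTimeout:
        log.warning("MongoDB query exceeded %d ms; returning no results", max_time_ms)
        return []


class SlowFakeCollection:
    """Fake collection that 'takes' simulated_ms to answer (illustration only)."""

    def __init__(self, docs: List[Dict[str, Any]], simulated_ms: int):
        self.docs = docs
        self.simulated_ms = simulated_ms

    def find(self, query: Dict[str, Any], max_time_ms: int = None):
        if max_time_ms is not None and self.simulated_ms > max_time_ms:
            raise ExecutionTimeout("operation exceeded time limit")
        return iter(self.docs)


fast = find_with_deadline(SlowFakeCollection([{"a": 1}], simulated_ms=10), {}, max_time_ms=100)
slow = find_with_deadline(SlowFakeCollection([{"a": 1}], simulated_ms=900), {}, max_time_ms=100)
```

Returning an empty result on timeout is only one possible policy. A view could instead show a partial page or surface an error, as long as the worker is released before the uWSGI harakiri limit is reached.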