Add Markdown (.md) Report Format Support #1447
chinmaychandra1 wants to merge 2 commits into OWASP:master
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the review settings. Use the following commands to manage reviews:
Walkthrough: Adds a new Nettacker application entrypoint implementing startup, dependency/DB checks, target normalization/expansion, prerequisite scans, and multiprocess/multithread scan orchestration; extends report generation to support Markdown output and richer report engines; and adds a Nepali locale resource file.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Hi 👋 This PR introduces Markdown (.md) report format support in Nettacker. I have implemented the feature in graph.py and verified it by generating reports using a sample scan. The output is structured and readable. Looking forward to your feedback. I'd be happy to make any improvements if needed. Thanks!
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nettacker/core/app.py.save`:
- Around line 1-327: Remove the accidental backup file that duplicates the
application flow: delete the committed backup (the file containing the Nettacker
class and functions expand_targets, run, start_scan, scan_target,
scan_target_group) from the PR so only the real source remains; ensure no
behavior changes are made to the canonical app.py and confirm the backup file is
not re-added (optionally add the pattern to .gitignore if needed).
- Around line 117-128: The target normalization currently mangles credentialed
hosts and IPv6 by using naive splits and also overwrites a single shared
self.arguments.url_base_path for all targets; replace the string-splitting in
the normalization block with robust URL parsing (use
urllib.parse.urlsplit/urlparse) to extract hostname, port, username/password,
and path properly (preserve IPv6 literals and credentials) and set a per-target
base_path variable (do not assign to the shared self.arguments.url_base_path) so
each invocation of scan_target() uses its own parsed base_path/host; update the
logic in the code that currently does
target.split("://")[1].split("/")[0].split(":")[0] to derive host/from the
urlsplit result and derive base_path from the urlsplit.path instead of
truncating with split.
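The urlsplit-based parsing suggested above can be sketched as follows. `normalize_target` is an illustrative helper name, not Nettacker's actual API:

```python
from urllib.parse import urlsplit

def normalize_target(target):
    """Parse a target URL into its components without naive string splits.

    urlsplit preserves IPv6 literals (e.g. http://[::1]:8080/) and
    user:password@host credentials, which ad-hoc splitting mangles.
    """
    # Prepend "//" so schemeless targets still parse into netloc/path.
    parts = urlsplit(target if "://" in target else "//" + target)
    return {
        "host": parts.hostname,          # IPv6 brackets stripped, lowercased
        "port": parts.port,              # None when absent
        "username": parts.username,
        "password": parts.password,
        "base_path": parts.path or "/",  # per-target value, not shared state
    }
```

Each call returns its own `base_path`, so nothing like a shared `self.arguments.url_base_path` gets overwritten between targets.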
In `@nettacker/core/graph.py`:
- Around line 304-316: The Markdown table rows can be broken by unescaped pipe
characters or newlines in fields from all_scan_logs and merge_logs_to_list;
before building each cell (the values from event.get('date'),
event.get('target'), event.get('module_name'), event.get('port') and the derived
log_text) sanitize/escape them (e.g., replace "|" with "\|" and normalize or
replace newlines with "<br>" or a single space) so each cell remains a single
table cell, then concatenate the escaped values into the md_content rows and
write via Path(report_path_filename).open as before.
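A minimal escaping helper along these lines (the name is illustrative) keeps each value inside a single table cell:

```python
def escape_md_cell(value):
    """Make an arbitrary scan-log value safe inside one Markdown table cell."""
    text = "" if value is None else str(value)
    # A raw "|" would split the cell; a newline would break the row.
    return text.replace("|", "\\|").replace("\r\n", "<br>").replace("\n", "<br>")
```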
In `@nettacker/locale/ne.yml`:
- Line 98: Update the Nepali help text for the "save_logs" key in
nettacker/locale/ne.yml to advertise the new Markdown output by adding ".md"
(and optionally mentioning "-o output.md") to the list of output formats so
Nepali users can discover the Markdown option; locate the "save_logs" entry and
append ".md" (e.g., "results.txt, results.csv, results.html, results.json,
results.md" or include a note like "including .md via -o output.md").
- Around line 84-87: The outgoing_proxy translation string contains an invalid
proxy auth example with a space after the username colon ("socks://username:
password@127.0.0.1"); update the outgoing_proxy value to remove the space so the
example reads "socks://username:password@127.0.0.1" (and verify other auth
examples like "socks4://username:password@127.0.0.1" and
"socks5://username:password@127.0.0.1" remain correctly formatted) so users can
copy/paste a valid URL.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a1eed80a-899f-4728-976d-8fe4e4f1f1e1
📒 Files selected for processing (3)
- nettacker/core/app.py.save
- nettacker/core/graph.py
- nettacker/locale/ne.yml
Actionable comments posted: 4
🧹 Nitpick comments (2)
nettacker/core/graph.py.save (2)
88-89: Variable `log` shadows the module-level logger. The local variable `log` on line 89 shadows the module-level `log = logger.get_logger()` from line 24. While this doesn't cause immediate issues since the logger isn't used within this function, it's a potential source of confusion and bugs if logging is added later.

♻️ Suggested rename:

```diff
 for event in events:
-    log = merge_logs_to_list(event, [])
+    log_entries = merge_logs_to_list(event, [])
     _table.add_rows(
         [
             table_headers,
             [
                 event["date"],
                 event["target"],
                 event["module_name"],
                 str(event["port"]),
-                "\n".join(log) if log else "Detected",
+                "\n".join(log_entries) if log_entries else "Detected",
             ],
         ]
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nettacker/core/graph.py.save` around lines 88 - 89, The local variable name "log" in the loop (for event in events: log = merge_logs_to_list(event, [])) shadows the module-level logger variable "log = logger.get_logger()"; rename the local variable (for example to "merged_log" or "event_logs") wherever merge_logs_to_list is assigned/used in that loop to avoid confusion and possible future bugs while leaving the module-level logger unchanged.
391-397: Minor: Length check inconsistency. Same minor inconsistency as in `create_report`: length check `>= 5` but slice `[-4:]`. However, the CSV header handling here is correct: using `csvfile.tell() == 0` to conditionally write headers is the proper approach for append mode.

♻️ Optional fix for consistency:

```diff
- elif len(fullpath) >= 5 and fullpath[-4:] == ".csv":
+ elif len(fullpath) >= 4 and fullpath[-4:] == ".csv":
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nettacker/core/graph.py.save` around lines 391 - 397, The length check is inconsistent in the CSV branch: replace the len(fullpath) >= 5 and slice check with a clearer endswith check (use fullpath.endswith(".csv")) to locate the CSV case around the block that opens csvfile, creates csv.DictWriter and conditionally writes the header using csvfile.tell() == 0; keep the existing header logic but swap the length/slice test for endswith to be correct and unambiguous.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nettacker/core/graph.py.save`:
- Around line 294-301: The CSV branch opens report_path_filename in append mode
("a") and calls csv.DictWriter.writeheader() unconditionally, which causes
duplicate headers when the file already exists; change the file open mode to
write ("w") to match other formats or, if append is required, check if the file
is empty/existing before calling writer.writeheader(); update the block using
report_path_filename, csvfile, csv.DictWriter (writer) and writeheader()
accordingly so headers are only written when the file is newly created/empty.
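The append-mode guard can be sketched like this, using `csvfile.tell() == 0` as the emptiness check (the helper name is hypothetical):

```python
import csv

def append_csv_row(path, row):
    """Append one dict as a CSV row, writing the header only for a new/empty file."""
    with open(path, "a", newline="") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=list(row))
        if csvfile.tell() == 0:  # position 0 in append mode means the file is empty
            writer.writeheader()
        writer.writerow(row)
```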
- Around line 304-306: The report writer currently falls through to plain text
for all unknown extensions; add explicit Markdown handling by inserting an elif
branch that checks Path(report_path_filename).suffix.lower() == ".md" and opens
the file to write build_markdown_table(all_scan_logs) (mirroring the existing
.csv/.txt logic around report_path_filename and all_scan_logs). Also implement a
new helper function build_markdown_table(events) that returns a proper
GitHub-flavored Markdown table string (headers, separator row, and rows for each
event) so Markdown reports are formatted correctly instead of using
build_text_table.
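A sketch of such a `build_markdown_table` helper, with the column set assumed from the event fields mentioned above:

```python
def build_markdown_table(events):
    """Render scan events as a GitHub-flavored Markdown table string."""
    headers = ["date", "target", "module_name", "port"]
    lines = [
        "| " + " | ".join(headers) + " |",   # header row
        "|" + " --- |" * len(headers),        # separator row
    ]
    for event in events:
        # Escape pipes so a value cannot split its own cell.
        cells = [str(event.get(h, "")).replace("|", "\\|") for h in headers]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"
```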
- Around line 280-283: The length check for detecting the ".dd.json" extension
is inconsistent: change the condition that uses report_path_filename from
"len(report_path_filename) >= 5" to "len(report_path_filename) >= 8" so it
matches the 8-character slice report_path_filename[-8:] used to compare
".dd.json"; update the conditional around the block that calls
create_dd_specific_json(all_scan_logs) and writes to report_file to use the
corrected length check to avoid false matches on short filenames.
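Extension checks like these are simpler and safer with `str.endswith`, which cannot mis-match on short filenames. The dispatcher below is a sketch, not Nettacker's actual routing code:

```python
def report_format(filename):
    """Choose a report writer by extension; .dd.json must win over .json."""
    name = filename.lower()
    if name.endswith(".dd.json"):
        return "dd_json"
    for ext, fmt in ((".json", "json"), (".csv", "csv"), (".md", "markdown"),
                     (".html", "html"), (".htm", "html")):
        if name.endswith(ext):
            return fmt
    return "text"
```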
- Around line 133-134: The code calls .strip() on log.get("event") and
log.get("json_event") but get_logs_by_scan_id returns parsed JSON (dict/list),
causing AttributeError; update the handling in the block that sets impact and
severity_justification so that if the retrieved value is a str it is stripped,
otherwise convert non-string values to a string (e.g., via json.dumps or str)
before stripping; reference the variables impact and severity_justification and
the source function get_logs_by_scan_id to locate and fix the logic that
currently blindly calls .strip() on the values.
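One way to make the value handling type-safe (the helper name is illustrative):

```python
import json

def as_stripped_text(value):
    """Normalize a log field that may be a str or already-parsed JSON."""
    if isinstance(value, str):
        return value.strip()
    if value is None:
        return ""
    return json.dumps(value)  # dict/list values from get_logs_by_scan_id
```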
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6bfb889c-ba6f-4b48-a91b-66f8295fda1c
📒 Files selected for processing (2)
- nettacker/core/app.py.save
- nettacker/core/graph.py.save
🚧 Files skipped from review as they are similar to previous changes (1)
- nettacker/core/app.py.save
Actionable comments posted: 5
🧹 Nitpick comments (1)
nettacker/core/app.py.save (1)
93-99: Consider logging the actual exception for debugging. The bare `except Exception` blocks hide the root cause, making production debugging difficult. Consider logging the exception before calling `die_failure`.

💡 Suggested improvement:

```diff
 elif Config.db.engine == "mysql":
     try:
         mysql_create_database()
         mysql_create_tables()
-    except Exception:
+    except Exception as e:
+        log.error(f"MySQL initialization failed: {e}")
         die_failure(_("database_connection_failed"))
 elif Config.db.engine == "postgres":
     try:
         postgres_create_database()
-    except Exception:
+    except Exception as e:
+        log.error(f"PostgreSQL initialization failed: {e}")
         die_failure(_("database_connection_failed"))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nettacker/core/app.py.save` around lines 93 - 99, The except blocks around postgres_create_database() (and the earlier database creation branch conditioned on Config.db.engine) are swallowing errors; modify both except Exception handlers to capture the exception (e.g., except Exception as e) and log the full error/traceback via the project logger (or logger.exception) before calling die_failure(_("database_connection_failed")), so you preserve the original exception context for debugging while keeping the existing failure behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nettacker/core/app.py.save`:
- Around line 325-329: The worker thread exceptions are being swallowed; wrap
thread entry points (where scan_target is passed to threading.Thread) with a
thread_wrapper that catches Exception and puts (threading.current_thread().name,
exc) into a shared exception_queue (queue.Queue); keep using
wait_for_threads_to_finish but after it returns drain exception_queue, log each
failure (e.g., processLogger.error or log.error) including thread name and
exception, and cause the calling flow to return False or re-raise a composed
exception so scan failures are not treated as successes; update places creating
threads to use thread_wrapper and add the exception_queue handling after
wait_for_threads_to_finish.
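A sketch of the wrapper-plus-queue pattern; `run_and_collect` stands in for the surrounding orchestration and `wait_for_threads_to_finish`:

```python
import queue
import threading

exception_queue = queue.Queue()

def thread_wrapper(func, *args):
    """Run the real worker, capturing any exception for the parent to inspect."""
    try:
        func(*args)
    except Exception as exc:
        exception_queue.put((threading.current_thread().name, exc))

def run_and_collect(workers):
    threads = [threading.Thread(target=thread_wrapper, args=(w,)) for w in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # stand-in for wait_for_threads_to_finish(threads)
    failures = []
    while not exception_queue.empty():
        failures.append(exception_queue.get())
    return failures  # non-empty: the scan must not be reported as a success
```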
- Around line 250-252: The for-loop is shadowing the outer target_groups list by
using the same name for the loop variable; change the loop to iterate with a
distinct name (e.g., for t_id, group in enumerate(target_groups)) and pass that
per-group variable to multiprocess.Process (args=(group, scan_id, t_id)) so
scan_target_group receives the single group rather than the whole list; update
any references in that loop from target_groups to the new loop variable.
- Around line 54-56: The code opens Config.path.logo_file directly in
log.write_to_api_console causing the file handle to remain open; change this to
use a with-statement context manager around opening the file (e.g., with
open(Config.path.logo_file) as f:) and pass f.read() to log.write_to_api_console
so the file is properly closed; update the call site where
log.write_to_api_console is invoked to read from the closed-managed variable
instead of opening inline.
- Around line 268-270: scan_target currently calls set_socks_proxy which mutates
global socket state (socks.set_default_proxy) causing races when scan_target
runs in parallel; move the proxy setup out of scan_target into scan_target_group
(call set_socks_proxy once per process/group before spawning threads) and remove
the socket.socket/socket.getaddrinfo assignment from scan_target; if per-target
proxies are required instead, implement per-thread proxy state via thread-local
storage or guard set_socks_proxy with a global lock so only one thread mutates
global socket config at a time (refer to set_socks_proxy, scan_target,
scan_target_group, and the use of socks.set_default_proxy).
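If the proxy only needs configuring once per process, the minimal race fix is to serialize the global mutation; `configure` below stands in for the code that calls `socks.set_default_proxy` and patches `socket.socket` / `socket.getaddrinfo`:

```python
import threading

_proxy_lock = threading.Lock()
_proxy_configured = False

def set_socks_proxy_once(configure):
    """Apply the proxy configuration exactly once per process, thread-safely."""
    global _proxy_configured
    with _proxy_lock:
        if not _proxy_configured:
            configure()  # mutates process-global socket state under the lock
            _proxy_configured = True
```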
- Around line 117-131: The block handling URL parsing (inside the if "://" in
target:) has inconsistent indentation causing an IndentationError; normalize the
indentation to use a consistent number of spaces (e.g., 4 spaces) for all nested
statements (the urlparse call, the if parsed.hostname branch, its else branch,
the target = target.strip().lower() line, and targets.append(target)) so the if
"://" block is uniformly indented and all nested if/else blocks
(parsed.hostname) align correctly; locate the URL parsing block that references
urlparse, parsed.hostname, base_path, target, and targets and re-indent those
lines to a single consistent level.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 897336c1-ddd8-4aba-bd3a-f5d16ff162c4
📒 Files selected for processing (1)
nettacker/core/app.py.save
Actionable comments posted: 2
🧹 Nitpick comments (1)
nettacker/core/app.py.save (1)
194-198: In-place mutation with return is unusual but correct. The function modifies `targets` in place and also returns it. Consider returning a new filtered list for clearer semantics, but this is a minor style preference.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nettacker/core/app.py.save` around lines 194 - 198, The method filter_target_by_event currently mutates the passed-in targets list (using copy.deepcopy then removing items) while also returning it; change it to avoid in-place mutation by building and returning a new filtered list instead: iterate over the original targets and include only those where find_events(target, module_name, scan_id) is truthy (e.g., via a list comprehension) and return that new list, leaving the original targets unmodified; keep the method name and signature intact and continue to call find_events for the filtering condition.
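The non-mutating version is a one-liner; `find_events` is passed in as a parameter here only to keep the sketch self-contained:

```python
def filter_target_by_event(targets, module_name, scan_id, find_events):
    """Return a new list of targets that have events, leaving the input intact."""
    return [t for t in targets if find_events(t, module_name, scan_id)]
```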
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nettacker/core/app.py.save`:
- Around line 268-283: In scan_target the Module(...) instantiation block is
mis-indented (6 spaces) causing an IndentationError; update the indentation of
the module = Module(...), module.load(), module.generate_loops(),
module.sort_loops(), and module.start() lines to match the surrounding method
indentation (use the same 8-space level used elsewhere in scan_target) so the
block aligns consistently with the surrounding code flow.
- Around line 89-99: The current bare except blocks around
mysql_create_database/mysql_create_tables and postgres_create_database swallow
the actual exception; update both except clauses to capture the exception (e.g.,
except Exception as e) and pass or log the exception details to die_failure so
the error message includes the exception text (or include traceback) for
Config.db.engine == "mysql" and "postgres" paths, ensuring
mysql_create_database, mysql_create_tables, and postgres_create_database
failures surface diagnostic info instead of only
_("database_connection_failed").
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a5c653fe-3dd0-44ba-9986-41d54c490117
📒 Files selected for processing (1)
nettacker/core/app.py.save
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nettacker/core/app.py.save`:
- Around line 280-294: The except block swallows errors but the function always
returns os.EX_OK; introduce a local exit status variable (e.g., exit_code =
os.EX_OK) before the try around
module.load()/module.generate_loops()/module.sort_loops()/module.start(), catch
exceptions and set exit_code to a non-zero value (e.g., os.EX_SOFTWARE) while
logging the full exception, then return exit_code instead of os.EX_OK; also
ensure that whatever caller uses this function (e.g.,
wait_for_threads_to_finish) can observe thread failures by returning this status
(or switch to a thread-safe shared result/queue or ThreadPoolExecutor to
propagate failures if needed).
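The status-propagation idea looks roughly like this; the `module` object and the `os.EX_SOFTWARE` fallback value are illustrative:

```python
import os

def run_module(module, log_error=print):
    """Return an exit status instead of unconditionally reporting success."""
    exit_code = getattr(os, "EX_OK", 0)
    try:
        module.load()
        module.generate_loops()
        module.sort_loops()
        module.start()
    except Exception as exc:
        log_error(f"module failed: {exc!r}")
        exit_code = getattr(os, "EX_SOFTWARE", 70)  # non-zero signals failure
    return exit_code
```

A caller such as `wait_for_threads_to_finish` can then collect these codes (e.g. via a queue) instead of assuming every thread succeeded.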
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2e14ca14-1738-4247-b106-6d0882eb71bb
📒 Files selected for processing (1)
nettacker/core/app.py.save
Force-pushed from f16e07c to dca75db
Hey @chinmaychandra1, can you please clean up your PR? You've rebased instead of merging main, I guess, which is why you're seeing other commits in your PR. Please fix that so it's good to review.
Force-pushed from 0297a0c to 5df0584
Hey @pUrGe12, thanks for the suggestion! I've cleaned up the PR and removed unrelated commits. It now only contains the relevant changes for this feature. Please take another look.
Hi! All the suggested issues (critical, major, and minor) have been addressed. I've ensured proper fixes for logic, indentation, and variable handling. Minor improvements and broader changes will be handled separately to keep this PR focused. Kindly review when you get time. Looking forward to your feedback.
nettacker/core/app.py (Outdated)

```python
def scan_target(
    self,
    . self,
```
I don't think this `.` was intended.
Thanks for pointing this out!
This seems to have been from an earlier version — the duplicate "self" has already been removed in the latest commit.
Please let me know if anything else needs to be adjusted.
nettacker/core/app.py (Outdated)

```python
def scan_target_group(self, targets, scan_id, process_number):
    active_threads = []
    log.verbose_event_info(_("single_process_started").format(process_number))
    if self.arguments.socks_proxy:
```
This change is also unrelated to markdown support; consider separate PRs for other fixes.
Thanks for the suggestion!
Removed this change to keep the PR focused only on markdown support.
@chinmaychandra1 Added a few more things. Do look into those when you have time.
@chinmaychandra1 Also, verify your commits with a GPG key (you'll find documentation on how to do this online). We cannot accept unverified commits.
Force-pushed from 0cd0d85 to a94eef2
All requested changes have been addressed. Commits are now GPG signed and verified. Looking forward to your review.
Gentle ping 🙂 Just checking if there are any updates on this PR. Please let me know if anything else is needed from my side.
Resolved CI issues by restoring corrupted files from upstream. All pre-commit checks are passing and the branch is up to date.
@chinmaychandra1 I think you lost all your code here... I can see only gitignores. (why are there two of them though?) |
Force-pushed from 4af468f to 02a5bcb
Thanks for the feedback! I have reintroduced the markdown report feature with a clean and minimal implementation. The previous issue was caused by unintended file loss, which has now been fixed. Kindly review the updated changes.
Hi! Just a gentle follow-up on this PR. All requested changes have been addressed and checks are passing. Would appreciate a review whenever you get time. Thanks! |
Description
This PR adds support for Markdown (.md) report generation in OWASP Nettacker.
Changes Made
- Added `.md` output format in `graph.py`

Testing
python nettacker.py -i scanme.nmap.org -m all -o output.md
Benefits