feat(tools): Slack monitor on SR votes by warku123 · Pull Request #105 · tronprotocol/tron-docker

warku123 · 2025-12-30T07:17:55Z

What does this PR do?
Add an SR votes monitor and push the results to Slack channel

Why are these changes required?
To get SR votes information ontime

This PR has been tested by:

Manual Testing

Follow up

Extra details

bladehan1 · 2026-01-04T06:17:13Z

+
+	body, err := io.ReadAll(resp.Body)
+	if err != nil {
+		return ""


It's better to handle error messages by writing them to logs.

Already add logs file in logs/sr_monitor.log

barbatos2011 · 2026-01-04T07:01:20Z

+
+go 1.25.5
+
+require github.com/joho/godotenv v1.5.1


Please check the security of this package

This package is the de-facto standard for environment variable management in the Go community (10k+ Stars), repo: https://github.com/joho/godotenv. Its logic is minimal and transparent (reading local files only).

I've scanned the project using the official Go vulnerability tool govulncheck, and no vulnerabilities were detected in this package.

bladehan1 · 2026-05-14T07:12:03Z

+FROM golang:1.25-alpine AS builder
+
+WORKDIR /app
+


[MUST] Builder base image missing digest pinning

Problem:

FROM golang:1.25-alpine AS builder lacks @sha256 digest.

Why this is P0 (Critical):

Non-reproducible builds — Docker Hub may update golang:1.25-alpine tag without notice. Building today vs tomorrow produces different binaries.

Supply chain traceability lost — Cannot audit which exact version was used in each build. Critical for security compliance.

Security vulnerability — New golang versions may have breaking changes or removed dependencies. Without pinned digest, application may fail to start in production.

No rollback capability — If an image breaks production, cannot roll back to previous version because the tag floats.

Historical precedent — tron-docker has SA-002 (base image not pinned), this is a known vulnerability in the project.

Fix:

docker pull golang:1.25-alpine docker inspect golang:1.25-alpine --format='{{index .RepoDigests 0}}' # Replace: FROM golang:1.25-alpine@sha256:abc123...

Fixed in f330c1a. Pinned to golang:1.25-alpine@sha256:8d22e29d960bc50cd025d93d5b7c7d220b1ee9aa7a239b3c8f55a57e987e8d45. Thanks for the catch.

bladehan1 · 2026-05-14T07:12:05Z

+FROM alpine:latest
+
+RUN apk --no-cache add ca-certificates tzdata
+


[MUST] Runtime base image using :latest tag (no digest)

Problem:

FROM alpine:latest is a floating tag with no @sha256 digest.

Why this is P0 (Critical):

Worst-case version management — :latest is the most unstable tag. Rebuild tomorrow and get completely different image.

Zero image consistency guarantee — Same Dockerfile produces different image on each rebuild.

Supply chain attack surface — Anyone (including attackers) could upload new alpine:latest to Docker Hub. Containers auto-pull the poisoned image.

Production safety compromised — Cannot rollback to previous version if new alpine breaks the app. Cannot diagnose which version caused the problem.

Deployment uncertainty — Image contents are non-deterministic. Violates infrastructure-as-code principle.

Fix:

docker pull alpine:3.20 docker inspect alpine:3.20 --format='{{index .RepoDigests 0}}' # Replace: FROM alpine:3.20@sha256:xyz789...

Fixed in f330c1a. Switched to alpine:3.20@sha256:d9e853e87e55526f6b2917df91a2115c36dd7c696a35be12163d44e6e2a4b6bc. Agree that :latest provides no reproducibility guarantee.

bladehan1 · 2026-05-14T07:12:07Z

+
+		// Build rank map for previous SRs
+		prevRankMap := make(map[string]int)
+		for i, addr := range prevSRs {


[MUST] JSON marshal error discarded (blank import)

Problem:

jsonPayload, _ := json.Marshal(payload) // Error ignored!

Why this is P0 (Critical):

Silent failure — worst error handling pattern — Not a panic, not an error return. Silently continues execution.

Sends nil payload — If payload contains non-serializable field (circular reference, unexported field, etc.), marshal fails and jsonPayload becomes nil.

Slack request fails invisibly — nil payload creates invalid HTTP request. Slack returns 400/422, but root cause (marshal failure) is invisible in logs.

Nearly impossible to diagnose — Developer sees "failed to send to Slack: status 400" but cannot see the real cause (json.Marshal failed).

Future breaking change risk — As code evolves and payload structure becomes complex, this bug suddenly explodes in production.

Standard Go error handling:

// ❌ Current (P0) jsonPayload, _ := json.Marshal(payload) // Error disappears // ✅ Correct jsonPayload, err := json.Marshal(payload) if err != nil { log.Printf("ERROR: failed to marshal Slack payload: %v", err) return fmt.Errorf("marshal failed: %w", err) }

Fix: Handle the error explicitly or return (string, error) from the function.

Fixed in f330c1a. The marshal error is now logged and the function returns an empty name, consistent with the existing error paths further down in getAccountName. Agreed this was a silent-failure anti-pattern.

bladehan1 · 2026-05-14T07:15:36Z

+    container_name: slack-sr-monitor
+    restart: always
+    environment:
+      - SLACK_WEBHOOK=${SLACK_WEBHOOK}


[SHOULD] Restart policy 'always' risks infinite restart loop

Problem:

restart: always

Container will be restarted infinitely, even on persistent failures.

Why this is P1 (Should Fix):

Scenario: Invalid Slack Webhook

10:00:00 - Container starts → checks SLACK_WEBHOOK env var → invalid → os.Exit(1) 10:00:01 - Docker: container crashed, restart: always → restart container 10:00:02 - Container starts → checks SLACK_WEBHOOK → invalid → os.Exit(1) 10:00:03 - Docker: container crashed → restart ... 10:05:00 - 300+ restart attempts, logs flooded, CPU spinning

Consequences:

Problem masking — Error is hidden in restart loop. Developer sees "container restarting" but doesn't see the root cause (invalid webhook URL).

Log spam — Logs filled with restart attempts, making debugging harder. Real errors buried under restart messages.

Resource waste — Container keeps starting/stopping, wasting CPU and I/O.

Delayed incident response — Takes longer to discover and diagnose the actual problem.

Hard to stop gracefully — Docker stop doesn't prevent auto-restart; need docker update --restart=no to halt it.

Comparison of restart policies:

Policy Behavior Use Case

always Restart forever, even after intentional stop ❌ Masks failures

unless-stopped Restart unless manually stopped (recommended) ✅ Failures visible, can diagnose

on-failure:5 Restart max 5 times ✅ Expose persistent failures after N retries

Fix:

services: slack-sr-monitor: # Option 1 (Recommended): Stop on failure, allow manual restart restart: unless-stopped # Option 2: Auto-restart but stop after 5 failed attempts # restart: on-failure:5

Benefit of 'unless-stopped': When the service crashes, it stays down until manual intervention. This forces developers to investigate instead of ignoring the logs.

Agreed, switched to restart: unless-stopped in f330c1a. Your reasoning about failure visibility makes sense — a misconfigured webhook should surface as a stopped container rather than be hidden in a restart loop.

- Handle json.Marshal error in getAccountName instead of discarding it - Pin base image digests (golang:1.25-alpine, alpine:3.20) for reproducible builds - Run container as non-root user 'monitor' under /home/monitor - Tighten log file permissions from 0666 to 0600 - Mask Slack webhook token in startup log - Switch docker-compose restart policy to unless-stopped

The loop polls getnextmaintenancetime in a 2-minute cycle. When the node is lagging behind the chain head (e.g. after a connection outage), the API returns a past maintenance time over and over, causing the monitor to send a duplicate Slack message every 2 minutes until the node recovers. Before triggering a push, check the timestamp of the latest block on the node and skip the cycle if it is more than 30 seconds old. Once the node catches up, getnextmaintenancetime returns the next future maintenance time and the normal wait path resumes.

warku123 added 8 commits December 26, 2025 17:14

Add go SR check

2a40e61

add docker config

8bea3b4

change go version

e1adb27

Update URL to account name

27203f8

Change output format

f4d8bb1

Delete previous vote output

0bb37bf

Add sr change check

9ef9e76

Add the readme

2cc3203

bladehan1 reviewed Jan 4, 2026

View reviewed changes

barbatos2011 reviewed Jan 4, 2026

View reviewed changes

warku123 added 4 commits January 4, 2026 15:35

Add log directory and also log file

730864d

Modify print format

1083b5d

Delete unnecessary message

e5bba78

Add rank changes detect in top27 SR

7bf1968

warku123 changed the title ~~feat(toolkit): monitor on SR votes~~ feat(tools): monitor on SR votes Mar 27, 2026

warku123 changed the title ~~feat(tools): monitor on SR votes~~ feat(tools): Slack monitor on SR votes Apr 1, 2026

warku123 requested review from barbatos2011 and bladehan1 May 14, 2026 03:11

This comment was marked as off-topic.

Sign in to view

bladehan1 reviewed May 14, 2026

View reviewed changes

warku123 added 3 commits May 14, 2026 16:33

Strip extra trailing newline to satisfy end-of-file-fixer lint

837d86d

warku123 requested a review from bladehan1 May 18, 2026 03:23

bladehan1 approved these changes May 18, 2026

View reviewed changes

		FROM alpine:latest

		RUN apk --no-cache add ca-certificates tzdata

Policy	Behavior	Use Case
`always`	Restart forever, even after intentional stop	❌ Masks failures
`unless-stopped`	Restart unless manually stopped (recommended)	✅ Failures visible, can diagnose
`on-failure:5`	Restart max 5 times	✅ Expose persistent failures after N retries

Conversation

warku123 commented Dec 30, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

This comment was marked as off-topic.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants