Mlnode cleanup #994

Open
akup wants to merge 5 commits into gonka-ai:upgrade-v0.2.12 from akup:mlnode-cleanup

Conversation

@akup
Collaborator

@akup akup commented Apr 1, 2026

PR: Clean PoW from MLNode and keep PoC networking patch local

Important

I will run it on a GPU node first to check that everything works after the cleanup.

Motivation

PoW logic is no longer used in mlnode, but the repository still carried PoW-related structure and callback wiring assumptions.
At the same time, PoC v2 logic now lives in the forked vLLM codebase, while our networking integration concerns still belong to this repo.

This PR clarifies that boundary and keeps only the pieces we still actively rely on.

Problem this PR solves

  • mlnode still contained legacy PoW traces even though PoW is deprecated in this stack.
  • PoC callback networking behavior needed by decentralized-api should remain controlled in this repo, not drift in upstream vLLM internals.
  • We needed a clear and documented example for signing callback requests to protect callback endpoints that may need to stay publicly reachable in some deployments.

What this PR does

  • Cleans PoW from mlnode paths that are no longer used.
  • Keeps PoC v2 source of truth in gonka-ai/vllm (./vllm/poc), but applies local callback patching from this repo.
  • Adds/uses mlnode/packages/poc/patches/callbacks.py as the local networking-layer override.
  • Updates the Docker build flow to copy callback patches into the installed vLLM path (with path resolution/fallback handling).
  • Documents the setup in mlnode/packages/poc/README.md.
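The "path resolution/fallback handling" step could look roughly like the sketch below. This is not the code from the Dockerfile in this PR; the function names are hypothetical, and the `poc` subdirectory is only inferred from the `./vllm/poc` path mentioned above. It asks Python where the package is installed and falls back to a caller-supplied path, then to site-packages, if the package cannot be found.

```python
import importlib.util
import shutil
import sysconfig
from pathlib import Path
from typing import Optional


def resolve_package_dir(package: str, fallback: Optional[Path] = None) -> Path:
    """Return the install directory of a top-level package, or a fallback guess."""
    # find_spec returns None for a missing top-level package instead of raising.
    spec = importlib.util.find_spec(package)
    if spec is not None and spec.submodule_search_locations:
        return Path(next(iter(spec.submodule_search_locations)))
    if fallback is not None:
        return fallback
    # Last resort (assumption): the package would live under site-packages.
    return Path(sysconfig.get_paths()["purelib"]) / package


def copy_patch(patch_file: Path, package_dir: Path, subdir: str = "poc") -> Path:
    """Copy one patch file into <package_dir>/<subdir>, creating it if needed."""
    target_dir = package_dir / subdir
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(patch_file, target_dir / patch_file.name))
```

In a build step this might be called as `copy_patch(Path("patches/callbacks.py"), resolve_package_dir("vllm"))`; the source path here is illustrative.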

Security / networking example included

The callback patch includes an example of signing callback payloads with POC_SIGNATURE_KEY.
This can be used by decentralized-api to verify callback authenticity, which is useful when callback endpoints are publicly exposed due to infrastructure constraints (e.g., mixed providers for ML nodes and API nodes).
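For readers unfamiliar with the pattern, here is a minimal sketch of signing and verifying a callback body. The description above does not state which scheme the patch uses, so HMAC-SHA256 over a canonical JSON body is an assumption, as are the function names; only the `POC_SIGNATURE_KEY` variable comes from this PR.

```python
import hashlib
import hmac
import json
from typing import Tuple


def sign_payload(payload: dict, key: bytes) -> Tuple[bytes, str]:
    """Serialize the payload deterministically and return (body, hex signature)."""
    # Sorted keys and fixed separators so sender and receiver hash the same bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, signature


def verify_payload(body: bytes, signature: str, key: bytes) -> bool:
    """Recompute the signature over the raw body and compare in constant time."""
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

The sender would read the key from `POC_SIGNATURE_KEY`, send `body` unchanged, and attach the signature in a header of its choosing. The receiver should verify against the raw bytes it received rather than a re-serialized copy.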

Expected outcome

  • Clear separation of responsibilities:
    • PoC core runtime in vLLM
    • networking/control integration in this repo
  • Less dead/legacy PoW surface in mlnode
  • Better operational security posture for callback-based flows

Relation to other PRs

#537 and #417 also patch mlnode. I think this PR can go first, to clean up mlnode before functionality is added.

@x0152
Collaborator

x0152 commented Apr 1, 2026

I think it’s worth checking these files as well:

  • mlnode/packages/api/docker-compose.yml
  • mlnode/packages/api/src/api/app.py
  • mlnode/packages/api/src/api/routes.py
  • mlnode/packages/api/pyproject.toml
  • mlnode/pyproject.toml

There are still references to PoW there.
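A quick way to re-check a file list like this is a small scan script. The sketch below is an assumption about what counts as a "PoW reference": the standalone token `pow` in any casing, or the phrase "proof of work". It will also flag Python's built-in `pow(...)`, so hits need a manual look.

```python
import re
from pathlib import Path
from typing import Iterable, List, Tuple

# "pow" not surrounded by letters, so "pow_router" and "PoW" match but "power" does not.
POW_PATTERN = re.compile(
    r"(?<![A-Za-z])pow(?![A-Za-z])|proof[ _-]of[ _-]work", re.IGNORECASE
)


def find_pow_references(root: Path, relative_paths: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (path, line number, stripped line) for every matching line."""
    hits = []
    for rel in relative_paths:
        path = root / rel
        if not path.is_file():
            continue  # skip files that were already deleted
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if POW_PATTERN.search(line):
                hits.append((rel, lineno, line.strip()))
    return hits
```

Called with the repository root and the five paths listed above, an empty result would suggest the cleanup is complete for those files.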

@akup
Collaborator Author

akup commented Apr 1, 2026

I think it’s worth checking these files as well:

  • mlnode/packages/api/docker-compose.yml
  • mlnode/packages/api/src/api/app.py
  • mlnode/packages/api/src/api/routes.py
  • mlnode/packages/api/pyproject.toml
  • mlnode/pyproject.toml

there are still references to PoW there

I've cleaned them all.

But now I need to build the image and run it on a GPU node.

I will report the results.

@tcharchian tcharchian requested a review from 0xgonka April 2, 2026 00:52
@tcharchian tcharchian moved this from Todo to Needs reviewer in Upgrade v0.2.12 Apr 2, 2026
@tcharchian tcharchian removed the request for review from 0xgonka April 2, 2026 00:53
@tcharchian tcharchian moved this from Needs reviewer to Changes requested / Back in progress in Upgrade v0.2.12 Apr 2, 2026
@@ -0,0 +1,200 @@
"""PoC callback sender with retry-until-stop and bounded buffer."""
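For context on what the docstring describes, here is a minimal sketch of a sender that retries until stopped and keeps a bounded buffer. It is not the contents of `callbacks.py`; the class name, the `send` callable returning `True` on success, and the drop-oldest policy when the buffer is full are all assumptions.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque


class CallbackSender:
    """Retries the oldest buffered payload until it is sent or stop() is called."""

    def __init__(self, send: Callable[[dict], bool], max_buffer: int = 100,
                 retry_delay: float = 0.5) -> None:
        self._send = send
        # deque(maxlen=...) silently drops the oldest item when full.
        self._buffer: Deque[dict] = deque(maxlen=max_buffer)
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._retry_delay = retry_delay
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def enqueue(self, payload: dict) -> None:
        with self._lock:
            self._buffer.append(payload)

    def pending(self) -> int:
        with self._lock:
            return len(self._buffer)

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            with self._lock:
                payload = self._buffer[0] if self._buffer else None
            if payload is None:
                self._stop.wait(self._retry_delay)  # idle until work or stop
                continue
            try:
                ok = self._send(payload)
            except Exception:
                ok = False  # treat any send error as retryable
            if ok:
                with self._lock:
                    # The head may have been evicted by overflow while sending.
                    if self._buffer and self._buffer[0] is payload:
                        self._buffer.popleft()
            else:
                self._stop.wait(self._retry_delay)
```

Payloads are delivered in order because the head is only removed after a successful send; a permanently failing endpoint blocks the queue until `stop()` is called or newer payloads push the head out.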
Contributor

@akup I'd suggest not combining the cleanup with the new patch logic; please create a separate PR for the new feature.

@github-project-automation github-project-automation bot moved this to New in Triage Apr 11, 2026
@tcharchian tcharchian moved this from New to Needs triage in Triage Apr 11, 2026
@tcharchian tcharchian removed this from the v0.2.12 milestone Apr 11, 2026
