RSS Feed Service

A lightweight Go server that generates RSS feeds for blog sites that don't provide their own. It scrapes the blog listing pages, extracts post metadata, and serves standard RSS 2.0 XML.

The service runs on an exe.dev VM.

Feeds

Blog	Feed URL	Source
MotherDuck Blog	`/feed/motherduck`	https://motherduck.com/blog/
Sprites Blog	`/feed/sprites`	https://sprites.dev/blog/
Archil Blog	`/feed/archil`	https://archil.com/blog/

Full URLs: https://rss-feed.exe.xyz:8000/feed/{name}

How it works

1. When an RSS reader requests a feed, the server checks its in-memory cache (15-minute TTL).
2. On a cache miss, it fetches the blog's listing page via HTTP.
3. A site-specific scraper extracts post titles, URLs, descriptions, authors, and dates from the HTML.
4. The posts are rendered as RSS 2.0 XML using gorilla/feeds and cached.

There is no database. No background jobs. The server only fetches upstream when a reader asks and the cache is stale.

Scraper strategies

Each blog site has different HTML structure, so each needs its own scraper:

MotherDuck — Server-rendered HTML, but <a> tags contain block elements (<div>, <h2>) which Go's net/html parser splits into sibling fragments per the HTML spec. The scraper groups fragments by href and collects title/description/date from siblings.
Sprites — Clean server-rendered HTML with <article> elements, <h3> titles, <time> elements, and <span> authors. Straightforward DOM traversal.
Archil — A Next.js client-rendered app. The HTML contains no visible blog content, but the React Server Components (RSC) payload is embedded in <script> tags as JSON. The scraper extracts post data by pattern-matching the RSC JSON for "href":"/post/..." entries and their nearby "children" values.
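
As a rough illustration of the Archil approach, the sketch below pattern-matches a payload for "href":"/post/..." entries and the first "children" string that follows. The sample payload, the `postRef` type, the 200-character window, and the regular expression are all assumptions made for this example; the real payload may be escaped inside a JavaScript string and need unescaping first.

```go
package main

import (
	"fmt"
	"regexp"
)

// postRef is a hypothetical type for this sketch.
type postRef struct {
	Path  string
	Title string
}

// postPattern matches an "href":"/post/..." entry followed, within at most
// 200 characters, by the nearest string-valued "children" key.
var postPattern = regexp.MustCompile(`"href":"(/post/[^"]+)".{0,200}?"children":"([^"]+)"`)

// extractPosts returns one postRef per distinct /post/ path, in payload order.
func extractPosts(payload string) []postRef {
	seen := make(map[string]bool)
	var posts []postRef
	for _, m := range postPattern.FindAllStringSubmatch(payload, -1) {
		if seen[m[1]] {
			continue // the same post is often linked more than once
		}
		seen[m[1]] = true
		posts = append(posts, postRef{Path: m[1], Title: m[2]})
	}
	return posts
}

func main() {
	// Invented sample data, shaped loosely like an RSC fragment.
	payload := `{"href":"/post/hello-world","className":"card","children":"Hello World"},` +
		`{"href":"/post/second","children":"Second Post"},` +
		`{"href":"/about","children":"About"}`
	for _, p := range extractPosts(payload) {
		fmt.Println(p.Path, "->", p.Title)
	}
}
```

Pattern matching of this kind is brittle: a change in the site's component structure can silently break it, so an empty result is worth logging.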

Adding a new feed

1. Inspect the blog page

Visit the blog in a browser and inspect the HTML structure. Key questions:

Is the content server-rendered (visible in curl output) or client-rendered (JavaScript-only)?
What elements contain post titles, URLs, dates, descriptions, and authors?

# Quick check: does curl see the blog posts?
curl -sL https://example.com/blog/ | grep -o '<a[^>]*href[^>]*>' | head -20

2. Write a scraper function

Add a new scraper in scraper/scraper.go:

func scrapeExample(body io.Reader) ([]Post, error) {
    doc, err := html.Parse(body)
    if err != nil {
        return nil, err
    }
    // Walk the DOM tree, extract posts...
    var posts []Post
    // ...
    return posts, nil
}

For client-rendered (React/Next.js) sites, you may need to parse the data from embedded JSON payloads rather than the HTML DOM. See the Archil scraper for an example.

3. Register the source

Add an entry to the Sources slice in scraper/scraper.go:

var Sources = []FeedSource{
    // ...existing sources...
    {
        Name:    "Example Blog",
        SiteURL: "https://example.com/blog/",
        Scrape:  scrapeExample,
    },
}

4. Add the route and index card

In srv/server.go:

Add a route in the Serve method:

mux.HandleFunc("GET /feed/example", s.handleFeed(3)) // index matches Sources slice

Add a card to indexHTML:

<div class="feed-card">
  <div class="feed-name">Example Blog <span class="rss-icon">RSS</span></div>
  <a class="feed-url" href="/feed/example">/feed/example</a>
</div>

5. Build and deploy

go build -o rss-feed ./cmd/srv
sudo systemctl restart srv

Project structure

cmd/srv/main.go      Entry point — parses flags, starts server
srv/server.go        HTTP handlers, caching, RSS generation, index page
scraper/scraper.go   Fetch + parse logic for each blog site
srv.service          systemd unit file

Building and running

# Build
go build -o rss-feed ./cmd/srv

# Run directly
./rss-feed                    # listens on :8000
./rss-feed -listen :3000      # custom port

# Or via systemd (production)
sudo cp srv.service /etc/systemd/system/srv.service
sudo systemctl daemon-reload
sudo systemctl enable --now srv

# Check status / logs
systemctl status srv
journalctl -u srv -f
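
For reference, a -listen flag with a :8000 default, as used in the commands above, could be parsed along these lines. This is a sketch using the standard flag package; the function name `parseListen` and the help text are assumptions, not the contents of cmd/srv/main.go.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseListen reads the -listen flag from args and falls back to :8000.
func parseListen(args []string) (string, error) {
	fs := flag.NewFlagSet("rss-feed", flag.ContinueOnError)
	listen := fs.String("listen", ":8000", "address to listen on")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *listen, nil
}

func main() {
	addr, err := parseListen(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("would listen on", addr)
}
```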

Recreating this VM on exe.dev

Create a new VM named rss-feed, then give Shelley this prompt:

Create an RSS feed service in Go that scrapes blog listing pages and serves RSS 2.0 feeds. The blogs to support are:

https://motherduck.com/blog/ → serve at /feed/motherduck

https://sprites.dev/blog/ → serve at /feed/sprites

https://archil.com/blog/ → serve at /feed/archil

Requirements:

Use the Go project template (shelley unpack-template go)

Scrape each blog's listing page for post title, URL, description, author, and date

Generate RSS 2.0 XML using gorilla/feeds

Cache feeds in memory for 15 minutes

Cap upstream response reads at 10 MB with io.LimitReader

Serve an index page at / listing all available feeds

Run as a systemd service on port 8000

No database needed

Note: MotherDuck is server-rendered but Go's HTML parser splits <a> tags containing block elements — group fragments by href. Archil is a Next.js client-rendered app — extract data from the RSC JSON payload in <script> tags. Sprites is clean server-rendered HTML with <article> elements.

After Shelley builds it, run set-public to make the feeds accessible to RSS readers without authentication.

Security notes

Read-only service — no database, no user input, no file uploads
Hardcoded scrape targets — no SSRF risk; the server only fetches from the URLs defined in Sources
Response size limit — upstream responses are capped at 10 MB via io.LimitReader to prevent OOM
Static index page — no user input is interpolated into HTML
Scraped content is XML-escaped by gorilla/feeds before inclusion in RSS output
