Prerender SPA at build time so LLM/search crawlers see the full page #2
Merged
Builds on PR #1 (meta + JSON-LD + llms.txt). This finishes the SEO/GEO fix: after `vite build`, a Puppeteer headless browser loads the built SPA, captures the fully rendered DOM, and writes it back to dist/index.html (and dist/404.html for the GitHub Pages SPA fallback).

The result: LLM crawlers (GPTBot, ClaudeBot, PerplexityBot) and search engines that do not reliably execute JS now get the complete Hero component as static HTML: team, thesis, news ticker with the Replenit announcement, contact info, partner logos. The React bundle still hydrates and runs client-side for interactive features (mouse-tracking effect, mobile menu).

Local build measurement:
- dist/index.html before: 4.96 kB (essentially empty body)
- dist/index.html after: 25.7 kB (full rendered Hero)

Changes:
- scripts/prerender.mjs (new): post-build script using Vite preview server plus Puppeteer to snapshot rendered HTML
- package.json: build = "vite build && node scripts/prerender.mjs", added build:nossg as escape hatch, puppeteer as devDependency

No changes to React components or routing.

Tests:
- npm install completes (puppeteer pulls Chromium ~150 MB)
- npm run build completes successfully
- dist/index.html grep confirms team names plus Replenit present in markup
- dist/404.html mirrors index.html (SPA fallback preserved)

After merge, GitHub Actions runs the same build pipeline. Verify with: `curl -A "GPTBot/1.0" https://vastpoint.vc/ | grep "Pedraszewska"`
Why
Builds on PR #1 (meta + JSON-LD + llms.txt). PR #1 gave LLM and search crawlers rich metadata in the initial HTML response, but the actual page content (team, thesis, portfolio news ticker) still lived behind the React bundle. Crawlers that don't reliably run JS (GPTBot, ClaudeBot, PerplexityBot, and the indexing pass for several others) only saw `<div id="root"></div>` for the body.
This PR finishes the fix by prerendering the page at build time: after `vite build`, a Puppeteer headless browser loads the built SPA, captures the fully rendered DOM, and writes it back to `dist/index.html` (and `dist/404.html` for the GitHub Pages SPA fallback).
What that gives us
LLM crawlers and search engines now receive the complete Hero component as static HTML: team, thesis, news ticker with the Replenit announcement, contact info, and partner logos.
The React bundle still hydrates and runs client-side for interactive features (mouse-tracking effect, mobile menu).
What changed
- `scripts/prerender.mjs` (new): post-build script that uses Vite's `preview()` API and Puppeteer
- `package.json`:
  - `build` is now `vite build && node scripts/prerender.mjs`
  - `build:nossg` added as an escape hatch (just `vite build`, in case the prerender step ever needs to be skipped)
  - `puppeteer` added to `devDependencies`

No changes to React components, routing, or build configuration beyond the script orchestration.
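For orientation, a post-build step of this kind could look roughly like the sketch below. It is not the contents of `scripts/prerender.mjs`: the port number, the `networkidle0` wait condition, and the helper names `writeSnapshot` and `prerender` are assumptions made for illustration. It serves `dist/` with Vite's `preview()`, loads the page in Puppeteer, and writes the serialized DOM to both output files.

```javascript
// Hypothetical sketch of a post-build prerender step; not the actual scripts/prerender.mjs.
// Dynamic imports keep the file loadable as either an ES module or a CommonJS script.

const DIST_DIR = 'dist'; // Vite's default output directory
const PORT = 4173;       // assumed free port for the preview server

// Write the snapshot to index.html and mirror it to 404.html
// (the GitHub Pages SPA fallback).
async function writeSnapshot(html, distDir = DIST_DIR) {
  const { writeFile } = await import('node:fs/promises');
  const { join } = await import('node:path');
  await writeFile(join(distDir, 'index.html'), html, 'utf8');
  await writeFile(join(distDir, '404.html'), html, 'utf8');
}

async function prerender() {
  const { preview } = await import('vite');
  const { default: puppeteer } = await import('puppeteer');

  // Serve the already-built dist/ directory on a fixed port.
  const server = await preview({ preview: { port: PORT, strictPort: true } });
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Waiting for network idle is one way to let rendering settle (an assumption here).
    await page.goto(`http://localhost:${PORT}/`, { waitUntil: 'networkidle0' });
    const html = await page.content(); // serialized, fully rendered DOM
    await writeSnapshot(html);
  } finally {
    await browser.close();
    server.httpServer.close();
  }
}
```

A script built this way would end with a call such as `prerender().catch((err) => { console.error(err); process.exit(1); })` so that a failed snapshot fails the build. CI runners sometimes need extra Chromium launch flags, which this sketch leaves out.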
Local test
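The local checks listed in the PR description (marker strings present in `dist/index.html`, and `dist/404.html` mirroring it) could be scripted along these lines. This is a minimal sketch, not the commands that were actually run: the helper names `missingMarkers` and `checkDist` are hypothetical, and the two marker strings are taken from the PR description.

```javascript
// Hypothetical local check; the marker strings come from the PR description.
const MARKERS = ['Pedraszewska', 'Replenit'];

// Return the markers that do not appear in the given HTML.
function missingMarkers(html, markers = MARKERS) {
  return markers.filter((marker) => !html.includes(marker));
}

// Read the build output, look for the markers in index.html,
// and confirm that 404.html is an exact copy of it.
async function checkDist(distDir = 'dist') {
  const { readFile } = await import('node:fs/promises');
  const { join } = await import('node:path');
  const index = await readFile(join(distDir, 'index.html'), 'utf8');
  const fallback = await readFile(join(distDir, '404.html'), 'utf8');
  return {
    missing: missingMarkers(index),
    fallbackMirrorsIndex: index === fallback,
  };
}
```

After `npm run build`, `checkDist()` should resolve to an empty `missing` array and `fallbackMirrorsIndex: true` if the prerender step worked as described.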
Verify after merge
GitHub Actions runs the same `npm run build`. Expect about 30-60 seconds of extra build time for the Chromium download (cached on subsequent runs).
After deploy:
Should print both. Currently (after PR #1 only) prints neither.
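The same post-deploy check can be approximated from Node, which fetches the raw HTML without executing any JavaScript, much as a non-rendering crawler would. The sketch below assumes Node 18 or later for the global `fetch`. The helper names `markerReport` and `checkAsCrawler` are hypothetical, and the choice of `Replenit` as the second marker is an assumption based on the local test notes.

```javascript
// Hypothetical post-deploy check; URL and user agent are taken from the PR description.
const SITE_URL = 'https://vastpoint.vc/';
const CRAWLER_UA = 'GPTBot/1.0';

// Map each marker to whether it appears in the HTML.
function markerReport(html, markers) {
  return Object.fromEntries(markers.map((marker) => [marker, html.includes(marker)]));
}

// Fetch the page as a crawler would (no JS execution) and report on the markers.
// 'Replenit' as the second marker is an assumption.
async function checkAsCrawler(url = SITE_URL, markers = ['Pedraszewska', 'Replenit']) {
  const res = await fetch(url, { headers: { 'User-Agent': CRAWLER_UA } });
  if (!res.ok) throw new Error(`Unexpected status ${res.status} for ${url}`);
  return markerReport(await res.text(), markers);
}
```

Calling `checkAsCrawler().then(console.log)` after a deploy prints an object with one boolean per marker. If prerendering is live, every value should be `true`.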
What is not in this PR
`puppeteer-core` plus `setup-chrome`, but the simple path works first.