SEO/GEO improvements: structured data, llms.txt, sitemap, expanded robots #1
Merged
…bots

Goal: make the site discoverable by LLM crawlers (GPTBot, ClaudeBot, PerplexityBot, etc.) and improve search engine visibility. The site is a client-side React SPA, so the static HTML that LLM crawlers see was nearly empty; these changes give them rich metadata even before the React bundle hydrates.

Changes:
- index.html: rich title/description, canonical URL, expanded OG/Twitter meta, two JSON-LD blocks (Organization with founders, addresses, sameAs links to LinkedIn/Twitter/Dealroom/Substack/press; WebSite)
- public/llms.txt: emerging standard for LLM crawlers, full fund details (size, stage, ticket, sectors), team bios, portfolio (Replenit, Howie, Polyvia), press links, contact
- public/sitemap.xml: discovery for crawlers
- public/robots.txt: explicit allow for GPTBot, ClaudeBot, anthropic-ai, Claude-Web, PerplexityBot, Perplexity-User, Google-Extended, CCBot, cohere-ai, Applebot-Extended, meta-externalagent, ChatGPT-User, OAI-SearchBot

Tests:
- npm run build: passes, dist/ contains all new files
- 404.html SPA fallback still generated
- JSON-LD validates against schema.org Organization spec

No code changes, no visual changes, no behavior changes.
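To illustrate the robots.txt change, here is a minimal sketch of how a file with one explicit allow group per bot could be generated. The user-agent names are the ones listed in the commit message; the helper name `buildRobotsTxt`, the catch-all `User-agent: *` group, and the `Sitemap:` line are assumptions for illustration, not a copy of the file in this PR.

```javascript
// User agents copied from the commit message above.
const AI_CRAWLERS = [
  "GPTBot", "ClaudeBot", "anthropic-ai", "Claude-Web", "PerplexityBot",
  "Perplexity-User", "Google-Extended", "CCBot", "cohere-ai",
  "Applebot-Extended", "meta-externalagent", "ChatGPT-User", "OAI-SearchBot",
];

// Hypothetical helper: builds robots.txt text with one explicit allow group
// per bot. The catch-all group and the Sitemap line are assumptions.
function buildRobotsTxt(userAgents, sitemapUrl) {
  const groups = userAgents.map((ua) => `User-agent: ${ua}\nAllow: /`);
  groups.push("User-agent: *\nAllow: /");
  return groups.join("\n\n") + `\n\nSitemap: ${sitemapUrl}\n`;
}

const robotsTxt = buildRobotsTxt(AI_CRAWLERS, "https://vastpoint.vc/sitemap.xml");
console.log(robotsTxt);
```

Generating the file from a list keeps the bot names in one place, which may be easier to maintain than a hand-edited file as new crawlers appear.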
ampedraszewska added a commit that referenced this pull request Apr 29, 2026
Builds on PR #1 (meta + JSON-LD + llms.txt). This finishes the SEO/GEO fix: after `vite build`, a Puppeteer headless browser loads the built SPA, captures the fully-rendered DOM, and writes it back to dist/index.html (and dist/404.html for the GitHub Pages SPA fallback).

The result: LLM crawlers (GPTBot, ClaudeBot, PerplexityBot) and search engines that do not reliably execute JS now get the complete Hero component as static HTML: team, thesis, news ticker with the Replenit announcement, contact info, partner logos. The React bundle still hydrates and runs client-side for interactive features (mouse-tracking effect, mobile menu).

Local build measurement:
- dist/index.html before: 4.96 kB (essentially empty body)
- dist/index.html after: 25.7 kB (full rendered Hero)

Changes:
- scripts/prerender.mjs (new): post-build script using Vite preview server plus Puppeteer to snapshot rendered HTML
- package.json: build = "vite build && node scripts/prerender.mjs", added build:nossg as escape hatch, puppeteer as devDependency

No changes to React components or routing.

Tests:
- npm install completes (puppeteer pulls Chromium ~150 MB)
- npm run build completes successfully
- dist/index.html grep confirms team names plus Replenit present in markup
- dist/404.html mirrors index.html (SPA fallback preserved)

After merge, GitHub Actions runs the same build pipeline. Verify with:

    curl -A "GPTBot/1.0" https://vastpoint.vc/ | grep "Pedraszewska"

Co-authored-by: Aleksandra Pedraszewska <aleksandra@vastpoint.vc>
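The grep-style check listed under Tests could also be scripted so it runs after every build. The following is a minimal sketch, assuming Node; the helper names `findMissingMarkers`, `hasRenderedRoot`, and `checkPrerenderedFile` are hypothetical, and the two sample HTML strings are made up for illustration. The marker strings "Replenit" and "Pedraszewska" are taken from the commit message.

```javascript
// Hypothetical helper: returns the markers that do NOT appear in the HTML.
function findMissingMarkers(html, markers) {
  return markers.filter((marker) => !html.includes(marker));
}

// Hypothetical helper: true when <div id="root"> is present and not empty,
// i.e. the page looks prerendered rather than a bare SPA shell.
function hasRenderedRoot(html) {
  const emptyShell = /<div id="root">\s*<\/div>/;
  return html.includes('<div id="root"') && !emptyShell.test(html);
}

// Hypothetical helper: runs both checks against a built file on disk.
async function checkPrerenderedFile(path, markers) {
  const { readFile } = await import("node:fs/promises");
  const html = await readFile(path, "utf8");
  return { rendered: hasRenderedRoot(html), missing: findMissingMarkers(html, markers) };
}

// Made-up sample inputs, standing in for dist/index.html before and after.
const sampleShell = '<body><div id="root"></div></body>';
const samplePrerendered =
  '<body><div id="root"><h1>Replenit</h1><p>Pedraszewska</p></div></body>';

console.log(hasRenderedRoot(sampleShell), hasRenderedRoot(samplePrerendered));
console.log(findMissingMarkers(sampleShell, ["Replenit", "Pedraszewska"]));
```

After a real build, something like `checkPrerenderedFile("dist/index.html", ["Replenit", "Pedraszewska"])` would be the call to make; a non-empty `missing` array would suggest that the snapshot step did not capture the rendered content.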
Why
When founders ask LLMs (ChatGPT, Claude, Perplexity, Gemini) "which VC funds invest at seed in Poland / CEE?", Vastpoint doesn't appear in the answers — even though we have strong launch press (MamStartup, MyCompany Polska, Sifted, EU-Startups, Vestbee, AIN.ua).
The reason is simple: vastpoint.vc is a client-side React SPA, so when an LLM crawler (GPTBot, ClaudeBot, PerplexityBot) hits the page, it sees a nearly empty HTML document: only `<title>vastpoint ventures</title>` and `<div id="root"></div>`. All the actual content (team, thesis, portfolio) lives behind a JS bundle that crawlers don't execute reliably.

This PR doesn't fix the SPA-rendering issue (that comes in a follow-up, likely `react-snap` or `vite-react-ssg` to pre-render static HTML). What it does is give crawlers maximum information from the initial HTML payload right now, with zero functional or visual changes to the live site.

What changed

- `index.html` (expanded `<head>`):
  - `<title>` and `<meta name="description">` covering fund stage, sectors, team
  - `sameAs` links to LinkedIn, X, Dealroom, Substack, and press coverage on Vestbee, EU-Startups, Sifted
- `public/llms.txt` (new): emerging standard for LLM crawlers (markdown they parse natively).
- `public/sitemap.xml` (new): discovery for crawlers.
- `public/robots.txt` (updated): explicit `Allow: /` for the bots that matter:
  - `GPTBot`, `ClaudeBot`, `anthropic-ai`, `Claude-Web`, `PerplexityBot`, `Google-Extended`, `CCBot`, `cohere-ai`, `Applebot-Extended`, `meta-externalagent`
  - `ChatGPT-User`, `OAI-SearchBot`, `Perplexity-User`

Tests

- `npm run build` passes locally (Vite 5.4, all 1665 modules transformed)
- `dist/` contains all new files (`llms.txt`, `sitemap.xml`, updated `robots.txt`, expanded `index.html`)
- `404.html` SPA fallback still generated by existing `copy404Plugin`

Test after merge
After GitHub Actions deploys, verify crawler view:
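One way to script such a check is sketched below, assuming Node 18+ for the built-in `fetch`. The helper names `fetchAsCrawler` and `summarizeCrawlerView` are hypothetical, the regexes assume the attribute order shown (for example `name` first on the `<meta>` tag), and the sample HTML is made up. The `GPTBot/1.0` user-agent string is the one used in the follow-up commit's curl command.

```javascript
// Hypothetical helper: fetches a URL while identifying as a given crawler.
async function fetchAsCrawler(url, userAgent) {
  const response = await fetch(url, { headers: { "User-Agent": userAgent } });
  if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
  return response.text();
}

// Hypothetical helper: summarizes what a non-JS crawler can read from raw HTML.
// The regexes are deliberately simple and assume a fixed attribute order.
function summarizeCrawlerView(html) {
  const title = html.match(/<title>([^<]*)<\/title>/i);
  return {
    title: title ? title[1].trim() : null,
    hasDescription: /<meta\s+name="description"/i.test(html),
    hasCanonical: /<link\s+rel="canonical"/i.test(html),
    jsonLdBlocks: (html.match(/<script\s+type="application\/ld\+json"/gi) || []).length,
  };
}

// Made-up sample standing in for the deployed page's initial HTML.
const sampleHead = [
  "<title>Example fund</title>",
  '<meta name="description" content="Example description">',
  '<link rel="canonical" href="https://example.com/">',
  '<script type="application/ld+json">{}</script>',
  '<script type="application/ld+json">{}</script>',
].join("\n");

console.log(summarizeCrawlerView(sampleHead));

// Against the live site (needs network access):
// fetchAsCrawler("https://vastpoint.vc/", "GPTBot/1.0")
//   .then((html) => console.log(summarizeCrawlerView(html)));
```

Since this PR describes two JSON-LD blocks (Organization and WebSite), a `jsonLdBlocks` count of 2 on the live page would be the expected result under these assumptions.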
Then submit https://vastpoint.vc/sitemap.xml to Google Search Console, and run a Rich Results test on https://search.google.com/test/rich-results; JSON-LD should validate as Organization + WebSite.

What's NOT in this PR