feat: implement shebang and comment-based language detection #1258

Open
Divyapahuja31 wants to merge 4 commits into shikijs:main from Divyapahuja31:feat/smart-detection

@Divyapahuja31
Contributor

Description

This PR implements robust, content-aware language detection in guessEmbeddedLanguages and integrates it into the Shiki CLI.

What is this solving?
Currently, Shiki's language auto-detection is mostly limited to explicit Markdown fences (e.g. ```js) and specific HTML attributes. This makes it difficult to highlight the following accurately:

  1. Extension-less scripts: Files like deploy or run that rely on shebangs.
  2. Piped Input (stdin): Code piped directly into the CLI (cat file | shiki) where no filename exists.
  3. Commented Snippets: Code patterns from platforms like StackOverflow that use hints like <!-- language: lang-js -->.

Why is this needed?
To provide a "magic" out-of-the-box experience, especially for the CLI tool (skat), this PR lets Shiki identify the language by inspecting the content header and common developer annotations, reducing the need for manual --lang overrides.

Changes:

  • Shebang Detection: Added logic to parse #! lines. It handles standard paths (/bin/bash), environment-based resolution (/usr/bin/env node), and env -S invocations (e.g., #!/usr/bin/env -S ts-node).
  • Comment-based Guessing: Added heuristics for StackOverflow-style hints and @lang annotations.
  • CLI Enhancement: Updated @shikijs/cli to use guessEmbeddedLanguages as a fallback for stdin and files with unknown extensions.
  • Normalization: All detected languages are automatically lowercased and trimmed to match Shiki core expectations.
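
To illustrate the comment-based guessing and normalization items above, here is a minimal sketch. It is not the code from this PR: the function names, the regular expressions, the stripping of a lang- prefix, and the exact shape of the @lang annotation are all assumptions made for illustration.

```typescript
// Hypothetical sketch: names and regexes are illustrative, not the PR's code.

/** Lowercase and trim a detected language name, as the Normalization item describes. */
function normalizeLanguage(raw: string): string {
  return raw.trim().toLowerCase()
}

/**
 * Look for a language hint in comments.
 * Assumed forms: `<!-- language: lang-js -->` (StackOverflow style, with an
 * optional `lang-` prefix that is dropped) and `@lang <name>`.
 */
function guessLanguageFromCommentHint(code: string): string | undefined {
  const soHint = /<!--\s*language:\s*(?:lang-)?([\w+#.-]+)\s*-->/i.exec(code)
  const atLang = /@lang\s+([\w+#.-]+)/i.exec(code)
  // Prefer the StackOverflow-style hint when both are present (an arbitrary choice).
  const hint = soHint?.[1] ?? atLang?.[1]
  return hint === undefined ? undefined : normalizeLanguage(hint)
}

console.log(guessLanguageFromCommentHint('<!-- language: lang-js -->\nconst a = 1'))
console.log(guessLanguageFromCommentHint('// @lang TypeScript\nlet x: number = 1'))
```

Returning undefined when no hint is found lets a caller fall through to another strategy, such as shebang detection or a plain-text default.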

Linked Issues

This is a feature enhancement for the core utility and CLI.

Additional context

Reviewers may want to focus on the env -S parsing logic in packages/core/src/utils/strings.ts. It's designed to skip common environment flags to find the actual executable name (e.g., identifying ts-node in #!/usr/bin/env -S ts-node --foo).
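
For reviewers who want a mental model of the env -S handling, the following is a rough sketch of one way such parsing could work. It is an assumption-laden illustration, not the implementation in packages/core/src/utils/strings.ts: the function name interpreterFromShebang is hypothetical, and the rule "skip tokens that start with - or contain =" is a simplification that does not cover env options taking a separate argument.

```typescript
// Hypothetical sketch of shebang parsing; not the PR's actual implementation.

/** Return the lowercased interpreter name from a leading shebang line, if any. */
function interpreterFromShebang(code: string): string | undefined {
  const firstLine = code.split(/\r?\n/, 1)[0] ?? ''
  if (!firstLine.startsWith('#!')) return undefined

  // Split "#!/usr/bin/env -S ts-node --foo" into whitespace-separated tokens.
  const [first, ...rest] = firstLine.slice(2).trim().split(/\s+/).filter(Boolean)
  if (first === undefined) return undefined

  const basename = (path: string): string => path.slice(path.lastIndexOf('/') + 1)

  let name = basename(first)
  if (name === 'env') {
    // Skip env flags (-S, -i, ...) and VAR=value assignments to reach the executable.
    const executable = rest.find(token => !token.startsWith('-') && !token.includes('='))
    if (executable === undefined) return undefined
    name = basename(executable)
  }
  return name.toLowerCase()
}

console.log(interpreterFromShebang('#!/usr/bin/env -S ts-node --foo\nconsole.log(1)'))
console.log(interpreterFromShebang('#!/bin/bash\necho hi'))
```

This sketch returns the executable name only; mapping that name to a Shiki language id (for example node to javascript) would be a separate step that is not shown here.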

I have included 7 new unit tests in packages/core/src/utils/strings.test.ts covering these new detection patterns to ensure stability.

@netlify

netlify bot commented Mar 5, 2026

Deploy Preview for shiki-next ready!

Name Link
🔨 Latest commit ffc4c72
🔍 Latest deploy log https://app.netlify.com/projects/shiki-next/deploys/69a9982deac4710008c3b62b
😎 Deploy Preview https://deploy-preview-1258--shiki-next.netlify.app

@netlify

netlify bot commented Mar 5, 2026

Deploy Preview for shiki-matsu ready!

Name Link
🔨 Latest commit ffc4c72
🔍 Latest deploy log https://app.netlify.com/projects/shiki-matsu/deploys/69a9982d8cca8700087e1736
😎 Deploy Preview https://deploy-preview-1258--shiki-matsu.netlify.app

@codecov

codecov bot commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.71%. Comparing base (69bec64) to head (ffc4c72).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1258      +/-   ##
==========================================
+ Coverage   89.53%   89.71%   +0.17%     
==========================================
  Files          79       79              
  Lines        3479     3519      +40     
  Branches     1009     1024      +15     
==========================================
+ Hits         3115     3157      +42     
+ Misses        328      326       -2     
  Partials       36       36              
