Add File Header Management (License / Copyright) #5264

Open
arashi01 wants to merge 3 commits into scalameta:main from arashi01:file-header-management

@arashi01

@arashi01 arashi01 commented Apr 5, 2026

Adds copyright/header management. Revisits #705

Whilst this issue was brought up previously, hopefully it can be reconsidered. In my view, a formatter/linter enforces canonical source file form. Headers are part of that form. With header management in the build tool (sbt-header), identical source files under identical .scalafmt.conf produce different results depending on whether the project uses sbt, Mill, Gradle, scala-cli, Maven etc. Moving this to the formatter guarantees uniform sources regardless of build toolchain - the same guarantee scalafmt already provides for indentation, line endings, and import order.

This is one possible take on the problem:

  • Went with pre-format text insertion (in Scalafmt.doFormat, before doFormatOne) since:
    • Headers are not AST constructs - text-level insertion avoids coupling to the token/tree layer
    • Idempotent: string equality short-circuits when header is correct
    • --check works with no additional code
    • Post-format breaks idempotency (formatter's BOF split decisions differ with/without header).
  • Added a fileHeader config section.
  • Tried to address the reproducibility concern from #705 ("Feature request: manage copyright notice"):
    • year defaults to current calendar year but can be pinned explicitly in config
    • since sets range start: since = 2020 with current year 2026 produces 2020-2026

Example config:

fileHeader {
  license = Apache-2.0 # 12 SPDX licenses supported
  copyrightHolder = "My Organization"
  since = 2020
  style = block  # block | line | framed
}
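
The year handling described above (year defaults to the current year, since sets the start of a range) could be sketched roughly as follows. This is only an illustration: `YearRange` and `render` are hypothetical names, not code from this PR, and collapsing the range when since equals the year is an assumption.

```scala
// Hypothetical helper for illustration; not part of scalafmt or of this PR.
object YearRange {

  /** Renders the year part of a copyright line.
    *
    * @param since optional first year of the range (the `since` setting)
    * @param year  the year to stamp: pinned in config, or the current year
    */
  def render(since: Option[Int], year: Int): String =
    since match {
      // A start year earlier than the stamped year yields a range.
      case Some(start) if start < year => s"$start-$year"
      // Assumption: with no start year, or a start year that is not earlier,
      // only the single year is written.
      case _ => year.toString
    }
}
```

With since = 2020 and year 2026, `YearRange.render(Some(2020), 2026)` gives "2020-2026", matching the example in the description.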
  

@kitbellew
Collaborator

@arashi01 thank you for your contribution. unfortunately, this is not at all how i think it should be done.

if you are inclined to pursue this further, please let me know, and we can brainstorm ways to design it.

# Conflicts:
#	scalafmt-core/shared/src/main/scala/org/scalafmt/config/ScalafmtConfig.scala
@arashi01
Author

arashi01 commented Apr 5, 2026

@kitbellew Happy to do so. Let me know your preferred approach

@kitbellew
Collaborator

the preferred approach is to do nothing and direct people to dedicated tools like https://github.com/sbt/sbt-header.

Happy to consider adding logic to scalafmt that avoids formatting a comment if it's the first token in the file.

@arashi01
Author

arashi01 commented Apr 5, 2026

So what you are saying is that there is no approach that you would find acceptable incorporating header formatting into a formatting tool?

@kitbellew
Collaborator

> So what you are saying is that there is no approach that you would find acceptable incorporating header formatting into a formatting tool?

why is it needed here if there's a dedicated tool? adding unrelated disclaimers is not formatting.

@kitbellew
Collaborator

to be more precise: i would be ok with the following:

  • the change is minimal, and the only place where it is applied is in FormatWriter (and some config setting)
  • there are no predefined license templates
  • the entire template is specified in the config via a multi-line string
  • there's one allowed template parameter, for the year
  • there's detection if the comment at the top of the file matches the template

Here's the tricky part: the change should only apply to files which have changed, so you can conceivably do it in CLI since you know if the tool was called with a --diff or --diff-branch parameter, but not in sbt-scalafmt or otherwise.
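
A minimal sketch of the template idea in the list above: a full header template with a single year parameter, plus detection of whether an existing top-of-file comment already matches it. The `{year}` placeholder syntax and all names here are assumptions for illustration; nothing like this exists in scalafmt today.

```scala
import java.util.regex.Pattern

// Hypothetical helper for illustration; not an existing scalafmt API.
object HeaderTemplate {
  // Assumed placeholder syntax for the single allowed template parameter.
  private val Placeholder = "{year}"

  /** Fills the year into the template text taken from config. */
  def render(template: String, year: String): String =
    template.replace(Placeholder, year)

  /** Builds a regex that matches the template literally, accepting any
    * four-digit year or year range where the placeholder stands.
    */
  def pattern(template: String): Pattern = {
    val literalParts = template.split(Pattern.quote(Placeholder), -1)
    val yearRegex = """\d{4}(?:-\d{4})?"""
    Pattern.compile(literalParts.map(p => Pattern.quote(p)).mkString(yearRegex))
  }

  /** True if the comment already is this header, whatever year it carries. */
  def matches(template: String, comment: String): Boolean =
    pattern(template).matcher(comment.trim).matches()
}
```

Matching against a pattern rather than the rendered text means a header that says 2020-2025 is still recognised as the license in 2026, so the decision whether to touch the year can be made separately.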

@arashi01
Author

arashi01 commented Apr 5, 2026

@kitbellew thanks for your feedback.

> > So what you are saying is that there is no approach that you would find acceptable incorporating header formatting into a formatting tool?
>
> why is it needed here if there's a dedicated tool? adding unrelated disclaimers is not formatting.

My thoughts on this:

  • There is no dedicated tool. There are build-tool plugins - sbt-header for sbt, license-maven-plugin for Maven, Spotless for Gradle. Each is tied to its build tool. Mill, Bazel, and scala-cli users have nothing.
  • scalafmt is build-tool-agnostic by design. Header correctness shouldn't depend on which build tool a project uses.
  • scalafmt already normalises lineEndings, encoding, import ordering, trailing commas, and syntax rewrites (ProcedureSyntax, AvoidInfix, RedundantBraces). A file header is the same category - enforcing canonical source file form.

> the change should only apply to files which have changed

This is already the case no? When the header is correct, FileHeaderOps returns the content unchanged - the file isn't modified. The only time a file is touched is when the header is actually wrong. This is the same as every other scalafmt feature: rewrite.rules = [Imports, SortModifiers] processes every file in scope and only modifies those that don't already match.

>   • there are no predefined license templates
>   • the entire template is specified in the config via a multi-line string

Happy to drop the SPDX templates and require the full header text in config.

> the change is minimal, and the only place where it is applied is in FormatWriter

I went with the pre-format text approach because headers are not AST constructs. FormatWriter operates on tokens after parsing and inserting a header there would need synthesising tokens that didn't come from the parser? A text-level pre-pass before parsing seemed less invasive. Happy to discuss alternatives.

@kitbellew
Collaborator

> > the change should only apply to files which have changed
>
> This is already the case no? When the header is correct, FileHeaderOps returns the content unchanged - the file isn't modified. The only time a file is touched is when the header is actually wrong. This is the same as every other scalafmt feature: rewrite.rules = [Imports, SortModifiers] processes every file in scope and only modifies those that don't already match.

I should clarify. What you are describing is git behaviour; if the file hasn't changed after scalafmt is done with it, git doesn't include it in a patch. scalafmt doesn't know it is changing anything; it simply formats and writes a file completely, and most of the time the contents end up the same (but only git knows that).

These headers frequently include some reference to a year, and that year presumably changes as files get updated; if one file says "Copyright 1970-2025", I don't see a reason to update that line in 2026 unless the file has changed materially in other ways. Thus, scalafmt should not change the year in a file which wasn't modified, thus it needs to know which files are modified.
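
The rule in the paragraph above (leave the year alone unless the file was modified) could be sketched like this. `YearPolicy` and its parameters are hypothetical; in particular, `fileChanged` is assumed to come from the caller, for example a CLI run with --diff, since the formatter itself does not know it.

```scala
// Hypothetical helper for illustration; not an existing scalafmt API.
object YearPolicy {

  /** Chooses the year text to write into the header.
    *
    * @param existing    year text already present in the file's header, if any
    * @param fileChanged whether the caller reports the file as modified
    * @param current     year text that a fresh header would carry
    */
  def yearToWrite(existing: Option[String], fileChanged: Boolean, current: String): String =
    existing match {
      // An untouched file keeps whatever year its header already has.
      case Some(year) if !fileChanged => year
      // A modified file, or a file without a header, gets the current text.
      case _ => current
    }
}
```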

> > the change is minimal, and the only place where it is applied is in FormatWriter
>
> I went with the pre-format text approach because headers are not AST constructs. FormatWriter operates on tokens after parsing and inserting a header there would need synthesising tokens that didn't come from the parser? A text-level pre-pass before parsing seemed less invasive. Happy to discuss alternatives.

  • The proposed approach is inefficient, and the code reimplements some existing logic.
  • FormatWriter operates on tokens but it generates text; it's pretty easy to check the first token after BOF and optional Shebang, and if it's not a Comment, write out a license into the output, and if it is a Comment, check first if it's already a license; there's no need to add any complex logic with hundreds of lines of code.
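
The first-token check described in the second bullet might look roughly like the sketch below. It uses a simplified, hypothetical token model rather than the scalameta tokens that FormatWriter actually sees, ignores whitespace tokens, and assumes that a leading comment which is not the license is treated the same as no comment at all.

```scala
// Simplified stand-in for illustration; not scalafmt's FormatWriter.
object HeaderCheck {
  // Hypothetical token model; real tokens come from scalameta.
  sealed trait Tok
  case object BOF extends Tok
  final case class Shebang(text: String) extends Tok
  final case class Comment(text: String) extends Tok
  final case class Other(text: String) extends Tok

  sealed trait Action
  case object Keep extends Action         // leading comment already is the license
  case object InsertHeader extends Action // write the license into the output

  /** Looks at the first token after BOF and an optional shebang. */
  def decide(tokens: List[Tok], isLicense: String => Boolean): Action = {
    val afterPrefix = tokens.dropWhile {
      case BOF | Shebang(_) => true
      case _                => false
    }
    afterPrefix.headOption match {
      case Some(Comment(text)) if isLicense(text) => Keep
      // Assumption: any other first token, including an unrelated comment,
      // means the header still has to be written.
      case _ => InsertHeader
    }
  }
}
```

The `isLicense` predicate is left abstract here; it would be whatever check decides that a comment matches the configured template.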

P.S. Did you write the code yourself? :)

@arashi01
Author

arashi01 commented Apr 5, 2026

> I should clarify. What you are describing is git behaviour; if the file hasn't changed after scalafmt is done with it, git doesn't include it in a patch. scalafmt doesn't know it is changing anything; it simply formats and writes a file completely, and most of the time the contents end up the same (but only git knows that).
>
> These headers frequently include some reference to a year, and that year presumably changes as files get updated; if one file says "Copyright 1970-2025", I don't see a reason to update that line in 2026 unless the file has changed materially in other ways. Thus, scalafmt should not change the year in a file which wasn't modified, thus it needs to know which files are modified.

That's fairly easy to address.

> FormatWriter operates on tokens but it generates text; it's pretty easy to check the first token after BOF and optional Shebang, and if it's not a Comment, write out a license into the output, and if it is a Comment, check first if it's already a license; there's no need to add any complex logic with hundreds of lines of code.

If the header is provided as static text in the config, then that is feasible. With the current flexibility and config, less so.

> P.S. Did you write the code yourself? :)

Very much so. Why, is it that terrible? 😂
Given my lack of familiarity with this project, and that the concept does function, I didn't think so myself 😂
