Add `buf scaffold` command to setup buf.yaml in a git repository#4418

Draft

cyinma wants to merge 2 commits intomainfrom

cyinma/buf-scaffold-cmd

Member

cyinma commented Mar 30, 2026


          Add buf scaffold command to setup buf.yaml in a git repository

b6f3aee

Contributor

github-actions bot commented Mar 30, 2026 •

edited

Loading

The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`✅ passed`	`✅ passed`	`✅ passed`	Mar 30, 2026, 5:01 PM


          fix empty dir not tracked

31ba66a

cyinma commented

View reviewed changes

cmd/buf/buf.go

               			push.NewCommand("push", builder),
               			convert.NewCommand("convert", builder),
               			curl.NewCommand("curl", builder),
+              			scaffold.NewCommand("scaffold", builder),

Member Author

cyinma Mar 30, 2026

I am thinking should this be in beta first, but I don't think we want this to be buf config scaffold

cmd/buf/internal/command/scaffold/scaffold.go

+              	if exists {
+              		return fmt.Errorf("buf.yaml already exists in %s, will not overwrite", dirPath)
+              	}
+              	fsys := os.DirFS(dirPath)

Member Author

cyinma Mar 30, 2026

not using the storage.ReadBucket abstraction since in the end we need to read the raw content for parser

Member

doriable Apr 7, 2026

I think we should still use storageosProvider.NewReadBucket, just for consistency in the implementation patterns for these code paths (e.g. handling the symlinks, etc.).

doriable reviewed

View reviewed changes

cmd/buf/internal/command/scaffold/scaffold.go

+              	if exists {
+              		return fmt.Errorf("buf.yaml already exists in %s, will not overwrite", dirPath)
+              	}
+              	fsys := os.DirFS(dirPath)

Member

doriable Apr 7, 2026

I think we should still use storageosProvider.NewReadBucket, just for consistency in the implementation patterns for these code paths (e.g. handling the symlinks, etc.).

cmd/buf/internal/command/scaffold/scaffold.go

+              // walkAndParseProtoFiles walks the filesystem for .proto files and parses each one.
+              // Files that fail to parse or have no package declaration are silently skipped.
+              func walkAndParseProtoFiles(fsys fs.FS) ([]protoFileInfo, error) {

Member

doriable Apr 7, 2026

So the current behaviour is, on the walk, for each file, we synchronously parse using parser.Parse directly. I think it might make sense to:

Walk, collect each proto source file along the way, dump them into a source.Map so that you get the source.Opener implementation and create a queries.AST for each file (https://github.com/bufbuild/protocompile/blob/main/experimental/incremental/queries/ast.go)
Then, you can run incremental.Run on all the queries, and this will parse/resolve the AST in parallel
And then inferModuleRoots's signature can just be func inferModuleRoots(files ...ast.File) []string

cmd/buf/internal/command/scaffold/scaffold.go

Comment on lines +173 to +178

+              // inferModuleRoots determines module root directories by deriving each file's
+              // expected import path from its package declaration, then comparing against its
+              // file path. The difference is the module root.
+              //
+              // Returns sorted, unique module root paths. Returns nil if no roots can be inferred.
+              func inferModuleRoots(files []protoFileInfo) []string {

Member

doriable Apr 7, 2026

So I think here, we can get a bit of a "freebie" based on existing imports. I think the algorithm I would do is, assuming inferModuleRoots(files ...ast.File) []string:

Iterate through each file and gather up the imports in a set
Then you iterate through files again and compare the normalised Path (which should be what you get from the bucket) against the import paths that we have (I think the way to do this is to compare "tail to front", e.g. for local/path/to/abc/v1/xyz.proto, you would find the closest match, starting from xyz.proto -> v1/xyz.proto -> abc/v1/xyz.proto, etc.)
For any files that have not been imported at all, we can run a fallback to the package name check

Otherwise this is entirely dependent on well-formed package names.

cmd/buf/internal/command/scaffold/scaffold.go

+              	if err != nil {
+              		return err
+              	}
+              	return bufcli.PutBufYAMLFileForDirPath(ctx, dirPath, bufYAMLFile)

Member

doriable Apr 7, 2026

I think before we write the bufYAMLFile to disk permanently, we should validate that this produces a workspace that compiles, which is a little tricky. With the way our APIs are currently structured, I think I would temporarily write the buf.yaml to disk, and then use Controller to attempt to GetImage based on the dirPath (since buf.yaml is written in the same place as dirPath, which would be the "input" in this case), and if it succeds, great, keep it written and return. If it fails, we would not write the the buf.yaml, but maybe we have a verbose option of providing the roots we found or something? Would need to look into the UX of the failing case a little but, but yeah... I think some validation step is useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet