
feat(search): add ocis search optimize CLI command #12136

Merged
mmattel merged 2 commits into owncloud:master from paul43210:feat/search-optimize-command
Apr 17, 2026

@paul43210
Contributor

Summary

  • Adds a new ocis search optimize CLI command that compacts the Bleve search index by merging segments (ForceMerge), improving query performance without re-indexing content
  • The command sends a gRPC call to the running search service rather than opening the index directly, since Bleve does not support concurrent access from multiple processes
  • Uses google.protobuf.Empty for the OptimizeIndex RPC to avoid creating new message types

Background

After bulk re-indexing (e.g. ocis search index), the Bleve index can accumulate many small segments that degrade query performance. Currently the only way to compact them is programmatically via the ForceMerge API added in #12104. This PR exposes that capability as a simple CLI command so administrators can trigger optimization on demand.

Requested by @mklos-kw in the context of #12104.

Changes

| File | What |
| --- | --- |
| search.proto | Add OptimizeIndex RPC (Empty → Empty) |
| search.pb.micro.go | Generated go-micro client/handler for OptimizeIndex |
| search.pb.web.go | Generated web handler for OptimizeIndex |
| service.go (search) | Add OptimizeIndex to Searcher interface + implementation |
| service.go (grpc) | gRPC handler delegates to Searcher.OptimizeIndex |
| optimize.go | New CLI command: connects to running service, calls OptimizeIndex with 10-min timeout |
| root.go | Register Optimize in command list |
| mocks/ | Updated mockery mocks for new interface method |
| changelog/ | Enhancement changelog entry |

Usage

ocis search optimize
# Output: index optimization complete

Test plan

  • make -C services/search test — all 35 tests pass
  • Live-tested on production server: command connects to running search service, triggers ForceMerge, logs show full lifecycle (requested → optimizing → complete), returns in ~31ms on already-compacted index

🤖 Generated with Claude Code

@mmattel mmattel requested review from kobergj and mklos-kw March 22, 2026 16:40
Comment thread changelog/unreleased/enhancement-search-optimize-command.md Outdated
@mmattel mmattel force-pushed the feat/search-optimize-command branch from 577b800 to 8658473 Compare March 25, 2026 14:55
@paul43210 paul43210 force-pushed the feat/search-optimize-command branch from 8658473 to a6c10a2 Compare March 26, 2026 15:45
@mmattel mmattel force-pushed the feat/search-optimize-command branch from a6c10a2 to b0abcb9 Compare March 26, 2026 16:03
@paul43210 paul43210 force-pushed the feat/search-optimize-command branch from b0abcb9 to 55d1348 Compare April 9, 2026 01:00
@mmattel mmattel force-pushed the feat/search-optimize-command branch from 55d1348 to 8374d95 Compare April 9, 2026 08:14
@jvillafanez
Member

I'm more inclined to keep it "CLI-only". Exposing this as part of the gRPC interface doesn't seem like a good idea: users of the service shouldn't have to care about optimizing the service or about its internals. In addition, the optimization might take a while (likely minutes, if not hours), and during that time the search service will be locked and unusable (at least with Bleve).

Instead, you can create an engine factory (moving the related code out of the gRPC service, https://github.com/owncloud/ocis/blob/master/services/search/pkg/service/grpc/v0/service.go#L45-L69) and use it inside your command to get access to the search engine. You can run the "optimize" function from there.

> Live-tested on production server: command connects to running search service, triggers ForceMerge, logs show full lifecycle (requested → optimizing → complete), returns in ~31ms on already-compacted index

That seems like an ideal scenario that won't happen in practice. We need to consider the worst case, and I don't think we can guarantee that the execution will ALWAYS complete in less than a second. We should therefore treat this as a long-running, blocking operation. This is why I prefer to ensure that the optimization is run through the command line only, so the admin can trigger it at the best time and minimize the disruption.
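To illustrate the long-running concern, here is a minimal sketch of bounding an optimize run with a deadline. It is not code from this PR: `optimizeFunc`, `runWithDeadline`, `fakeMerge`, `simulate`, and `timedOut` are hypothetical names, and the merge is simulated with a timer instead of a real Bleve index.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// optimizeFunc is a hypothetical stand-in for an engine's Optimize method.
type optimizeFunc func(ctx context.Context) error

// runWithDeadline runs opt with a context that expires after limit.
func runWithDeadline(opt optimizeFunc, limit time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()
	return opt(ctx)
}

// fakeMerge simulates a segment merge that takes `work` to finish and
// aborts early when the context is cancelled or its deadline passes.
func fakeMerge(work time.Duration) optimizeFunc {
	return func(ctx context.Context) error {
		select {
		case <-time.After(work):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// simulate runs a fake merge of workMs milliseconds under a limitMs deadline.
func simulate(workMs, limitMs int) error {
	work := time.Duration(workMs) * time.Millisecond
	limit := time.Duration(limitMs) * time.Millisecond
	return runWithDeadline(fakeMerge(work), limit)
}

// timedOut reports whether err was caused by the deadline expiring.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	// An already-compacted index: the merge finishes well within the limit.
	fmt.Println("fast run error:", simulate(5, 500))
	// A worst-case index: the merge outlives the limit and is aborted.
	fmt.Println("slow run timed out:", timedOut(simulate(500, 5)))
}
```

A deadline only caps how long the caller waits. Whether the underlying merge actually stops depends on the engine honoring context cancellation, which this sketch assumes.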


Side note, I've seen you've added an optional Optimizer interface... I'd include it as part of the Engine interface.
Not all the methods of the Engine interface need to be exposed in the service, and even if we could have an engine that doesn't support optimization (not planned at the moment), it should be easy to implement it as a noop or just return an error.
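A rough sketch of that suggestion follows, assuming a trimmed-down `Engine` interface. The real interface in the search service has more methods, and `ErrOptimizeUnsupported`, `mergingEngine`, `staticEngine`, and `optimizeAll` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrOptimizeUnsupported is a hypothetical sentinel error for engines
// that cannot compact their index.
var ErrOptimizeUnsupported = errors.New("optimize not supported by this engine")

// Engine is an illustrative, reduced version of a search engine interface
// with Optimize as a regular member instead of a separate optional interface.
type Engine interface {
	DocCount() (uint64, error)
	Optimize(ctx context.Context) error
}

// mergingEngine pretends to merge all segments into one (hypothetical).
type mergingEngine struct{ segments int }

func (e *mergingEngine) DocCount() (uint64, error) { return 0, nil }

func (e *mergingEngine) Optimize(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err // respect cancellation before starting
	}
	e.segments = 1
	return nil
}

// staticEngine is a hypothetical engine without compaction support.
type staticEngine struct{}

func (staticEngine) DocCount() (uint64, error) { return 0, nil }

func (staticEngine) Optimize(context.Context) error { return ErrOptimizeUnsupported }

// optimizeAll calls Optimize on every engine and collects the results.
func optimizeAll(engines []Engine) []error {
	errs := make([]error, 0, len(engines))
	for _, e := range engines {
		errs = append(errs, e.Optimize(context.Background()))
	}
	return errs
}

func main() {
	engines := []Engine{&mergingEngine{segments: 12}, staticEngine{}}
	for i, err := range optimizeAll(engines) {
		fmt.Printf("%T: %v\n", engines[i], err)
	}
}
```

With `Optimize` in the interface, callers need no type assertion. An engine that cannot compact simply reports that through its return value.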

@paul43210 paul43210 force-pushed the feat/search-optimize-command branch 2 times, most recently from 0f2f458 to e5b1895 Compare April 15, 2026 18:45
@paul43210
Contributor Author

Thanks for the detailed feedback @jvillafanez — all three points addressed:

  1. CLI-only: Removed the gRPC OptimizeIndex RPC entirely. No proto changes, no HTTP endpoint. The command opens the index directly, so it can only be run when the admin explicitly triggers it.

  2. Engine factory: Extracted the engine creation logic from service.go into NewEngineFromConfig() in engine.go. Both the gRPC service handler and the CLI command now use this factory. The gRPC handler is simplified to a single call.

  3. Optimize in Engine interface: Removed the separate Optimizer interface. Optimize(ctx context.Context) error is now part of the Engine interface directly.

Good call on the blocking concern — with direct index access, the admin controls when it runs and there's no risk of locking the running service.
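The shape of the three changes can be sketched as below. This is an illustration under assumptions, not the merged code: only the names `NewEngineFromConfig`, `Engine`, and `Optimize` come from the discussion, while the `Config` fields, the `bleveEngine` stub, the factory signature, and `runOptimize` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// Config is a hypothetical, reduced stand-in for the search service config.
type Config struct {
	EngineType string
	Datapath   string
}

// Engine is reduced here to the single method relevant to this sketch.
type Engine interface {
	Optimize(ctx context.Context) error
}

// bleveEngine is a stub; a real engine would open the index at path.
type bleveEngine struct{ path string }

func (b *bleveEngine) Optimize(ctx context.Context) error {
	// A real implementation would merge segments here; the stub only
	// checks that the context is still valid.
	return ctx.Err()
}

// NewEngineFromConfig builds an engine from configuration so the service
// and the CLI command share one construction path. The signature is assumed.
func NewEngineFromConfig(cfg Config) (Engine, error) {
	switch cfg.EngineType {
	case "bleve":
		return &bleveEngine{path: cfg.Datapath}, nil
	default:
		return nil, fmt.Errorf("unknown search engine: %q", cfg.EngineType)
	}
}

// runOptimize is what the body of a CLI-only command might look like:
// build the engine directly and optimize it, with no RPC involved.
func runOptimize(cfg Config) error {
	eng, err := NewEngineFromConfig(cfg)
	if err != nil {
		return err
	}
	return eng.Optimize(context.Background())
}

func main() {
	cfg := Config{EngineType: "bleve", Datapath: "/tmp/example-index"}
	if err := runOptimize(cfg); err != nil {
		fmt.Println("optimize failed:", err)
		return
	}
	fmt.Println("index optimization complete")
}
```

Keeping construction in one factory means the service start-up path and the command cannot drift apart in how they open the index.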

Comment thread services/search/pkg/search/service.go Outdated
@jvillafanez
Member

In general, the code looks better now 👍
The e2e-search tests are failing and we need to fix them before merging. I've commented on a possible spot, although I'm not entirely sure that's the cause of the failures. There were some changes last week (#12187), so you might also want to update your code.

Add a CLI-only optimize command that compacts the Bleve search index by
merging segments. The command opens the index directly via a new engine
factory (NewEngineFromConfig), without requiring the running gRPC service.

Key changes per review feedback from jvillafanez:
- CLI-only: no gRPC endpoint, no proto changes — admin triggers it
  directly when disruption is acceptable
- Engine factory: extracted engine creation from the gRPC handler into
  NewEngineFromConfig, reusable by both the service and CLI commands
- Optimize() merged into Engine interface: no separate Optimizer
  interface, non-supporting engines can return an error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Paul Faure <paul@faure.ca>
@paul43210 paul43210 force-pushed the feat/search-optimize-command branch from e5b1895 to 3d6009f Compare April 16, 2026 12:25
@mmattel mmattel enabled auto-merge (squash) April 17, 2026 08:41
@mmattel mmattel merged commit a177608 into owncloud:master Apr 17, 2026
54 checks passed