Feature/bench kdtree flann vs nanoflann#6438
rtkartista wants to merge 2 commits into PointCloudLibrary:master
Closes PointCloudLibrary#3860 (partial: search module benchmark).

Adds benchmarks/search/bench_kdtree.cpp covering tree construction, kNN query, and radius search for pcl::search::KdTree (FLANN) vs pcl::search::KdTreeNanoflann (added in 1.15.1 / PointCloudLibrary#6250).

Results on Legion 5 / i7-10750H / Ubuntu 24.04 / GCC 13 / Release:
- Build 500k pts: 134 ms (FLANN)
- kNN k=1: 0.875 us, k=20: 2.34 us (FLANN)
- Radius r=5m: 13,049 us (FLANN)

The benchmark is skipped silently if google-benchmark is not installed. It accepts an optional PCD path as argv[1] and falls back to a synthetic cloud.
mvieth left a comment
Thanks for the pull request. I have added a few questions/comments. Please also make sure that the checks pass; currently the clang-format check is failing.
For reference: I also did some benchmarks when I added KdTreeNanoflann: #6250
static void BM_FlannKdTree_Build(benchmark::State& state)
{
  auto cloud = makeCloud(static_cast<int>(state.range(0)));
Is there a reason why you do not use g_cloud here?
  state.SetItemsProcessed(state.iterations());
  state.SetLabel("FLANN kNN");
Why is it necessary to do this manually? Does this not happen automatically?
  std::vector<float> dists(k);

  for (auto _ : state)
    benchmark::DoNotOptimize(tree.nearestKSearch(query, k, indices, dists));
Not saying that you absolutely have to change this, but doing it for just a single query point is not so representative. Better would be using a few different query points. But it's your decision if you want to rewrite it or not (same applies to the radius search benchmark).
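One possible way to act on this suggestion is sketched below, using only the standard library. The names `pickQueryIndices` and `QueryCycler` are hypothetical and do not appear in the PR; the idea is to pick a fixed, seeded set of query indices once, outside the timed loop, and rotate through them on each iteration.

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the PR): choose `count` query indices
// from a cloud of `cloud_size` points. The fixed seed keeps runs repeatable.
std::vector<std::size_t> pickQueryIndices(std::size_t cloud_size,
                                          std::size_t count,
                                          unsigned seed = 42)
{
  std::vector<std::size_t> picks;
  if (cloud_size == 0)
    return picks; // nothing to sample from
  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::size_t> dist(0, cloud_size - 1);
  picks.reserve(count);
  for (std::size_t i = 0; i < count; ++i)
    picks.push_back(dist(rng));
  return picks;
}

// Hypothetical round-robin cursor over the chosen indices.
// Precondition: `indices` must not be empty.
class QueryCycler {
public:
  explicit QueryCycler(std::vector<std::size_t> indices)
  : indices_(std::move(indices)), next_(0)
  {}

  // Returns the next query index and wraps around at the end.
  std::size_t operator()()
  {
    const std::size_t idx = indices_[next_];
    next_ = (next_ + 1) % indices_.size();
    return idx;
  }

private:
  std::vector<std::size_t> indices_;
  std::size_t next_;
};
```

Inside the benchmark, the cycler would be constructed before `for (auto _ : state)` and called once per iteration to select the query point, so the reported time averages over several locations in the cloud rather than a single one. The same approach would apply to the radius search benchmark.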
{
  const float r = static_cast<float>(state.range(0));

  pcl::search::KdTree<pcl::PointXYZ> tree;
For the radius search, you have to consider that KdTree is sorting the results by default, while KdTreeNanoflann is not. So for a fair comparison, you should explicitly set both to either sorted or unsorted (but the same for both).
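To illustrate why the sorting flag matters for the comparison, here is a minimal brute-force sketch in plain C++. It is not how FLANN or nanoflann work internally; `Point3` and `radiusSearchBruteForce` are made-up names for this example. It only shows that sorted and unsorted searches return the same neighbors, while the sorted variant pays for an extra `std::sort` over the hits.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative point type (assumption; stands in for pcl::PointXYZ).
struct Point3 {
  float x, y, z;
};

typedef std::pair<std::size_t, float> Hit; // (index, squared distance)

// Comparator ordering hits by ascending squared distance.
inline bool closerFirst(const Hit& a, const Hit& b)
{
  return a.second < b.second;
}

// Brute-force radius search. With `sorted == false`, hits are returned in
// cloud order; with `sorted == true`, an extra sort by distance is applied.
std::vector<Hit> radiusSearchBruteForce(const std::vector<Point3>& cloud,
                                        const Point3& query,
                                        float radius,
                                        bool sorted)
{
  const float r2 = radius * radius;
  std::vector<Hit> hits;
  for (std::size_t i = 0; i < cloud.size(); ++i) {
    const float dx = cloud[i].x - query.x;
    const float dy = cloud[i].y - query.y;
    const float dz = cloud[i].z - query.z;
    const float d2 = dx * dx + dy * dy + dz * dz;
    if (d2 <= r2)
      hits.push_back(Hit(i, d2));
  }
  if (sorted)
    std::sort(hits.begin(), hits.end(), closerFirst); // O(m log m) extra work
  return hits;
}
```

With many neighbors per query (as with a large radius), that extra sort can account for a noticeable share of the measured time. In the actual benchmark, the flag would presumably be set through the `setSortedResults` member that `pcl::search::Search` provides, called with the same value on both trees before timing; whether `KdTreeNanoflann` exposes exactly that call is an assumption here.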
Closes #3860 (partial — search module benchmark)
What
Adds the first benchmark to the `search` module: `benchmarks/search/bench_kdtree.cpp`
Benchmarks `pcl::search::KdTree` (FLANN) vs `pcl::search::KdTreeNanoflann` across three axes: tree construction, kNN query, and radius search.
Results
Hardware: 12 threads / Ubuntu 24.04 / GCC 13 / `-O3 -mavx2`
Cloud: `table_scene_mug_stereo_textured.pcd` (307,200 pts) for query benchmarks; synthetic uniform random for build benchmarks
Repetitions: 3 — mean reported
Tree construction (ms, lower is better)
kNN query (µs, lower is better)
Radius search (µs, lower is better)
nanoflann is 1.2–1.4× faster to build, 1.7–3.4× faster on kNN, and 8.4× faster on radius search, making it the clear default for localization and mapping pipelines that call `radiusSearch` in tight loops (normal estimation, ICP, GICP, NDT).
Notes
- Accepts an optional PCD path as `argv[1]`; falls back to synthetic 100k-pt cloud
- [...] `PCL_HAS_NANOFLANN` is defined
- [...] `registration`/`features` benchmarks in follow-up PRs