From fbc193adf5d4dec31c7daeb3883d34ab94547ac0 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Wed, 18 Mar 2026 15:46:52 +0000 Subject: [PATCH 1/2] Initial plan From 53aa84baa14d7d4cdf98ea4dd0c2a5a427c6fab2 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Wed, 18 Mar 2026 16:03:27 +0000 Subject: [PATCH 2/2] Add README.md documentation to all AliceVision modules without existing documentation Co-authored-by: fabiencastan <153585+fabiencastan@users.noreply.github.com> --- src/aliceVision/calibration/README.md | 49 ++++++++++++++ src/aliceVision/cmdline/README.md | 39 +++++++++++ src/aliceVision/colorHarmonization/README.md | 38 +++++++++++ src/aliceVision/dataio/README.md | 52 ++++++++++++++ src/aliceVision/depthMap/README.md | 49 ++++++++++++++ src/aliceVision/featureEngine/README.md | 36 ++++++++++ src/aliceVision/fuseCut/README.md | 43 ++++++++++++ src/aliceVision/gpu/README.md | 40 +++++++++++ src/aliceVision/graph/README.md | 36 ++++++++++ src/aliceVision/hdr/README.md | 52 ++++++++++++++ src/aliceVision/imageMasking/README.md | 42 ++++++++++++ src/aliceVision/imageMatching/README.md | 41 ++++++++++++ src/aliceVision/imageProcessing/README.md | 38 +++++++++++ .../lensCorrectionProfile/README.md | 48 +++++++++++++ src/aliceVision/lightingEstimation/README.md | 48 +++++++++++++ .../matchingImageCollection/README.md | 45 +++++++++++++ src/aliceVision/mesh/README.md | 65 ++++++++++++++++++ src/aliceVision/mvsData/README.md | 43 ++++++++++++ src/aliceVision/mvsUtils/README.md | 41 ++++++++++++ src/aliceVision/panorama/README.md | 47 +++++++++++++ src/aliceVision/photometricStereo/README.md | 52 ++++++++++++++ src/aliceVision/python/README.md | 15 +++++ src/aliceVision/segmentation/README.md | 53 +++++++++++++++ src/aliceVision/sensorDB/README.md | 38 +++++++++++ src/aliceVision/sfmDataIO/README.md | 67 +++++++++++++++++++ 
src/aliceVision/sfmMvsUtils/README.md | 21 ++++++ src/aliceVision/sphereDetection/README.md | 40 +++++++++++ src/aliceVision/stl/README.md | 52 ++++++++++++++ src/aliceVision/system/README.md | 66 ++++++++++++++++++ src/aliceVision/utils/README.md | 42 ++++++++++++ src/aliceVision/voctree/README.md | 56 ++++++++++++++++ 31 files changed, 1394 insertions(+) create mode 100644 src/aliceVision/calibration/README.md create mode 100644 src/aliceVision/cmdline/README.md create mode 100644 src/aliceVision/colorHarmonization/README.md create mode 100644 src/aliceVision/dataio/README.md create mode 100644 src/aliceVision/depthMap/README.md create mode 100644 src/aliceVision/featureEngine/README.md create mode 100644 src/aliceVision/fuseCut/README.md create mode 100644 src/aliceVision/gpu/README.md create mode 100644 src/aliceVision/graph/README.md create mode 100644 src/aliceVision/hdr/README.md create mode 100644 src/aliceVision/imageMasking/README.md create mode 100644 src/aliceVision/imageMatching/README.md create mode 100644 src/aliceVision/imageProcessing/README.md create mode 100644 src/aliceVision/lensCorrectionProfile/README.md create mode 100644 src/aliceVision/lightingEstimation/README.md create mode 100644 src/aliceVision/matchingImageCollection/README.md create mode 100644 src/aliceVision/mesh/README.md create mode 100644 src/aliceVision/mvsData/README.md create mode 100644 src/aliceVision/mvsUtils/README.md create mode 100644 src/aliceVision/panorama/README.md create mode 100644 src/aliceVision/photometricStereo/README.md create mode 100644 src/aliceVision/python/README.md create mode 100644 src/aliceVision/segmentation/README.md create mode 100644 src/aliceVision/sensorDB/README.md create mode 100644 src/aliceVision/sfmDataIO/README.md create mode 100644 src/aliceVision/sfmMvsUtils/README.md create mode 100644 src/aliceVision/sphereDetection/README.md create mode 100644 src/aliceVision/stl/README.md create mode 100644 src/aliceVision/system/README.md create 
mode 100644 src/aliceVision/utils/README.md create mode 100644 src/aliceVision/voctree/README.md diff --git a/src/aliceVision/calibration/README.md b/src/aliceVision/calibration/README.md new file mode 100644 index 0000000000..e91da16670 --- /dev/null +++ b/src/aliceVision/calibration/README.md @@ -0,0 +1,49 @@ +# calibration + +This module provides tools for camera calibration, including checkerboard detection and lens distortion estimation. + +## Checker Detection + +The `CheckerDetector` class detects checkerboard patterns in images. It supports: + +- Heavily distorted images +- Blurred images +- Partially occluded checkerboards (even occluded at the center) +- Checkerboards without size information +- Simple and nested grids (when centered on the image center) + +A checkerboard corner is described by: +- Its center position +- Its two principal dimensions +- A scale of detection (higher scale means better detection quality) + +The detection is based on the following references: [ROCHADE], [Geiger], [Chen], [Liu], [Abdulrahman], [Bok]. 
+ +```cpp +aliceVision::calibration::CheckerDetector detector; +// detect checkerboard corners in an image +detector.process(image); +// retrieve detected corners and boards +const auto& corners = detector.getCorners(); +const auto& boards = detector.getBoards(); +``` + +## Distortion Estimation + +Several methods are provided to estimate lens distortion from calibration data: + +- `distortionEstimationGeometry`: estimates distortion from geometric constraints +- `distortionEstimationLine`: estimates distortion from lines in the scene +- `distortionEstimationPoint`: estimates distortion from point correspondences + +Each method produces a `Statistics` struct describing the quality of the estimation: + +```cpp +struct Statistics { + double mean; + double stddev; + double median; + double lastDecile; + double max; +}; +``` diff --git a/src/aliceVision/cmdline/README.md b/src/aliceVision/cmdline/README.md new file mode 100644 index 0000000000..119d04ad86 --- /dev/null +++ b/src/aliceVision/cmdline/README.md @@ -0,0 +1,39 @@ +# cmdline + +This module provides utilities for building command-line applications in AliceVision. + +It includes macros and helper functions to simplify the setup of command-line tools with consistent error handling, logging, and option validation. + +## Macros + +### `ALICEVISION_COMMANDLINE_START` / `ALICEVISION_COMMANDLINE_END` + +These macros wrap the main body of a command-line tool, providing: + +- Automatic timing of the command execution +- Consistent error handling for `std::exception` and unknown exceptions +- Standardized success and failure return codes + +```cpp +int main(int argc, char** argv) +{ + ALICEVISION_COMMANDLINE_START + + // ... parse options and run the algorithm ... + + ALICEVISION_COMMANDLINE_END +} +``` + +## Option Validation + +The `optInRange(min, max, opt_name)` helper returns a validation function that checks whether a command-line option value falls within a specified range. 
It integrates with Boost.Program_options: + +```cpp +po::options_description params("Parameters"); +params.add_options() + ("threshold", po::value<float>()->notifier(aliceVision::optInRange(0.f, 1.f, "threshold")), + "Threshold value in [0, 1]."); +``` + +If the value is outside the range, a `boost::program_options::validation_error` is thrown with an appropriate message. diff --git a/src/aliceVision/colorHarmonization/README.md b/src/aliceVision/colorHarmonization/README.md new file mode 100644 index 0000000000..df59e4fb51 --- /dev/null +++ b/src/aliceVision/colorHarmonization/README.md @@ -0,0 +1,38 @@ +# colorHarmonization + +This module provides tools for color harmonization across a set of images. It adjusts the color/gain of images so that overlapping regions share a consistent appearance. + +## Overview + +Color harmonization is the process of making color-related properties (gain, offset) consistent across a set of images that share common content. This is particularly important in photogrammetry pipelines where many overlapping images are combined. + +The approach is based on L∞ optimization of pairwise color histogram differences across an image graph, as described in: + +> [1] P. Moulon, B. Duisit, P. Monasse. *Global Multiple-View Color Consistency.* CVMP 2013. 
+ +## Common Data Strategies + +The module defines a base class `CommonDataByPair` and three concrete strategies for computing the overlapping regions (masks) between image pairs: + +- `CommonDataByPair_fullFrame`: uses the entire image frame as the overlap mask +- `CommonDataByPair_matchedPoints`: uses matched feature points to define the region of interest +- `CommonDataByPair_vldSegment`: uses VLD (Virtual Line Descriptor) segments to define the overlap region + +```cpp +// Example: using matched points to compute color histograms +std::unique_ptr dataProvider = + std::make_unique(leftImagePath, rightImagePath, matches); + +image::Image maskLeft, maskRight; +dataProvider->computeMask(maskLeft, maskRight); +``` + +## Gain/Offset Constraint Builder + +The `GainOffsetConstraintBuilder` class builds a linear program that enforces consistent gain and offset parameters across image pairs, minimizing the L∞ norm of histogram alignment errors. + +## API + +- `CommonDataByPair::computeMask(maskLeft, maskRight)` — Compute binary masks for the two images. +- `CommonDataByPair::computeHisto(histo, mask, channelIndex, image)` — Compute a color histogram for the masked region of an image channel. +- `GainOffsetConstraintBuilder` — Build linear constraints for gain/offset optimization. diff --git a/src/aliceVision/dataio/README.md b/src/aliceVision/dataio/README.md new file mode 100644 index 0000000000..3d0f59b7c6 --- /dev/null +++ b/src/aliceVision/dataio/README.md @@ -0,0 +1,52 @@ +# dataio + +This module provides a feed-based interface for reading image data from various sources (image sequences, videos, and E57 point cloud files) in a unified way. + +## Overview + +The `dataio` module abstracts the input source of images so that downstream algorithms can work with any of the supported media types without modification. 
+ +## Feed Interface + +The `IFeed` abstract base class defines the interface for all data feeds: + +```cpp +class IFeed { +public: + virtual bool isInit() const = 0; + virtual bool readImage(image::Image& imageRGB, + camera::Pinhole& camIntrinsics, + std::string& mediaPath, + bool& hasIntrinsics) = 0; + virtual std::size_t nbFrames() const = 0; + virtual bool goToFrame(const unsigned int frame) = 0; + virtual bool goToNextFrame() = 0; +}; +``` + +## Feed Provider + +The `FeedProvider` class is the main entry point. It automatically detects the input type and creates the appropriate feed: + +```cpp +// Create a feed from any supported source (image, video, sfmData, ...) +aliceVision::dataio::FeedProvider feed("/path/to/images_or_video"); + +image::Image image; +camera::Pinhole camIntrinsics; +std::string mediaPath; +bool hasIntrinsics; + +while (feed.readImage(image, camIntrinsics, mediaPath, hasIntrinsics)) +{ + // process image ... + feed.goToNextFrame(); +} +``` + +## Supported Feed Types + +- **`ImageFeed`**: reads from a single image file or a directory of image files +- **`VideoFeed`**: reads frames from a video file +- **`SfMDataFeed`**: reads images referenced in an SfMData file +- **`E57Reader`**: reads point cloud data from E57 files diff --git a/src/aliceVision/depthMap/README.md b/src/aliceVision/depthMap/README.md new file mode 100644 index 0000000000..82cf2f1619 --- /dev/null +++ b/src/aliceVision/depthMap/README.md @@ -0,0 +1,49 @@ +# depthMap + +This module provides GPU-accelerated depth map estimation from multi-view stereo images. + +## Overview + +The `depthMap` module computes per-camera depth maps using a two-stage pipeline: + +1. **SGM (Semi-Global Matching)**: A stereo matching algorithm that approximates a global optimization by aggregating local matching costs along multiple 1-D paths in the image. +2. **Refine**: A local refinement step that improves depth precision to sub-pixel accuracy. 
+ +Both stages run on the GPU (CUDA) and support large images through a tiling strategy. + +## Pipeline + +### Semi-Global Matching (SGM) + +The `Sgm` class implements the Semi-Global Matching algorithm: + +- Builds a cost volume from patch-based similarity between the reference camera and target cameras +- Propagates costs along multiple directions in the image +- Selects the best depth hypothesis per pixel + +### Refine + +The `Refine` class refines the depth estimates produced by SGM: + +- Uses local optimization around the SGM depth estimate +- Achieves sub-pixel precision + +### Normal Map Estimation + +The `NormalMapEstimator` class estimates a surface normal map from the computed depth map. + +## Tiling + +For large images, computation is split into tiles that fit in GPU memory. The `DepthMapEstimator` manages the tiling and multi-GPU execution: + +```cpp +DepthMapEstimator estimator(mp, tileParams, depthMapParams, sgmParams, refineParams); +estimator.compute(cudaDeviceId, cams); +``` + +## Parameters + +- `DepthMapParams`: high-level parameters (number of target cameras, tiling options, patch pattern) +- `SgmParams`: Semi-Global Matching parameters (number of depth samples, similarity threshold, ...) +- `RefineParams`: Refine step parameters (number of iterations, patch size, ...) +- `TileParams`: tile size and overlap parameters diff --git a/src/aliceVision/featureEngine/README.md b/src/aliceVision/featureEngine/README.md new file mode 100644 index 0000000000..5ef6996810 --- /dev/null +++ b/src/aliceVision/featureEngine/README.md @@ -0,0 +1,36 @@ +# featureEngine + +This module provides the high-level orchestration of feature extraction across all views in an SfMData scene. + +## Overview + +The `featureEngine` module manages the extraction of local features (keypoints and descriptors) for each image in the dataset. 
It handles: + +- CPU and GPU image describers +- Memory management and scheduling +- Output of feature (`.feat`) and descriptor (`.desc`) files + +## FeatureExtractor + +The `FeatureExtractor` class is the main entry point. It takes an `SfMData` and a list of `ImageDescriber` objects, and processes all views: + +```cpp +aliceVision::featureEngine::FeatureExtractor extractor(sfmData); +extractor.setOutputFolder("/path/to/features"); +extractor.addImageDescriber(siftDescriber); +extractor.process(hardwareContext, image::EImageColorSpace::SRGB); +``` + +### Key features + +- Automatic dispatch between CPU and GPU describers +- Optional image masking (mask folder, extension, inversion) +- Range-based processing (for distributed computation) +- Memory consumption estimation per view + +## FeatureExtractorViewJob + +The `FeatureExtractorViewJob` class represents the feature extraction work for a single view. It tracks which image describers run on the CPU versus the GPU and manages the output file paths: + +- `getFeaturesPath(imageDescriberType)` — path to the `.feat` file for a given describer type +- `getDescriptorPath(imageDescriberType)` — path to the `.desc` file for a given describer type diff --git a/src/aliceVision/fuseCut/README.md b/src/aliceVision/fuseCut/README.md new file mode 100644 index 0000000000..f49fb66640 --- /dev/null +++ b/src/aliceVision/fuseCut/README.md @@ -0,0 +1,43 @@ +# fuseCut + +This module implements the volumetric reconstruction pipeline that converts multi-view depth maps into a 3D mesh using a graph-cut algorithm on a Delaunay tetrahedralization. + +## Overview + +The `fuseCut` module is responsible for the dense 3D reconstruction stage. It: + +1. Fuses depth maps from multiple cameras into a 3D point cloud +2. Builds a Delaunay tetrahedralization of the point cloud +3. Labels tetrahedra as "inside" or "outside" the surface using a min-cut / max-flow algorithm +4. 
Extracts the mesh at the interface between inside and outside regions + +## Key Classes + +### Fuser + +The `Fuser` class filters and fuses depth maps across cameras: + +- `filterGroups()` / `filterGroupsRC()`: groups pixels across cameras to detect consistent depth estimates +- `filterDepthMaps()` / `filterDepthMapsRC()`: removes depth map pixels that are not supported by enough cameras +- `divideSpaceFromDepthMaps()` / `divideSpaceFromSfM()`: estimates the bounding box of the scene + +### Tetrahedralization + +The `Tetrahedralization` class builds a Delaunay tetrahedralization from the fused 3D points. + +### GraphFiller + +The `GraphFiller` class populates the graph with visibility-based weights used by the max-flow algorithm to determine the surface location. + +### Mesher + +The `Mesher` class extracts the final triangle mesh from the graph-cut result. + +### PointCloud + +The `PointCloud` class manages the 3D point cloud built from the fused depth maps, including point visibility information. + +## References + +- Labatut, P., Pons, J.-P., Keriven, R. *Efficient Multi-View Reconstruction of Large-Scale Scenes using Interest Points, Delaunay Triangulation and Graph Cuts.* ICCV 2007. +- Jancosek, M., Pajdla, T. *Multi-View Reconstruction Preserving Weakly-Supported Surfaces.* CVPR 2011. diff --git a/src/aliceVision/gpu/README.md b/src/aliceVision/gpu/README.md new file mode 100644 index 0000000000..a0f2a5a21d --- /dev/null +++ b/src/aliceVision/gpu/README.md @@ -0,0 +1,40 @@ +# gpu + +This module provides utilities to query and check the availability of CUDA-capable GPU devices. + +## Overview + +The `gpu` module provides a small set of functions to detect GPU capabilities at runtime. It is used by other AliceVision modules to determine whether GPU-accelerated code paths are available. 
+ +## API + +### `gpuSupportCUDA` + +```cpp +bool gpuSupportCUDA(int minComputeCapabilityMajor, + int minComputeCapabilityMinor, + int minTotalDeviceMemory = 0); +``` + +Returns `true` if the system has at least one CUDA device meeting the specified minimum compute capability (major and minor version) and minimum total device memory (in MB). + +### `gpuInformationCUDA` + +```cpp +std::string gpuInformationCUDA(); +``` + +Returns a human-readable string describing all detected CUDA devices and their properties (compute capability, total memory, etc.). + +## Example + +```cpp +#include + +if (aliceVision::gpu::gpuSupportCUDA(3, 5, 2048)) +{ + // GPU with compute capability >= 3.5 and >= 2 GB memory is available +} + +std::cout << aliceVision::gpu::gpuInformationCUDA() << std::endl; +``` diff --git a/src/aliceVision/graph/README.md b/src/aliceVision/graph/README.md new file mode 100644 index 0000000000..e942a28453 --- /dev/null +++ b/src/aliceVision/graph/README.md @@ -0,0 +1,36 @@ +# graph + +This module provides graph data structures and algorithms used throughout AliceVision, including indexed graphs, connected component analysis, and triplet enumeration. + +## Overview + +The `graph` module wraps the [Boost.Graph](https://www.boost.org/doc/libs/release/libs/graph/) library with AliceVision-specific types and adds higher-level algorithms commonly needed in Structure-from-Motion pipelines. + +## IndexedGraph + +The `IndexedGraph` class is a graph where nodes are identified by integer indices (`IndexT`). It supports directed and undirected edges. + +## Connected Components + +The `connectedComponent.hpp` header provides functions to compute and filter connected components of a graph: + +```cpp +// Get all connected components of the graph +std::set> components = aliceVision::graph::graphToConnectedComponents(graph); +``` + +## Triplets + +The `Triplet` struct and associated utilities enumerate all triplets (sets of three mutually connected nodes) in a graph. 
Triplets are used in Structure-from-Motion for rotation averaging and loop consistency checks: + +```cpp +struct Triplet { + IndexT i, j, k; +}; +std::vector<Triplet> triplets; +aliceVision::graph::ListTriplets(graph, triplets); +``` + +## Graphviz Export + +The `indexedGraphGraphvizExport.hpp` header provides a function to export an `IndexedGraph` to the Graphviz DOT format for visualization. diff --git a/src/aliceVision/hdr/README.md b/src/aliceVision/hdr/README.md new file mode 100644 index 0000000000..ef3a94a6e3 --- /dev/null +++ b/src/aliceVision/hdr/README.md @@ -0,0 +1,52 @@ +# hdr + +This module provides High Dynamic Range (HDR) imaging functionality, including Camera Response Function (CRF) calibration and HDR image merging from multiple low dynamic range (LDR) exposures. + +## Overview + +HDR imaging reconstructs the full dynamic range of a scene from a set of photographs taken at different exposure times. The `hdr` module implements the complete HDR pipeline: + +1. **CRF Calibration**: estimate the non-linear camera response function from a bracket of LDR images +2. **HDR Merging**: combine the LDR images into a single HDR radiance map + +## Camera Response Function Calibration + +Three calibration methods are available: + +### Debevec Calibration (`DebevecCalibrate`) + +Based on the algorithm from: +> P. Debevec and J. Malik. *Recovering High Dynamic Range Radiance Maps from Photographs.* SIGGRAPH 1997. + +```cpp +aliceVision::hdr::DebevecCalibrate calibration; +calibration.process(ldrSamples, times, channelQuantization, weight, lambda, response); +``` + +### Grossberg Calibration (`GrossbergCalibrate`) + +An alternative CRF estimation method based on the empirical model of response (EMoR) basis. + +### Laguerre BA Calibration (`LaguerreBACalibration`) + +A bundle-adjustment-based CRF calibration using Laguerre polynomials. 
+ +## HDR Merging + +The `hdrMerge` class combines a bracket of LDR images into a single HDR radiance image: + +```cpp +aliceVision::hdr::hdrMerge merge; +merge.process(images, times, weight, response, radiance, + lowLight, highLight, noMidLight, mergingParams); +``` + +It also supports highlight reconstruction via `postProcessHighlight()`. + +## Brackets + +The `brackets` utilities provide functions to group input images into HDR brackets based on their exposure metadata. + +## RGB Curves + +The `rgbCurve` class represents a per-channel response or weighting function sampled over the full intensity range (0–255). diff --git a/src/aliceVision/imageMasking/README.md b/src/aliceVision/imageMasking/README.md new file mode 100644 index 0000000000..a9e67d484d --- /dev/null +++ b/src/aliceVision/imageMasking/README.md @@ -0,0 +1,42 @@ +# imageMasking + +This module provides functions to generate binary masks from images, which can be used to select or exclude regions of interest in downstream processing steps. + +## Overview + +Image masks are used in several parts of the AliceVision pipeline to restrict processing to meaningful image regions (e.g., excluding the background, sky, or calibration targets). + +## Masking Methods + +### HSV-based masking + +```cpp +void hsv(OutImage& result, + const std::string& inputPath, + float hue, + float hueRange, + float minSaturation, + float maxSaturation, + float minValue, + float maxValue); +``` + +Creates a binary mask by selecting pixels within a specified range of hue, saturation, and value (HSV color space). This is useful for isolating objects of a specific color (e.g., a green screen or a colored calibration target). 
+ +- `hue`: target hue in [0, 1] range (0 = red, 0.33 = green, 0.66 = blue, 1 = red) +- `hueRange`: tolerance around the target hue + +### Automatic Grayscale Threshold + +```cpp +void autoGrayscaleThreshold(OutImage& result, const std::string& inputPath); +``` + +Applies Otsu's binarization method to automatically determine a threshold and produce a binary mask from a grayscale image. + +## Post-processing Operations + +After the initial mask is computed, the following post-processing functions can be applied: + +- `postprocess_invert(result)`: inverts the mask (white ↔ black) +- `postprocess_closing(result, iterations)`: applies a morphological closing operation to fill small holes in the mask diff --git a/src/aliceVision/imageMatching/README.md b/src/aliceVision/imageMatching/README.md new file mode 100644 index 0000000000..41bca639fe --- /dev/null +++ b/src/aliceVision/imageMatching/README.md @@ -0,0 +1,41 @@ +# imageMatching + +This module provides algorithms to determine which pairs of images are likely to share common content, in order to limit the number of feature matching operations needed during a Structure-from-Motion pipeline. + +## Overview + +In large-scale photogrammetry, exhaustive pairwise feature matching is computationally prohibitive. The `imageMatching` module provides several strategies to select a tractable subset of candidate image pairs. 
+ +## Matching Methods + +The `EImageMatchingMethod` enum defines the available strategies: + +| Method | Description | +|--------|-------------| +| `EXHAUSTIVE` | All pairs of images are compared (suitable for small datasets) | +| `VOCABULARYTREE` | Uses a visual vocabulary tree to find visually similar images | +| `SEQUENTIAL` | Matches each image with its temporal neighbors | +| `SEQUENTIAL_AND_VOCABULARYTREE` | Combines sequential and vocabulary tree matching | +| `FRUSTUM` | Matches images whose camera frustums overlap (requires known poses) | +| `FRUSTUM_OR_VOCABULARYTREE` | Combines frustum and vocabulary tree matching | +| `MIRROR` | Matches images that are mirror images of each other | + +## Vocabulary Tree Matching + +The vocabulary tree approach quantizes image descriptors into visual words using a pre-trained tree, then finds candidate pairs based on shared visual word histograms: + +```cpp +aliceVision::voctree::VocabularyTree tree; +tree.load(vocTreeFilepath); + +aliceVision::voctree::Database db; +// populate database with image descriptors... + +// retrieve top-K similar images for each query +OrderedPairList selectedPairs; +aliceVision::imageMatching::generateFromVoctree(selectedPairs, sfmData, db, tree, method, numResults); +``` + +## Output + +The module outputs a `PairList` or `OrderedPairList` (a map from image ID to a list of candidate matching image IDs), which is then passed to the feature matching stage. diff --git a/src/aliceVision/imageProcessing/README.md b/src/aliceVision/imageProcessing/README.md new file mode 100644 index 0000000000..d0c9bd74d2 --- /dev/null +++ b/src/aliceVision/imageProcessing/README.md @@ -0,0 +1,38 @@ +# imageProcessing + +This module provides a composable pipeline for applying image processing operations in place on RGBA float images during the image preparation stage. 
+ +## Overview + +The `imageProcessing` module defines an abstract `ImageProcess` interface and a set of concrete processing steps that can be chained together. Each step receives the full `SfMData` context, the current `View`, its camera intrinsics, and the image to modify. + +A `dryRun` flag is supported to allow metadata updates (e.g. image dimensions or intrinsics) without performing the actual pixel computation. This is useful for planning multi-step pipelines. + +## Base Class: `ImageProcess` + +```cpp +class ImageProcess { +public: + bool processInPlace(const sfmData::SfMData& sfmData, + sfmData::View& view, + camera::IntrinsicBase* camera, + image::Image& image, + bool dryRun); +protected: + virtual bool processInternal(...) = 0; +}; +``` + +## Built-in Processing Steps + +### `ExposureProcess` + +Normalizes image exposure based on the median camera exposure across the dataset. Computes a compensation factor from the ratio of the median exposure to the current view's exposure and scales each pixel's RGB channels accordingly. + +### `FixHolesProcess` + +Replaces non-finite pixel values (NaN, Inf) in the image using OpenImageIO's `fixNonFinite` with a 3×3 box filter. This sanitizes images before further processing steps. + +## OpenCV Integration + +Additional processing operations are provided in `imageProcessing_OpenCV.cpp` for operations that require OpenCV (e.g. image warping, color space conversions). diff --git a/src/aliceVision/lensCorrectionProfile/README.md b/src/aliceVision/lensCorrectionProfile/README.md new file mode 100644 index 0000000000..36eb28f131 --- /dev/null +++ b/src/aliceVision/lensCorrectionProfile/README.md @@ -0,0 +1,48 @@ +# lensCorrectionProfile + +This module provides support for reading and applying Adobe Lens Correction Profiles (LCP files), which describe the optical distortion, vignetting, and chromatic aberration characteristics of specific camera/lens combinations. 
+ +## Overview + +Lens Correction Profiles (LCP) are XML files in a format defined by Adobe. They contain parametric models that describe: + +- **Geometric distortion**: how the lens distorts straight lines (rectilinear model) +- **Vignetting**: light fall-off towards the corners of the image +- **Chromatic aberration (CA)**: color fringing caused by different wavelengths focusing at slightly different distances + +## Correction Modes + +The `LCPCorrectionMode` enum selects which correction to apply: + +```cpp +enum class LCPCorrectionMode { + VIGNETTE, + DISTORTION, + CA +}; +``` + +## Rectilinear Distortion Model + +The `RectilinearModel` struct holds the parameters of the rectilinear distortion model as defined in the Adobe Camera Model technical report. Key fields include: + +- `FocalLengthX`, `FocalLengthY`: normalized focal lengths +- `ImageXCenter`, `ImageYCenter`: principal point (normalized, 0.5 = center) +- Radial and tangential distortion coefficients + +## Usage + +```cpp +#include + +LCPdatabase db; +db.load("/path/to/lcp/files"); + +// find the profile for a specific camera/lens/focal combination +LCPinfo* profile = db.findLCP(cameraMaker, cameraModel, lensModel, focalLength); +if (profile) +{ + profile->initialize(focalLength, focusDistance, aperture, LCPCorrectionMode::DISTORTION); + // apply correction... +} +``` diff --git a/src/aliceVision/lightingEstimation/README.md b/src/aliceVision/lightingEstimation/README.md new file mode 100644 index 0000000000..4591dc4304 --- /dev/null +++ b/src/aliceVision/lightingEstimation/README.md @@ -0,0 +1,48 @@ +# lightingEstimation + +This module provides methods for estimating scene lighting from images, albedo maps, and surface normals, using a Spherical Harmonics (SH) model. + +## Overview + +The `lightingEstimation` module estimates the lighting conditions of a scene under the augmented Lambert's reflectance model. 
Given images, albedo maps, and surface normals, it fits a 9-coefficient Spherical Harmonics lighting vector to the observed pixel intensities. + +## Lighting Model + +The lighting is represented as a 9×3 matrix (one 9-vector per RGB channel): + +```cpp +using LightingVector = Eigen::Matrix; +``` + +The 9 coefficients correspond to the first three bands (degrees l = 0, 1, 2, contributing 1 + 3 + 5 = 9 basis functions) of the real Spherical Harmonics basis. This model can represent low-frequency environment lighting including directional, ambient, and soft lighting effects. + +## LighthingEstimator + +The `LighthingEstimator` class aggregates data from one or more images and estimates the lighting: + +```cpp +aliceVision::lightingEstimation::LighthingEstimator estimator; + +// Aggregate data from multiple images +estimator.addImage(albedo, picture, normals); +estimator.addImage(albedo2, picture2, normals2); + +// Estimate lighting +LightingVector lighting; +estimator.estimate(lighting); +``` + +Both grayscale (luminance) and RGB (color) estimation are supported. + +## Augmented Normals + +The `augmentedNormals` utility converts surface normals into the 9-dimensional Spherical Harmonics feature vector used by the lighting model. + +## Lighting Calibration + +The `lightingCalibration` module provides tools to calibrate lighting conditions from a scene with known geometry (e.g. a Lambertian sphere used as a light probe). + +## References + +- R. Basri and D.W. Jacobs. *Lambertian Reflectances and Linear Subspaces.* IEEE TPAMI, 2003. +- R. Ramamoorthi and P. Hanrahan. *An Efficient Representation for Irradiance Environment Maps.* SIGGRAPH 2001. 
diff --git a/src/aliceVision/matchingImageCollection/README.md b/src/aliceVision/matchingImageCollection/README.md new file mode 100644 index 0000000000..6cc614f1f3 --- /dev/null +++ b/src/aliceVision/matchingImageCollection/README.md @@ -0,0 +1,45 @@ +# matchingImageCollection + +This module provides tools for performing robust geometric verification of putative feature matches across a collection of images. + +## Overview + +After finding putative feature correspondences between image pairs (via descriptor matching), the `matchingImageCollection` module applies robust model estimation to filter out outlier matches and retain only geometrically consistent ones. + +## Geometric Filtering + +The `robustModelEstimation` template function applies a geometric filter to all image pairs in a set of putative matches: + +```cpp +aliceVision::matchingImageCollection::robustModelEstimation( + out_geometricMatches, + sfmData, + regionsPerView, + functor, // e.g. GeometricFilterMatrix_F_AC + putativeMatches, + randomNumberGenerator, + guidedMatching, + distanceRatio); +``` + +## Geometric Models + +Several geometric filter types are available: + +| Class | Model | Description | +|-------|-------|-------------| +| `GeometricFilterMatrix_F_AC` | Fundamental matrix | For uncalibrated image pairs | +| `GeometricFilterMatrix_E_AC` | Essential matrix | For calibrated image pairs | +| `GeometricFilterMatrix_H_AC` | Homography | For planar or pure-rotation scenes | +| `GeometricFilterMatrix_HGrowing` | Homography growing | Robust homography with region growing | + +All filters use ACRANSAC (A Contrario RANSAC) for robust estimation. + +## Image Pair List I/O + +The `ImagePairListIO` utilities provide functions to read and write image pair lists from/to text files. 
+ +## Image Collection Matchers + +- `ImageCollectionMatcher_generic`: matches all pairs using a generic nearest-neighbor search +- `ImageCollectionMatcher_cascadeHashing`: uses cascade hashing for fast approximate matching diff --git a/src/aliceVision/mesh/README.md b/src/aliceVision/mesh/README.md new file mode 100644 index 0000000000..3e80f211e4 --- /dev/null +++ b/src/aliceVision/mesh/README.md @@ -0,0 +1,65 @@ +# mesh + +This module provides 3D mesh data structures and algorithms for mesh processing, texturing, and export in the AliceVision dense reconstruction pipeline. + +## Overview + +The `mesh` module handles the final stages of dense 3D reconstruction: mesh cleaning, optimization, UV atlas generation, and texture baking from the input images. + +## Mesh + +The `Mesh` class is the core data structure representing a 3D triangle mesh. It supports: + +- Loading and saving in multiple formats: OBJ, FBX, GLTF, GLB, STL, PLY +- Vertex positions, normals, UVs, and colors +- Point visibility tracking (`PointsVisibility`) + +```cpp +aliceVision::mesh::Mesh mesh; +mesh.load("/path/to/mesh.obj"); +mesh.save("/path/to/output.glb", material, saveTextures); +``` + +## Mesh Processing + +### MeshClean + +The `MeshClean` class provides mesh cleaning operations: removal of isolated components, degenerate triangles, and unreferenced vertices. + +### MeshAnalyze + +The `MeshAnalyze` class computes mesh statistics and quality metrics. + +### MeshEnergyOpt + +The `MeshEnergyOpt` class performs energy-based mesh optimization to improve mesh quality. + +### MeshIntersection + +The `MeshIntersection` class provides ray–mesh intersection queries. + +## Texturing + +The `Texturing` class projects input images onto the mesh surface to produce texture maps. 
It supports: + +- Multiple unwrapping methods via `EUnwrapMethod`: `Basic`, `ABF++` (Geogram), `Spectral LSCM` (Geogram) +- Bump/normal mapping via `EBumpMappingType` +- Visibility-based texture selection + +```cpp +aliceVision::mesh::Texturing texturing; +texturing.loadFromOBJ(meshPath, flipNormals); +texturing.generateTextures(mp, imagesCache, outputFolder, textureParams); +``` + +## UV Atlas + +The `UVAtlas` class generates UV coordinates for the mesh, managing atlas packing to minimize wasted texture space. + +## Visibility Remapping + +When replacing the reconstruction mesh with a custom input mesh, point visibilities can be remapped using `EVisibilityRemappingMethod`: + +- `Pull`: pull visibilities from the closest reconstruction vertex to each input mesh vertex +- `Push`: push visibilities from reconstruction vertices to the closest input mesh triangle +- `PullPush`: combine both approaches diff --git a/src/aliceVision/mvsData/README.md b/src/aliceVision/mvsData/README.md new file mode 100644 index 0000000000..ed9df4fedf --- /dev/null +++ b/src/aliceVision/mvsData/README.md @@ -0,0 +1,43 @@ +# mvsData + +This module provides basic data structures and mathematical types used throughout the Multi-View Stereo (MVS) pipeline in AliceVision. + +## Overview + +The `mvsData` module defines low-level geometric and algebraic types optimized for the dense reconstruction pipeline. These types are used extensively by the `depthMap`, `fuseCut`, `mesh`, and `mvsUtils` modules. 
+ +## Geometric Types + +| Type | Description | +|------|-------------| +| `Point2d` | 2D point with double-precision coordinates | +| `Point3d` | 3D point with double-precision coordinates | +| `Point4d` | Homogeneous 4D point | +| `Pixel` | Integer 2D pixel coordinate | +| `Voxel` | Integer 3D voxel coordinate | +| `OrientedPoint` | 3D point with an associated normal vector | +| `ROI` | Axis-aligned 2D region of interest | + +## Matrix Types + +| Type | Description | +|------|-------------| +| `Matrix3x3` | 3×3 double-precision matrix (rotation, homography, ...) | +| `Matrix3x4` | 3×4 projection matrix | + +## StaticVector + +`StaticVector` is a thin wrapper around `std::vector` that provides index-based access and serialization support (including zlib-compressed I/O). It is the standard container type throughout the MVS pipeline. + +## Universe (Union-Find) + +The `Universe` class implements a disjoint-set (union-find) data structure, used for connected-component labeling in the graph-cut step. + +## Geometry Utilities + +- `geometry.hpp`: 3D geometric operations (triangle area, barycentric coordinates, plane fitting, ...) +- `geometryTriTri.hpp`: triangle–triangle intersection tests + +## Statistical Types + +The `Stat3d` class computes basic statistics (mean, standard deviation, percentiles) over a set of 3D points. diff --git a/src/aliceVision/mvsUtils/README.md b/src/aliceVision/mvsUtils/README.md new file mode 100644 index 0000000000..d2081fa33f --- /dev/null +++ b/src/aliceVision/mvsUtils/README.md @@ -0,0 +1,41 @@ +# mvsUtils + +This module provides utility classes and functions for the Multi-View Stereo (MVS) pipeline, including multi-view parameter management, image caching, tiling, and file I/O. + +## Overview + +The `mvsUtils` module acts as a support layer for the dense reconstruction pipeline. It bridges the SfMData scene representation with the lower-level MVS data structures. 
+ +## MultiViewParams + +The `MultiViewParams` class is the central configuration object for the MVS pipeline. It is constructed from an `SfMData` scene and provides access to: + +- Camera projection matrices (P, K, R, C) +- Image dimensions and scale +- Camera neighbor relationships +- File paths for intermediate results + +```cpp +aliceVision::mvsUtils::MultiViewParams mp(sfmData, imagesFolder, depthMapsFolder); +int nbCameras = mp.getNbCameras(); +``` + +## ImagesCache + +The `ImagesCache` class implements a least-recently-used (LRU) cache for loading and storing images in memory. It avoids re-loading the same image multiple times when processing multiple depth maps simultaneously. + +## TileParams + +The `TileParams` struct defines the tile size and overlap used to split large images into tiles for GPU processing. + +## File I/O + +The `fileIO` and `mapIO` modules provide functions to read and write intermediate MVS results (depth maps, similarity maps, normal maps, etc.) in a binary format. + +## Common Utilities + +The `common.hpp` header provides miscellaneous utilities used throughout the MVS pipeline, such as: + +- Camera visibility determination +- Nearest camera selection +- Coordinate system conversions diff --git a/src/aliceVision/panorama/README.md b/src/aliceVision/panorama/README.md new file mode 100644 index 0000000000..cd9b795100 --- /dev/null +++ b/src/aliceVision/panorama/README.md @@ -0,0 +1,47 @@ +# panorama + +This module provides tools for stitching and compositing panoramic images from multiple overlapping photographs. + +## Overview + +The `panorama` module implements a complete panorama stitching pipeline: warping individual images into an equirectangular projection, blending them together with feathering or multi-band (Laplacian pyramid) compositing, and handling seams between images. 
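
To make the equirectangular warping step more concrete, here is a minimal sketch of mapping a viewing direction to panorama pixel coordinates. The axis convention (Y up, Z forward, longitude measured from +Z toward +X), the pixel layout, and the names `EquirectPixel` and `directionToEquirectangular` are assumptions chosen for illustration; they are not taken from the AliceVision sources.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: continuous pixel coordinates in the panorama.
struct EquirectPixel
{
    double x;
    double y;
};

// Map a 3D direction to equirectangular coordinates (assumed convention:
// Y up, Z forward; the forward direction lands at the panorama centre).
EquirectPixel directionToEquirectangular(double dx, double dy, double dz, int width, int height)
{
    const double pi = 3.14159265358979323846;
    const double norm = std::sqrt(dx * dx + dy * dy + dz * dz);
    assert(norm > 0.0 && "direction must be non-zero");

    const double longitude = std::atan2(dx, dz);  // in [-pi, pi]
    const double latitude = std::asin(dy / norm); // in [-pi/2, pi/2]

    EquirectPixel p;
    p.x = (longitude / (2.0 * pi) + 0.5) * width; // longitude spans the full width
    p.y = (0.5 - latitude / pi) * height;         // latitude +pi/2 maps to the top row
    return p;
}
```

With a 2000×1000 panorama, the forward direction (0, 0, 1) maps to the centre (1000, 500), and the direction (1, 0, 0) maps a quarter turn to the right, at x = 1500.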
+ +## Compositing + +Two compositing strategies are available, both implementing the `Compositer` base class: + +### Alpha Compositing (`alphaCompositer.hpp`) + +Simple alpha blending: each incoming image replaces or blends over the existing panorama based on an alpha channel. + +### Laplacian Pyramid Compositing (`laplacianCompositer.hpp`) + +Multi-band blending using a Laplacian pyramid, which produces seamless transitions between images by blending low and high frequencies at different scales independently. + +## Coordinate Maps + +The `CoordinatesMap` class computes the mapping from panorama pixels to source image pixels for a given camera. It supports equirectangular projection. + +## Seam Handling + +The `feathering` module computes smooth blending weights near seam boundaries to avoid hard transitions between overlapping images. + +## Gaussian and Laplacian Pyramids + +The `gaussian.hpp` and `laplacianPyramid.hpp` modules implement multi-scale image decompositions used by the multi-band compositor. + +## Cached Image + +The `CachedImage` class provides a tile-based image cache for handling panoramas that are too large to fit entirely in memory. + +## Distance Map + +The `distance` module computes distance transforms on binary masks, used for generating smooth blending weights. + +## Graph Cut + +The `graphcut.hpp` module provides optimal seam finding between overlapping images using energy minimization (graph cut), producing visually minimal seam lines. + +## Bounding Box + +The `BoundingBox` class represents the region of the panorama canvas covered by a single warped image. 
diff --git a/src/aliceVision/photometricStereo/README.md b/src/aliceVision/photometricStereo/README.md new file mode 100644 index 0000000000..94a981298d --- /dev/null +++ b/src/aliceVision/photometricStereo/README.md @@ -0,0 +1,52 @@ +# photometricStereo + +This module provides a Photometric Stereo implementation for recovering surface normals and albedo from a set of images of the same object taken under different lighting conditions. + +## Overview + +Photometric Stereo estimates the surface normal and albedo of each surface point by solving a system of linear equations relating pixel intensity to lighting direction. Given $n$ images of the same scene under $n$ known light directions, the method recovers the normal map and albedo map of the surface. + +The implementation supports: +- Spherical Harmonics lighting model (configurable order) +- Robust estimation (optional) +- Ambient light removal (optional) +- Multi-view setup (one PS problem per pose) + +## Parameters + +```cpp +struct PhotometricSteroParameters { + size_t SHOrder; // Order of Spherical Harmonics for lighting model + bool removeAmbient; // Whether to subtract ambient lighting + bool isRobust; // Whether to use robust estimation + int downscale; // Downscale factor for memory efficiency +}; +``` + +## Usage + +### Single-view (folder-based) + +```cpp +aliceVision::photometricStereo::photometricStereo( + inputPath, lightData, outputPath, parameters, normals, albedo); +``` + +### Multi-view (SfMData-based) + +```cpp +aliceVision::photometricStereo::photometricStereo( + sfmData, lightData, maskPath, outputPath, parameters, normals, albedo); +``` + +## Data I/O + +The `photometricDataIO` module provides functions to load light calibration data and save the resulting normal and albedo maps. + +## Normal Integration + +The `normalIntegration` module integrates the estimated normal map to recover the surface depth map. + +## References + +- R. Woodham. 
*Photometric method for determining surface orientation from multiple images.* Optical Engineering, 1980. diff --git a/src/aliceVision/python/README.md b/src/aliceVision/python/README.md new file mode 100644 index 0000000000..680f88f749 --- /dev/null +++ b/src/aliceVision/python/README.md @@ -0,0 +1,15 @@ +# python + +This module provides Python utilities and bindings infrastructure for AliceVision. + +## Overview + +The `python` module contains Python helper scripts used within the AliceVision pipeline, along with the infrastructure for parallel processing. + +## parallelization.py + +This script provides Python-level parallelization helpers. It is used by pipeline scripts to manage distributed or multi-process execution of AliceVision nodes. + +## Python Bindings + +The Python bindings for the main AliceVision C++ modules are generated using SWIG (`.i` interface files) and are located alongside the corresponding C++ modules. The `python` module provides shared infrastructure for these bindings. diff --git a/src/aliceVision/segmentation/README.md b/src/aliceVision/segmentation/README.md new file mode 100644 index 0000000000..4075cfcea1 --- /dev/null +++ b/src/aliceVision/segmentation/README.md @@ -0,0 +1,53 @@ +# segmentation + +This module provides semantic image segmentation using deep learning models via ONNX Runtime. + +## Overview + +The `segmentation` module uses a pre-trained neural network (loaded via ONNX Runtime) to assign semantic class labels to every pixel in an image. It supports both CPU and GPU inference. + +Segmentation results are used in the AliceVision pipeline to: +- Mask out unwanted regions (e.g. 
sky, background) from reconstruction +- Assist in sphere detection and other geometry-aware tasks + +## Segmentation Class + +```cpp +aliceVision::segmentation::Segmentation::Parameters params; +params.modelWeights = "/path/to/model.onnx"; +params.classes = {"background", "person", "sky", ...}; +params.modelWidth = 512; +params.modelHeight = 512; +params.useGpu = true; + +aliceVision::segmentation::Segmentation seg(params); + +image::Image labels; +seg.processImage(labels, sourceImage); +``` + +### ScoredLabel + +Each output pixel is represented as a `ScoredLabel`: + +```cpp +struct ScoredLabel { + IndexT label; // class index + float score; // confidence score +}; +``` + +## Tiled Processing + +For large images, `Segmentation` automatically splits the image into overlapping tiles and stitches the results together to produce a full-resolution label map. + +## Parameters + +| Parameter | Description | +|-----------|-------------| +| `modelWeights` | Path to the ONNX model file | +| `classes` | List of class names indexed by label ID | +| `center` / `scale` | Per-channel normalization parameters | +| `modelWidth` / `modelHeight` | Input resolution expected by the model | +| `overlapRatio` | Tile overlap ratio for tiled inference | +| `useGpu` | Enable GPU inference (requires ONNX GPU support) | diff --git a/src/aliceVision/sensorDB/README.md b/src/aliceVision/sensorDB/README.md new file mode 100644 index 0000000000..93863d5b16 --- /dev/null +++ b/src/aliceVision/sensorDB/README.md @@ -0,0 +1,38 @@ +# sensorDB + +This module provides a database of camera sensor sizes indexed by camera brand and model name. It is used to initialize camera intrinsics when no calibration data is available. + +## Overview + +When processing images whose camera intrinsics are unknown, AliceVision can estimate the focal length in pixels from the focal length in millimetres stored in EXIF metadata, combined with the physical sensor width from a database. 
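
As a rough illustration of how a sensor width from the database can feed into intrinsics initialization, the usual pinhole relation converts a focal length in millimetres to pixels by scaling with the ratio of image width to sensor width. The sketch below assumes that relation; the function name `focalLengthInPixels` and the example values are hypothetical and not part of the AliceVision API.

```cpp
#include <cassert>

// Hypothetical helper (not an AliceVision function): converts a focal length
// in millimetres to pixels using the physical sensor width:
//   focalPx = focalMm * imageWidthPx / sensorWidthMm
double focalLengthInPixels(double focalMm, double sensorWidthMm, int imageWidthPx)
{
    assert(sensorWidthMm > 0.0 && "sensor width must be positive");
    return focalMm * static_cast<double>(imageWidthPx) / sensorWidthMm;
}
```

For example, a 50 mm focal length on a 36 mm wide sensor with a 6000-pixel-wide image gives 50 × 6000 / 36 ≈ 8333 pixels.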
+ +The `sensorDB` module reads a flat-text database of sensor specifications and provides a lookup interface. + +## Datasheet + +The `Datasheet` struct stores the entry for a single camera model: + +```cpp +struct Datasheet { + std::string _brand; // Camera manufacturer (e.g. "Canon") + std::string _model; // Camera model name (e.g. "EOS 5D Mark IV") + double _sensorWidth; // Sensor width in millimetres +}; +``` + +## Database + +The sensor database is stored in `cameraSensors.db`, a plain-text file with one entry per line. The `parseDatabase` functions read this file and populate a list of `Datasheet` objects. + +```cpp +#include + +std::vector db; +aliceVision::sensorDB::parseDatabase("/path/to/cameraSensors.db", db); + +aliceVision::sensorDB::Datasheet result; +if (aliceVision::sensorDB::getInfo("Canon", "EOS 5D Mark IV", db, result)) +{ + double sensorWidth = result._sensorWidth; // in mm +} +``` diff --git a/src/aliceVision/sfmDataIO/README.md b/src/aliceVision/sfmDataIO/README.md new file mode 100644 index 0000000000..d092d3269c --- /dev/null +++ b/src/aliceVision/sfmDataIO/README.md @@ -0,0 +1,67 @@ +# sfmDataIO + +This module provides serialization and deserialization of `SfMData` scenes to and from multiple file formats. + +## Overview + +The `sfmDataIO` module is the I/O layer for AliceVision's Structure-from-Motion data. 
It supports reading and writing complete or partial `SfMData` scenes, including: + +- Camera views and intrinsics +- Camera poses (extrinsics) +- 3D landmark structure and observations +- 2D constraints +- Uncertainty information + +## Supported Formats + +| Format | Extension | Description | +|--------|-----------|-------------| +| JSON | `.sfm`, `.json` | AliceVision native format | +| Alembic | `.abc` | Interchange format for VFX pipelines | +| COLMAP | `.txt`, `.bin` | Compatibility with the COLMAP pipeline | +| BAF | `.baf` | Bundle Adjustment Format | +| GT | various | Ground truth formats for evaluation | + +## Main API + +```cpp +#include + +aliceVision::sfmData::SfMData sfmData; + +// Load a scene (format detected automatically from file extension) +aliceVision::sfmDataIO::load(sfmData, "/path/to/scene.sfm", + aliceVision::sfmDataIO::ESfMData::ALL); + +// Save a scene +aliceVision::sfmDataIO::save(sfmData, "/path/to/output.sfm", + aliceVision::sfmDataIO::ESfMData::ALL); +``` + +## Partial Loading + +The `ESfMData` flags control which parts of the scene are loaded or saved: + +```cpp +enum ESfMData { + VIEWS = 1, + EXTRINSICS = 2, + INTRINSICS = 4, + STRUCTURE = 8, + OBSERVATIONS = 16, + LANDMARKS_UNCERTAINTY = 64, + POSES_UNCERTAINTY = 128, + CONSTRAINTS2D = 256, + // ... 
+ ALL = /* all flags combined */ +}; +``` + +## Alembic Export/Import + +The `AlembicExporter` and `AlembicImporter` classes provide direct control over Alembic I/O, which is the preferred format for integration with VFX and animation software: + +```cpp +aliceVision::sfmDataIO::AlembicExporter exporter("/output/scene.abc"); +exporter.addSfM(sfmData, ESfMData::ALL); +``` diff --git a/src/aliceVision/sfmMvsUtils/README.md b/src/aliceVision/sfmMvsUtils/README.md new file mode 100644 index 0000000000..d54e524784 --- /dev/null +++ b/src/aliceVision/sfmMvsUtils/README.md @@ -0,0 +1,21 @@ +# sfmMvsUtils + +This module provides utility functions for bridging the Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipelines in AliceVision. + +## Overview + +The `sfmMvsUtils` module contains helper functions that convert and use SfMData information in the context of the dense reconstruction pipeline. + +## API + +### `createRefMeshFromDenseSfMData` + +```cpp +void createRefMeshFromDenseSfMData(mesh::Mesh& outRefMesh, + const sfmData::SfMData& sfmData, + const mvsUtils::MultiViewParams& mp); +``` + +Creates a reference mesh from the sparse point cloud contained in an `SfMData` object. This mesh can be used as a prior for the dense reconstruction or for evaluating depth map quality. + +The function uses the `MultiViewParams` context to correctly handle coordinate system transformations between the SfM and MVS pipelines. diff --git a/src/aliceVision/sphereDetection/README.md b/src/aliceVision/sphereDetection/README.md new file mode 100644 index 0000000000..45eb0e6978 --- /dev/null +++ b/src/aliceVision/sphereDetection/README.md @@ -0,0 +1,40 @@ +# sphereDetection + +This module provides automatic and manual detection of spherical objects in images, using a deep learning model via ONNX Runtime. + +## Overview + +The `sphereDetection` module is used in the AliceVision photometric stereo pipeline to locate a reflective or diffuse sphere in the scene. 
The sphere is used as a light probe to measure the incident lighting direction in each image. + +Detection is performed by running a pre-trained object detection model (ONNX format) over the input images. A fallback mode allows users to specify the sphere parameters manually. + +## API + +### Automatic Detection + +```cpp +// Load an ONNX model +Ort::Session session(env, "/path/to/model.onnx", sessionOptions); + +// Detect spheres in all views of an SfMData +aliceVision::sphereDetection::sphereDetection(sfmData, session, outputPath, minScore); +``` + +### Manual Detection + +```cpp +// Specify sphere center (x, y) and radius manually +std::array sphereParam = {cx, cy, radius}; +aliceVision::sphereDetection::writeManualSphereJSON(sfmData, sphereParam, outputPath); +``` + +### Model Exploration + +```cpp +// Print model inputs/outputs and validate requirements +aliceVision::sphereDetection::modelExplore(session); +``` + +## Output + +Detected sphere parameters are written to a JSON file, one entry per input image. Each entry contains the bounding boxes and confidence scores of detected sphere candidates. diff --git a/src/aliceVision/stl/README.md b/src/aliceVision/stl/README.md new file mode 100644 index 0000000000..aff52b0b47 --- /dev/null +++ b/src/aliceVision/stl/README.md @@ -0,0 +1,52 @@ +# stl + +This module provides STL (Standard Template Library) extensions and utilities used throughout AliceVision. + +## Overview + +The `stl` module collects small, reusable data structures and utility functions that extend the C++ standard library for common patterns found in the AliceVision codebase. + +## Components + +### `DynamicBitset` + +A dynamic bitset similar to `std::vector<bool>` but with bitwise operations. It is useful for efficient set membership testing. 
+ +```cpp +aliceVision::stl::DynamicBitset bs(1024); +bs.set(42); +bs.reset(42); +bool val = bs.test(42); +``` + +### `FlatMap` + +A sorted `std::vector`-based associative container offering O(log n) lookup with better cache performance than `std::map` for small to medium-sized collections. + +### `bitmask` + +The `ALICEVISION_BITMASK(EnumType)` macro enables bitwise operations (|, &, ^, ~) on enum class types, making it easy to use enumerations as flag sets: + +```cpp +enum class EOption { A = 1, B = 2, C = 4 }; +ALICEVISION_BITMASK(EOption); + +EOption opts = EOption::A | EOption::C; +bool hasA = (opts & EOption::A) != EOption(0); +``` + +### `hash` + +Custom hash functions for standard and AliceVision types (e.g. `std::pair`, `IndexT` pairs). + +### `indexedSort` + +Utility functions for sorting a container while keeping track of the original indices. + +### `mapUtils` + +Helper functions for working with `std::map` and `std::unordered_map`, such as retrieving values with default fallback or inverting a map. + +### `regex` + +Utilities for regex-based string filtering using the C++ `<regex>` standard library. diff --git a/src/aliceVision/system/README.md b/src/aliceVision/system/README.md new file mode 100644 index 0000000000..f83d231d1e --- /dev/null +++ b/src/aliceVision/system/README.md @@ -0,0 +1,66 @@ +# system + +This module provides system-level utilities used throughout AliceVision, including logging, timing, memory information, hardware context, and parallelization. + +## Logging + +The `Logger` class wraps [Boost.Log](https://www.boost.org/doc/libs/release/libs/log/) to provide a configurable, leveled logging interface. 
Log levels (from most to least verbose): + +- `Trace` +- `Debug` +- `Info` +- `Warning` +- `Error` +- `Fatal` + +Convenience macros are provided for each level: + +```cpp +ALICEVISION_LOG_INFO("Processing " << nbImages << " images."); +ALICEVISION_LOG_WARNING("Missing intrinsics for view " << viewId); +ALICEVISION_LOG_ERROR("Failed to load image: " << path); +``` + +## Timer + +The `Timer` class measures elapsed time with microsecond accuracy: + +```cpp +aliceVision::system::Timer timer; +// ... do work ... +double elapsedSeconds = timer.elapsed(); +double elapsedMs = timer.elapsedMs(); +``` + +## Memory Information + +The `MemoryInfo` struct and `getMemoryInfo()` function report system RAM and swap usage (total, free, and available): + +```cpp +aliceVision::system::MemoryInfo mem = aliceVision::system::getMemoryInfo(); +std::cout << "Free RAM: " << mem.freeRam / (1024*1024) << " MB" << std::endl; +``` + +## Hardware Context + +The `HardwareContext` class encapsulates information about available hardware resources (number of CPU threads, GPU devices) and is passed through the pipeline to allow algorithms to adapt their resource usage. + +## Parallelization + +The `Parallelization` module provides utilities for range-based parallel computation, allowing the pipeline to be split into independent chunks for distributed processing: + +```cpp +int rangeStart, rangeEnd; +if (aliceVision::rangeComputation(rangeStart, rangeEnd, iteration, totalBlocks, itemCount)) +{ + // process items [rangeStart, rangeEnd) +} +``` + +## CPU Information + +The `cpu.hpp` header provides functions to query CPU capabilities (number of cores, cache sizes, etc.). + +## Progress Display + +The `ProgressDisplay` class provides a console progress bar for long-running loops. 
diff --git a/src/aliceVision/utils/README.md b/src/aliceVision/utils/README.md new file mode 100644 index 0000000000..d3a793c74f --- /dev/null +++ b/src/aliceVision/utils/README.md @@ -0,0 +1,42 @@ +# utils + +This module provides general-purpose utility classes and functions used across the AliceVision codebase. + +## Overview + +The `utils` module collects small utility components that do not belong to any specific algorithm module. + +## Histogram + +The `Histogram` template class computes the frequency distribution (histogram) of values within a specified range, divided into a configurable number of bins: + +```cpp +aliceVision::utils::Histogram hist(0.0, 1.0, 100); // 100 bins in [0, 1] + +// Add individual values +hist.Add(0.42); + +// Add a sequence from an iterator range +hist.Add(myVector.begin(), myVector.end()); + +// Retrieve the bin counts +const std::vector& freq = hist.GetHist(); +``` + +## Convert + +The `convert.hpp` header provides type conversion utilities. + +## File I/O + +The `filesIO.hpp` header provides helper functions for common file system operations. + +## Regex Filter + +The `regexFilter.hpp` header provides utilities to filter collections of strings (e.g. file paths) using regular expression patterns. + +```cpp +std::vector files = getFiles("/some/directory"); +std::vector filtered = + aliceVision::utils::filterStrings(files, std::regex(".*\\.jpg")); +``` diff --git a/src/aliceVision/voctree/README.md b/src/aliceVision/voctree/README.md new file mode 100644 index 0000000000..60ce061fc0 --- /dev/null +++ b/src/aliceVision/voctree/README.md @@ -0,0 +1,56 @@ +# voctree + +This module provides a Visual Vocabulary Tree implementation for fast approximate nearest-neighbor search of image descriptors, used in the image matching stage of Structure-from-Motion pipelines. 
+ +## Overview + +A Visual Vocabulary Tree (also known as a Bag-of-Words tree) quantizes high-dimensional feature descriptors into discrete *visual words* using a hierarchical k-means tree. By representing each image as a histogram of visual words, it enables efficient retrieval of visually similar images from a large database without exhaustive pairwise descriptor comparison. + +## Vocabulary Tree + +The `VocabularyTree` class is the core data structure. It is a hierarchical k-means tree built from a training set of descriptors: + +```cpp +aliceVision::voctree::VocabularyTree tree; + +// Load a pre-trained vocabulary +tree.load("/path/to/vocabulary.tree"); + +// Quantize a set of descriptors into visual words +std::vector words = tree.quantize(descriptors); +``` + +### Building a Vocabulary + +The `TreeBuilder` and `SimpleKmeans` classes build a vocabulary tree from scratch by clustering a large set of training descriptors using hierarchical k-means. + +## Database + +The `Database` class maintains an inverted index that maps each visual word to the set of images that contain it. It supports efficient TF-IDF weighted retrieval: + +```cpp +aliceVision::voctree::Database db(tree.words()); + +// Add images to the database +db.insert(imageId, sparseHistogram); + +// Query for similar images +std::vector results = db.find(queryHistogram, numResults); +``` + +## Sparse Histogram + +Images are represented as sparse histograms (`SparseHistogram`), mapping visual word IDs to lists of feature indices. The `computeSparseHistogram` function builds this representation from a list of visual words. + +## Descriptor Loader + +The `descriptorLoader` utility loads feature descriptors from `.desc` files produced by the `featureEngine` module. + +## Distance Functions + +The `distance.hpp` header provides L1, L2, and Hamming distance functions for descriptor comparison. + +## References + +- D. Nistér and H. Stewénius. *Scalable Recognition with a Vocabulary Tree.* CVPR 2006. +- J. 
Sivic and A. Zisserman. *Video Google: A Text Retrieval Approach to Object Matching in Videos.* ICCV 2003.