CaPro: Curvilinear-aware Prompt Learning with Single Unlabeled Image for Cost-effective Curvilinear Structure Segmentation
Curvilinear Structure Segmentation (CSS) is crucial for applications like medical imaging and structural health monitoring. While the Segment Anything Model (SAM) shows potential for CSS, its direct application yields poor results, and existing adaptation methods rely heavily on costly pixel-level annotations and numerous training samples. This paper addresses a more challenging and practical scenario: adapting SAM using only a single unlabeled image. To this end, we propose Curvilinear-aware Prompt Learning (CaPro), a fine-tuning-free framework. CaPro operates in two stages: first, it synthesizes curvilinear structures and trains a self-supervised oriented object detector to generate prompts; second, it introduces a curvilinear-aware discrete representation matching mechanism to filter out unreliable prompts by leveraging shared topological patterns with handwritten digits. This approach enables cost-effective and annotation-free adaptation of SAM to CSS tasks, demonstrating significant performance improvements.
CaPro framework overview
Models may be needed: https://pan.baidu.com/s/1i7Kid7Io943dJNiH39GylQ?pwd=1fj5
These authors contributed equally.
- Zhuangzhuang Chen
- Qiangyu Chen
git clone https://github.com/xmed-lab/CaPro.git
conda create -n capro python=3.10
conda activate capro

The current pipeline requires an NVIDIA GPU with CUDA. The following PyTorch command is an example for CUDA 11.7; please choose the matching PyTorch build for your local CUDA or CPU environment from the official PyTorch installation guide if needed.
cd CaPro/
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt

The generate_data.py script in the scripts directory of our project encapsulates the complete data generation process. This process can be broadly divided into the following three steps.
- Generate a curve structure. The parameters can be adjusted as required in make_fakevessel.py.
- Embed the curve structure into the image. The parameters can be adjusted as required in FDA_retinal.py.
- Convert data format. Specifically, it refers to the RoLabelIMG_Transform module.
Warning
Before generating the data, please place the background reference image that needs to be embedded in the ./data/L-System/FDA/Single_image directory.
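A quick pre-flight check can catch a missing background image before the generation script is launched. The snippet below is a minimal sketch, not part of the repository: `find_reference_images` is a hypothetical helper, and the list of accepted image extensions is an assumption, since the formats supported by the generation script are not stated here.

```python
from pathlib import Path

# Assumed set of image extensions; the formats accepted by the
# generation script are not documented here.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def find_reference_images(directory="./data/L-System/FDA/Single_image"):
    """Return the image files found directly inside `directory`, sorted by name.

    Hypothetical helper for illustration; returns an empty list when the
    directory does not exist.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


if __name__ == "__main__":
    images = find_reference_images()
    if images:
        print(f"Found {len(images)} reference image(s): {[p.name for p in images]}")
    else:
        print("No reference image found; add one before running generate_data.py.")
```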
python scripts/generate_data.py --num 500 # If the value of num is too small, it may cause an error.

Note: By modifying line 36 of convert_gt.py, you can select the background color (black/white) for the desired generated curve structure.
After the first step (data generation), the corresponding training annotations can be found under detector/data/annotations. You can also supply your own val.json. Then add the training images to the images directory (if you followed our instructions exactly, these are the images from data/RoLabelImg_Transform/img).
cd detector
python train.py

We then use predict.py to generate our predictions. Place the images you want to predict in the imgs directory.
python predict.py

The output is easy to find. What we need is the data in txts_result/resnet50, which corresponds to the detection boxes for those images.
We filter the boxes we need with TopK_Box.py, where k is a hyperparameter. The original_txt and imgs folders in the capro/dataset directory hold the txt files predicted by the detector in the previous step and the corresponding original images, respectively.
cd ../capro
python TopK_Box.py

The following are optional parameters:
- Model type for SAM
- Input and output paths
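The idea of keeping only the K best boxes can be sketched generically as below. This is an illustration under stated assumptions, not the logic of TopK_Box.py: it assumes each box already carries a numeric reliability score, and the `(score, box)` pair layout, the box tuple, and the `top_k_boxes` name are all hypothetical. How the actual script scores boxes is not described here.

```python
import heapq


def top_k_boxes(scored_boxes, k):
    """Keep the k highest-scoring boxes, best first.

    `scored_boxes` is a list of (score, box) pairs. Both the pair layout and
    the meaning of `score` are assumptions made for this illustration.
    If fewer than k boxes are available, all of them are returned.
    """
    if k <= 0:
        return []
    return heapq.nlargest(k, scored_boxes, key=lambda pair: pair[0])


if __name__ == "__main__":
    # Hypothetical boxes written as (cx, cy, w, h, angle).
    candidates = [
        (0.42, (120, 80, 60, 10, 0.3)),
        (0.91, (40, 200, 90, 12, 1.1)),
        (0.67, (300, 150, 45, 8, -0.4)),
    ]
    for score, box in top_k_boxes(candidates, k=2):
        print(score, box)
```

Note that when fewer than K boxes exist this sketch simply returns what is available, which differs from the behaviour described in the caution below.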
Caution
In some cases, when TopK_Box.py is run on an image whose original number of boxes is less than K, the corresponding txt file is not generated. If this affects only a small share of the images, the original txt file can be used instead. CaPro_SAM.py provides a fallback txt read directory to handle this case.
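The fallback behaviour can be pictured as a two-step lookup: prefer the filtered txt file, and fall back to the original detector output when it is missing. The sketch below is only an illustration of that idea; `resolve_box_file` and its directory arguments are hypothetical names and do not reflect how CaPro_SAM.py is actually implemented.

```python
from pathlib import Path


def resolve_box_file(image_stem, topk_dir, fallback_dir):
    """Return the box txt file to use for one image.

    Looks for `<image_stem>.txt` in `topk_dir` first and in `fallback_dir`
    second. Returns None when neither file exists. All names here are
    hypothetical and chosen for illustration.
    """
    for directory in (topk_dir, fallback_dir):
        candidate = Path(directory) / f"{image_stem}.txt"
        if candidate.is_file():
            return candidate
    return None
```

A caller would typically skip the image, or log a warning, when the function returns None.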
python CaPro_SAM.py

Here is a simple evaluation function to evaluate the quality of the image segmentation. We use the F1 and MIoU metrics.
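For reference, the two metrics can be sketched for a pair of binary masks as follows. This is a minimal pure-Python illustration, not the code in eval.py: the function names are hypothetical, masks are assumed to be equally sized lists of rows of 0/1 values, and MIoU is assumed to mean the average of the foreground and background IoU, which may differ from the definition the script uses.

```python
def confusion_counts(pred, target):
    """Count (TP, FP, FN, TN) over two equally sized 2D binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def f1_score(pred, target):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 1.0 when both masks are empty."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def mean_iou(pred, target):
    """Average IoU over foreground and background (assumed two-class MIoU).

    A class whose union is empty is skipped; 1.0 is returned if no class
    can be scored.
    """
    tp, fp, fn, tn = confusion_counts(pred, target)
    ious = []
    if tp + fp + fn:
        ious.append(tp / (tp + fp + fn))  # foreground IoU
    if tn + fp + fn:
        ious.append(tn / (tn + fp + fn))  # background IoU
    return sum(ious) / len(ious) if ious else 1.0


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 0, 0]]
    target = [[1, 0, 0], [0, 1, 0]]
    print(f1_score(pred, target), mean_iou(pred, target))
```

In practice the same counts would usually be computed with array operations on the loaded mask images; plain lists are used here only to keep the sketch dependency-free.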
python eval.py

- https://github.com/facebookresearch/segment-anything
- https://github.com/TY-Shi/FreeCOS
- https://github.com/ZeroE04/R-CenterNet
- qiangyuchen516@gmail.com (Qiangyu Chen)
- 1248013830@qq.com (Zhuangzhuang Chen)
If you find our paper helpful in your research or applications, please consider citing:
@inproceedings{chen2026capro,
title={{CaPro}: Curvilinear-aware Prompt Learning with Single Unlabeled Image for Cost-effective Curvilinear Structure Segmentation},
author={Chen, Zhuangzhuang and Chen, Qiangyu and Ou, Chubin and Li, Xiaomeng},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2026}
}
