Merged
37 commits
adf70f4
Updated the CHANGELOG
pkerpedjiev May 3, 2025
8a8df13
Bump version: 0.20.3 → 0.20.4
pkerpedjiev May 3, 2025
acadbbc
Copying over all of the changes from the rhodius repo
pkerpedjiev Mar 1, 2026
0caf964
Added all the rhodius tests
pkerpedjiev Mar 1, 2026
c3d031d
Fixed linting errors
pkerpedjiev Mar 2, 2026
628435e
Fixed tests and added CLAUDE.md
pkerpedjiev Mar 2, 2026
d97cd11
Fixed linting
pkerpedjiev Mar 2, 2026
da30d42
One more linting check
pkerpedjiev Mar 2, 2026
2130aa5
Added dependencies
pkerpedjiev Mar 2, 2026
a94805d
Update reference to s3sqlite to sosqlite
pkerpedjiev Mar 9, 2026
032c0e4
Bumped sosqlite version
pkerpedjiev Mar 10, 2026
95a027e
Get test data
pkerpedjiev Mar 10, 2026
688907b
Added more test data
pkerpedjiev Mar 10, 2026
86e5a2e
Remove some data fetching
pkerpedjiev Mar 12, 2026
eade1b5
Remove some data fetching
pkerpedjiev Mar 12, 2026
0785249
Remove some data fetching
pkerpedjiev Mar 12, 2026
a946e2a
Remove some data fetching
pkerpedjiev Mar 12, 2026
3b6809d
Remove some data fetching
pkerpedjiev Mar 12, 2026
3a3cb1f
Remove some data fetching
pkerpedjiev Mar 12, 2026
31ba87a
Remove some data fetching
pkerpedjiev Mar 12, 2026
01f362f
Migrate test fixtures to Git LFS
pkerpedjiev Mar 12, 2026
051c766
Remove get_test_data.sh
pkerpedjiev Mar 12, 2026
1639093
Add regions.valid.bed.gz and regions.spaces.bed; document LFS workflow
pkerpedjiev Mar 15, 2026
c14a02d
fix: handle null CIGAR strings and pandas 3.x integer indexing
pkerpedjiev Mar 15, 2026
740fd22
Fixes to use the latest oxbow
pkerpedjiev Mar 17, 2026
d72c350
Fix more issues
pkerpedjiev Mar 21, 2026
bac0e9a
Add SRR1770413.sorted.short.bam.bai and regions.valid.bed.gz.tbi to LFS
pkerpedjiev Mar 21, 2026
a8750dc
Use Python 3.12 and pin oxbow>=0.3.1
pkerpedjiev Mar 21, 2026
f86659f
Added .gitignore
pkerpedjiev Mar 21, 2026
fb0cdfa
Fix gff and bam tests
pkerpedjiev Mar 21, 2026
6486b9d
Fix read_ipc call to use BytesIO
pkerpedjiev Mar 23, 2026
3326738
Fix GFF indexed tiles: use from_gff public API and handle None data
pkerpedjiev Mar 23, 2026
c802a74
More fixes
pkerpedjiev Mar 23, 2026
9084798
Bump oxbow version
pkerpedjiev Mar 23, 2026
758a6c1
Added some missing functionality from the newer changes
pkerpedjiev Mar 25, 2026
a0a6937
Add missing import and push
pkerpedjiev Mar 25, 2026
711c4d9
Updated the CHANGELOG
pkerpedjiev Mar 25, 2026
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 0.20.3
+current_version = 0.20.4
tag = True
commit = True

20 changes: 20 additions & 0 deletions .gitattributes
@@ -0,0 +1,20 @@
# Binary genomics data files tracked with Git LFS
*.cool filter=lfs diff=lfs merge=lfs -text
*.mv5 filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.hdf5 filter=lfs diff=lfs merge=lfs -text
*.bam filter=lfs diff=lfs merge=lfs -text
*.bai filter=lfs diff=lfs merge=lfs -text
*.beddb filter=lfs diff=lfs merge=lfs -text
*.bb filter=lfs diff=lfs merge=lfs -text
*.bigWig filter=lfs diff=lfs merge=lfs -text
*.hitile filter=lfs diff=lfs merge=lfs -text
data/*.fna filter=lfs diff=lfs merge=lfs -text
data/*.gff.gz filter=lfs diff=lfs merge=lfs -text
data/*.vcf.gz filter=lfs diff=lfs merge=lfs -text
data/*.bed.gz filter=lfs diff=lfs merge=lfs -text
data/*.bed.1.gz filter=lfs diff=lfs merge=lfs -text
data/*.gz.tbi filter=lfs diff=lfs merge=lfs -text
data/*.multires filter=lfs diff=lfs merge=lfs -text
data/*.gff filter=lfs diff=lfs merge=lfs -text
data/SRR1770413.sorted.short.bam.bai filter=lfs diff=lfs merge=lfs -text
15 changes: 3 additions & 12 deletions .github/workflows/ci.yml
@@ -14,7 +14,7 @@ jobs:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
with:
-python-version: '3.10'
+python-version: '3.12'

- name: Install Dependencies
run: |
@@ -29,21 +29,12 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
-python-version: ['3.9', '3.10']
+python-version: ['3.12']

steps:
- uses: actions/checkout@v3

-- name: Cache Fixtures
-id: cache-fixtures
-uses: actions/cache@v3
with:
-path: data/
-key: ${{ runner.os }}-{{ hashFiles('get_test_data.sh') }}-{{ hashFiles('.gitignore') }}
-
-- name: Download Fixtures
-if: steps.cache-fixtures.outputs.cache-hit != 'true'
-run: ./get_test_data.sh
+lfs: true

- name: Set Up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
33 changes: 32 additions & 1 deletion .gitignore
@@ -1,6 +1,8 @@
notebooks/Scratch.ipynb
notebooks/VCF.ipynb

settings.local.json

*.py[cod]
__pycache__
*~
@@ -40,7 +42,36 @@ Thumbs.db
old
tmp
checkpoint
-data/
+data/*
+!data/Dixon2012-J1-NcoI-R1-filtered.100kb.multires.cool
+!data/hic-resolutions.cool
+!data/sample_htime.json
+!data/gene_annotations.short.db
+!data/wgEncodeCaltechRnaSeqHuvecR1x75dTh1014IlnaPlusSignalRep2.bigWig
+!data/points_density.h5
+!data/corrected.geneListwithStrand.bed.multires
+!data/labels.h5
+!data/SRR1770413.sorted.short.bam
+!data/SRR1770413.sorted.short.bam.bai
+!data/SRR1770413.different_index_filename.bai
+!data/SRR1770413.mismatched_bai.bam
+!data/geneAnnotationsExonUnions.1000.bed.v3.beddb
+!data/masterlist_DHSs_733samples_WM20180608_all_mean_signal_colorsMax.bed.bb
+!data/GCA_000350705.1_Esch_coli_KTE11_V1_genomic.short.fna.fai
+!data/GCA_002918705.1_ASM291870v1_genomic.gff.gz
+!data/genomic.10k.gff
+!data/genomic.10k.gff.gz
+!data/chm13v1.chrom.sizes
+!data/hg38.chrom.sizes
+!data/test.1.vcf.gz
+!data/no_item_rgb.bed
+!data/regions.valid.bed.1.gz
+!data/regions.valid.bed
+!data/regions.valid.bed.gz
+!data/regions.valid.bed.gz.tbi
+!data/regions.spaces.bed
+!data/genomic.10k.gff.gz.tbi
+!data/GCA_000350705.1_Esch_coli_KTE11_V1_genomic.short.fna
output/
COMMANDS
npm-debug.log
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,11 @@
v0.21.0

- Huge set of changes to support file-pointer based tileset functions

v0.20.4

- Fix overflow issue in cooler files

v0.20.3

- Add chromsizes tileset_info function
43 changes: 43 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,43 @@
# Clodius

A Python library and CLI tool for aggregating large genomic datasets into tile-based formats for display at multiple resolutions (used by [HiGlass](https://higlass.io)).

## Project Structure

- `clodius/` — main package
- `cli/` — Click-based CLI commands (`aggregate.py`, `convert.py`)
- `tiles/` — tile generation modules per file type (bigwig, cooler, bed, etc.)
- `models/` — Pydantic data models
- `test/` — pytest tests mirroring the source layout
- `test/sample_data/` — small sample files used by tests

## Development Setup

```shell
pip install -e ".[dev]"
```

## Common Commands

Run all tests:
```shell
pytest
```

Run a specific test:
```shell
pytest test/cli_test.py::test_clodius_aggregate_bedgraph
```

Lint:
```shell
flake8 clodius
```

## Key Conventions

- **Linting**: flake8 (configured via `pyproject.toml`)
- **Tests**: pytest with coverage (`pytest --cov=clodius`)
- **Build**: hatchling
- **Main branch**: `develop` (use this as the base for PRs)
- **Python packaging**: `pyproject.toml` (no `setup.py`)
34 changes: 34 additions & 0 deletions README.md
@@ -55,6 +55,40 @@ install `clodius` with develop mode:
pip install -e ".[dev]"
```

## Test Fixtures (Git LFS)

Test data files in `data/` are stored in [Git LFS](https://git-lfs.com/). They are downloaded automatically when you clone the repository with LFS enabled:

```shell
git lfs install # once per machine
git clone <repo> # LFS files downloaded automatically
# or, in an existing clone:
git lfs pull
```
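
If a clone was made without LFS, the files in `data/` are small text pointers rather than the real binaries, and tests that read them are likely to fail with confusing parse errors. The following is a minimal sketch for spotting that situation. It assumes the standard LFS pointer format, in which the file starts with the line `version https://git-lfs.github.com/spec/v1`. The helper names `is_lfs_pointer` and `unpulled_fixtures` are hypothetical and not part of clodius.

```python
from pathlib import Path

# First line of a Git LFS pointer file (standard pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` looks like an un-pulled Git LFS pointer file.

    Hypothetical helper for illustration; not part of clodius.
    """
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def unpulled_fixtures(data_dir="data"):
    """List files directly under `data_dir` that are still LFS pointers."""
    return sorted(
        str(p)
        for p in Path(data_dir).iterdir()
        if p.is_file() and is_lfs_pointer(p)
    )
```

If `unpulled_fixtures()` returns a non-empty list, running `git lfs pull` should replace the pointers with the real files.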

### Adding a new test fixture

1. **Check if the file type is already tracked** — open [.gitattributes](.gitattributes) and look for a matching pattern (e.g. `data/*.bed.gz`, `*.bam`). If there is none, add a new tracking rule:

```shell
git lfs track "data/*.ext" # adds a line to .gitattributes
git add .gitattributes
```

2. **Allow the file through `.gitignore`** — `data/*` is ignored by default. Add a negation line for your file:

```
!data/your_new_file.ext
```

3. **Stage and commit as normal:**

```shell
git add data/your_new_file.ext
git commit -m "Add test fixture: your_new_file.ext"
git push # LFS objects are uploaded automatically
```
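
Steps 1 and 2 can be checked before committing with two standard Git commands: `git check-attr filter -- <path>` reports which filter applies to a path, and `git check-ignore -q <path>` exits with status 0 when the path is ignored and 1 when it is not. The sketch below wraps both. The function names are hypothetical, and it assumes `git` is on `PATH`, the script runs from the repository root, and a reasonably recent Git version (older versions may treat negated patterns differently in `check-ignore`).

```python
import subprocess


def parse_check_attr(output):
    """Parse `git check-attr` output ('<path>: <attribute>: <value>' per line)
    into a {path: value} dict."""
    values = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split from the right so a path containing ": " stays intact.
        path, _attr, value = line.rsplit(": ", 2)
        values[path] = value
    return values


def is_lfs_tracked(path):
    """True if .gitattributes assigns the `lfs` filter to `path`."""
    out = subprocess.run(
        ["git", "check-attr", "filter", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return "lfs" in parse_check_attr(out).values()


def is_ignored(path):
    """True if .gitignore still excludes `path` (exit status 0)."""
    proc = subprocess.run(["git", "check-ignore", "-q", path])
    if proc.returncode not in (0, 1):
        raise RuntimeError(f"git check-ignore failed for {path!r}")
    return proc.returncode == 0
```

For a new fixture, `is_lfs_tracked("data/your_new_file.ext")` should be true and `is_ignored("data/your_new_file.ext")` should be false before running `git add`.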

## Testing


2 changes: 1 addition & 1 deletion clodius/__init__.py
@@ -1 +1 @@
-__version__ = "0.20.3"
+__version__ = "0.32.0"