Feature/geoprior #80

Draft
mohamedelabbas1996 wants to merge 6 commits into
feature/bq-squashfs-pipeline from
feature/geoprior

@mohamedelabbas1996
No description provided.

mohamedelabbas1996 and others added 6 commits June 4, 2026 20:27
Initial commit of the global geo-prior (FCNet) pipeline as used for the
geoprior-fcnet-global-12317cls-v1 run:

- src/dataset_tools/build_geoprior_json.py: BigQuery -> COCO-style
  train/val/test JSON for geo-prior training.
- research/geoprior/train_geoprior.py: FCNet training wrapper around
  fagner-lepsAI/geo_prior, with wandb logging + per-epoch checkpoints.
- research/geoprior/predict_geoprior.py: per-occurrence prior generation.
- research/geoprior/fusion_eval_top5.py: geo-prior + classifier top-5
  fusion evaluation.

Scripts committed as-is; cleanup and reproducibility work (category-map
builder, vendored geo_prior modules, README, pinned deps, .gitignore) to
follow in subsequent commits.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
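
The contents of fusion_eval_top5.py are not visible in this conversation view. As a rough illustration only, one common way to fuse a geo-prior with classifier scores is an element-wise product followed by renormalisation, then a top-5 hit check. The function names and the fallback rule below are assumptions made for this sketch, not the script's actual code.

```python
from typing import List, Sequence


def fuse_scores(classifier_probs: Sequence[float],
                geo_prior: Sequence[float]) -> List[float]:
    """Multiply per-class classifier probabilities by the prior, then renormalise."""
    if len(classifier_probs) != len(geo_prior):
        raise ValueError("classifier and prior must share one class space")
    fused = [p * g for p, g in zip(classifier_probs, geo_prior)]
    total = sum(fused)
    if total == 0:
        # Assumed fallback: the prior excluded every class, so keep the classifier scores.
        return list(classifier_probs)
    return [f / total for f in fused]


def top_k(scores: Sequence[float], k: int = 5) -> List[int]:
    """Indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def top_k_accuracy(all_scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int = 5) -> float:
    """Fraction of examples whose true label is among the top-k indices."""
    hits = sum(1 for s, y in zip(all_scores, labels) if y in top_k(s, k))
    return hits / len(labels)
```

Comparing `top_k_accuracy` on the raw classifier scores against the fused scores gives the kind of before/after number such an evaluation would report.
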
The species->class_id map is the frozen class-space contract for the
geo-prior FCNet model (output indices are bound to this alphabetical
ordering). Commit it as a versioned artifact and add a builder that
regenerates and verifies it from BigQuery rather than blindly overwriting.

- research/geoprior/geoprior_categ_map.json: frozen 12,317-class map
  (snapshot public_gbif_2026-05; 6,864,466 geocoded occurrences).
- research/geoprior/geoprior_categ_map.PROVENANCE.md: source BQ tables,
  definition, and regenerate/verify commands.
- src/dataset_tools/build_geoprior_categ_map.py: rebuilds the map (and
  label_map/metadata/master lists) from leps-ai BQ tables; verifies
  against the frozen artifact and refuses to overwrite on drift.

Verified: the builder reproduces the committed map exactly (12,317
entries identical) and its BQ count query matches master_species_with_counts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
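
The "verify, refuse to overwrite on drift" behaviour described above can be sketched roughly as follows. This is a minimal sketch under assumptions: the helper names are hypothetical, the species list is passed in directly rather than queried from BigQuery, and Python's default string sort stands in for whatever alphabetical ordering the real builder uses.

```python
import json
from pathlib import Path
from typing import Dict, Iterable


def build_categ_map(species_names: Iterable[str]) -> Dict[str, int]:
    """Assign class ids by sorted order of the unique species names."""
    return {name: idx for idx, name in enumerate(sorted(set(species_names)))}


def assert_no_drift(rebuilt: Dict[str, int], frozen: Dict[str, int]) -> None:
    """Raise if the rebuilt map differs from the frozen one in any entry."""
    if rebuilt == frozen:
        return
    added = sorted(set(rebuilt) - set(frozen))
    removed = sorted(set(frozen) - set(rebuilt))
    moved = sorted(n for n in set(rebuilt) & set(frozen) if rebuilt[n] != frozen[n])
    raise RuntimeError(
        f"category map drift: {len(added)} added, {len(removed)} removed, "
        f"{len(moved)} re-indexed; refusing to overwrite"
    )


def write_or_verify(rebuilt: Dict[str, int], frozen_path: Path) -> None:
    """Write the map on first build; afterwards only verify it."""
    if frozen_path.exists():
        assert_no_drift(rebuilt, json.loads(frozen_path.read_text()))
    else:
        frozen_path.write_text(json.dumps(rebuilt, indent=2, sort_keys=True))
```

Because model output indices are tied to the ordering, a single inserted species shifts every later id, which is why the sketch reports re-indexed entries separately from additions and removals.
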
Move the geo-prior scripts and frozen artifacts out of the shared
src/dataset_tools/ (BQ/vision pipeline) and research/ into a dedicated
src/geoprior/ package. Pure relocation — no content changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- geoprior_fagner/: the geo-prior FCNet network (FCNet, losses, dataloader)
  kept in-tree, frozen at upstream commit ff4ccd1 (Apache-2.0). Removes the
  dependency on an external clone.
- config.py: central configuration read from environment / src/geoprior/.env.
  No machine-specific paths or secrets in code.
- .env.sample: annotated configuration template (real .env is gitignored).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
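
For readers unfamiliar with the pattern, a configuration module that reads from the environment with a .env fallback might look roughly like this. It is a stdlib-only sketch: the function names and the variable name in the usage note are hypothetical, the precedence (process environment over .env file) is an assumption, and the real config.py may use a library instead.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def parse_env_file(path: Path) -> Dict[str, str]:
    """Return the parsed .env contents, or an empty dict if the file is absent."""
    return parse_env_text(path.read_text()) if path.exists() else {}


def load_setting(name: str,
                 file_values: Mapping[str, str],
                 environ: Optional[Mapping[str, str]] = None,
                 default: Optional[str] = None) -> str:
    """Look up a setting: environment first, then .env values, then the default."""
    env = os.environ if environ is None else environ
    if name in env:
        return env[name]
    if name in file_values:
        return file_values[name]
    if default is not None:
        return default
    raise KeyError(f"missing required setting: {name}")
```

A caller would do something like `load_setting("GEOPRIOR_BQ_PROJECT", parse_env_file(Path(".env")))`, where the variable name is made up for this example. Failing loudly on a missing required setting keeps machine-specific paths out of the code without silently falling back to a wrong default.
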
…rdcoded paths

- All scripts now import the in-tree geoprior_fagner network and read paths,
  BigQuery project/dataset, and W&B settings from config (no /mnt or /home
  paths, no sys.path hack, no secrets in code).
- build_geoprior_json now reads the in-repo frozen category map (was /mnt).
- Fix stale docstring: MIN_OCC_PER_SPECIES default is 0 (keep all species).
- .gitignore: ignore model checkpoints (*.pth).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- README.md: end-to-end documentation (layout, .env config, each pipeline
  stage, inputs, model details, reproducibility).
- requirements.txt: pinned dependencies for the build and training stages.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai
coderabbitai Bot commented Jun 5, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c18ca22b-cb16-42e4-bd90-65d85e90ec1c

@mohamedelabbas1996 mohamedelabbas1996 changed the base branch from main to feature/bq-squashfs-pipeline June 5, 2026 22:11