Feature/geoprior #80
Draft
mohamedelabbas1996 wants to merge 6 commits into
Initial commit of the global geo-prior (FCNet) pipeline as used for the geoprior-fcnet-global-12317cls-v1 run:
- src/dataset_tools/build_geoprior_json.py: BigQuery -> COCO-style train/val/test JSON for geo-prior training.
- research/geoprior/train_geoprior.py: FCNet training wrapper around fagner-lepsAI/geo_prior, with wandb logging + per-epoch checkpoints.
- research/geoprior/predict_geoprior.py: per-occurrence prior generation.
- research/geoprior/fusion_eval_top5.py: geo-prior + classifier top-5 fusion evaluation.
Scripts committed as-is; cleanup and reproducibility work (category-map builder, vendored geo_prior modules, README, pinned deps, .gitignore) to follow in subsequent commits.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
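For readers unfamiliar with prior/classifier fusion, the following is a minimal sketch of what a top-5 fusion evaluation might look like. The fusion rule in fusion_eval_top5.py is not shown in this PR conversation, so the elementwise product used here is an assumption, and `fuse_and_rank` and `top_k_accuracy` are hypothetical names chosen for illustration.

```python
from typing import List, Sequence


def fuse_and_rank(
    classifier_probs: Sequence[float],
    geo_prior: Sequence[float],
    k: int = 5,
) -> List[int]:
    """Return the class indices of the k highest fused scores.

    Assumption: fusion is an elementwise product of the classifier
    probability and the geo-prior for each class. The real script may
    use a different rule (e.g. weighting or log-space addition).
    """
    if len(classifier_probs) != len(geo_prior):
        raise ValueError("classifier and prior must cover the same class space")
    fused = [p * g for p, g in zip(classifier_probs, geo_prior)]
    # Sort by descending fused score; ties go to the lower class index.
    ranked = sorted(range(len(fused)), key=lambda i: (-fused[i], i))
    return ranked[:k]


def top_k_accuracy(
    predictions: Sequence[Sequence[int]],
    labels: Sequence[int],
) -> float:
    """Fraction of examples whose true label appears in its top-k list."""
    if not labels:
        return 0.0
    hits = sum(1 for topk, y in zip(predictions, labels) if y in topk)
    return hits / len(labels)
```

Both inputs must be indexed by the same class IDs, which is why a shared class space matters for this kind of evaluation.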
The species->class_id map is the frozen class-space contract for the geo-prior FCNet model (output indices are bound to this alphabetical ordering). Commit it as a versioned artifact and add a builder that regenerates+verifies it from BigQuery rather than blindly overwriting.
- research/geoprior/geoprior_categ_map.json: frozen 12,317-class map (snapshot public_gbif_2026-05; 6,864,466 geocoded occurrences).
- research/geoprior/geoprior_categ_map.PROVENANCE.md: source BQ tables, definition, and regenerate/verify commands.
- src/dataset_tools/build_geoprior_categ_map.py: rebuilds the map (and label_map/metadata/master lists) from leps-ai BQ tables; verifies against the frozen artifact and refuses to overwrite on drift.
Verified: the builder reproduces the committed map exactly (12,317 entries identical) and its BQ count query matches master_species_with_counts.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
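The "verify, then refuse to overwrite on drift" behaviour described in this commit can be sketched roughly as follows. This is not the code in build_geoprior_categ_map.py: the function names and the `CategoryMapDrift` exception are hypothetical, the JSON layout (a flat species-to-integer object) is assumed, and Python's default string ordering stands in for whatever collation the real builder uses.

```python
import json
from pathlib import Path
from typing import Dict, Iterable


class CategoryMapDrift(Exception):
    """Raised when a rebuilt map differs from the frozen artifact."""


def build_category_map(species: Iterable[str]) -> Dict[str, int]:
    """Assign class IDs by alphabetical position (assumed collation: code point order)."""
    return {name: idx for idx, name in enumerate(sorted(set(species)))}


def write_if_unchanged(new_map: Dict[str, int], frozen_path: Path) -> bool:
    """Write the map only if no frozen copy exists yet.

    Returns True if a file was written, False if the frozen copy already
    matches. Raises CategoryMapDrift instead of overwriting on mismatch.
    """
    if frozen_path.exists():
        frozen = json.loads(frozen_path.read_text(encoding="utf-8"))
        if frozen != new_map:
            changed = set(frozen.items()) ^ set(new_map.items())
            raise CategoryMapDrift(
                f"{len(changed)} differing entries; refusing to overwrite {frozen_path}"
            )
        return False
    frozen_path.write_text(
        json.dumps(new_map, indent=2, sort_keys=True), encoding="utf-8"
    )
    return True
```

Failing loudly here is deliberate: because model output indices are bound to the ordering, a silently regenerated map would reinterpret every existing checkpoint's predictions.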
Move the geo-prior scripts and frozen artifacts out of the shared src/dataset_tools/ (BQ/vision pipeline) and research/ into a dedicated src/geoprior/ package. Pure relocation; no content changes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- geoprior_fagner/: the geo-prior FCNet network (FCNet, losses, dataloader) kept in-tree, frozen at upstream commit ff4ccd1 (Apache-2.0). Removes the dependency on an external clone.
- config.py: central configuration read from environment / src/geoprior/.env. No machine-specific paths or secrets in code.
- .env.sample: annotated configuration template (real .env is gitignored).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
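As an illustration of the "environment / .env" configuration pattern mentioned for config.py, here is a small standard-library sketch. It is not the PR's config.py: `parse_env_file` and `get_setting` are hypothetical helpers, the setting name in the usage note is made up, and letting real environment variables take precedence over the file is an assumption.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values: Dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_setting(
    name: str,
    file_values: Mapping[str, str],
    default: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Look up a setting: process environment first, then .env file, then default."""
    if name in environ:
        return environ[name]
    return file_values.get(name, default)
```

A script could then call something like `get_setting("GEOPRIOR_BQ_PROJECT", parse_env_file(Path("src/geoprior/.env")))`, where the variable name is purely illustrative.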
…rdcoded paths
- All scripts now import the in-tree geoprior_fagner network and read paths, BigQuery project/dataset, and W&B settings from config (no /mnt or /home paths, no sys.path hack, no secrets in code).
- build_geoprior_json now reads the in-repo frozen category map (was /mnt).
- Fix stale docstring: MIN_OCC_PER_SPECIES default is 0 (keep all species).
- .gitignore: ignore model checkpoints (*.pth).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- README.md: end-to-end documentation (layout, .env config, each pipeline stage, inputs, model details, reproducibility).
- requirements.txt: pinned dependencies for the build and training stages.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
No description provided.