Conversation
…ator and streamline debug handling
…rity and parameter naming
…auto-download functionality
…e-tuned and stock modes
- Deleted `graph_utils.py`, which contained functions for adjacency matrix creation and normalization.
- Removed `lifter3d.py`, which included keypoint processing, 3D triangulation, and visualization functions.
- Eliminated `mocap_dataset.py`, which defined the `MocapDataset` class for handling motion capture data.
… root path accordingly
…uperAnimalEstimator
…uperAnimalEstimator
… and reuse across images, improving efficiency and clarity.
…h/CUDA installation notes
deruyter92
left a comment
Great PR which definitely improves the package. I really like the addition of the fine-tuned SuperAnimal 2D model!
A few remarks:
- small bug in partial cleanup for rat7m
- the lazy downloading from Hugging Face is not working as I think you intended
- the `predict()` method should be cleaned up a bit
- it would be great if you added tests for the new auto-download branch
Overall good PR! See comments
| def build_2d_estimator(): | ||
| """Build the 2D pose estimator once. Snapshot resolves lazily on first predict. | ||
|
|
||
| Empty --saved_2d_model_path -> auto-download fine-tuned snapshot from HF. | ||
| Non-empty path -> use as a local override. | ||
| """ | ||
| from fmpose3d.common.config import SuperAnimalConfig | ||
| from fmpose3d.inference_api.fmpose3d import SuperAnimalEstimator | ||
| from fmpose3d.utils.weights import resolve_weights_path | ||
|
|
Well done refactoring this: way cleaner, and also more efficient! Few comments:
- The docstring seems to contain an error: the statement "snapshot resolves lazily on first predict" is not correct, since it is resolved immediately.
- `resolve_weights_path` seems to download from HF directly when given an empty path, which is inconsistent with the approach elsewhere (letting the predict method trigger the download).
- Minor nitpick: I think the imports in this case can stay at the top of the file. I would lazily import only heavy packages (like deeplabcut) or modules that are very specific to a single function. These are all lightweight central helpers, so they belong at the top of the file instead.
| def build_2d_estimator(): | |
| """Build the 2D pose estimator once. Snapshot resolves lazily on first predict. | |
| Empty --saved_2d_model_path -> auto-download fine-tuned snapshot from HF. | |
| Non-empty path -> use as a local override. | |
| """ | |
| from fmpose3d.common.config import SuperAnimalConfig | |
| from fmpose3d.inference_api.fmpose3d import SuperAnimalEstimator | |
| from fmpose3d.utils.weights import resolve_weights_path | |
| def build_2d_estimator(): | |
| """Build the 2D pose estimator once. | |
| Empty --saved_2d_model_path -> auto-download fine-tuned snapshot from HF. | |
| Non-empty path -> use as a local override. | |
| """ | |
| pose_snapshot_path = cfg.pose_snapshot_path | ||
| if not pose_snapshot_path and cfg.auto_download_finetuned: | ||
| from fmpose3d.utils.weights import resolve_weights_path | ||
| pose_snapshot_path = resolve_weights_path("", "sa_finetune_hrnet_w32.pt") |
When auto-download is True and no path is provided, `resolve_weights_path` is called on every predict call (i.e. `hf_hub_download` checks the local cache on every call).
I think this could add up for videos with many frames. Instead, this should be resolved once (the first predict call)! e.g. you could define an attribute in __init__ that contains the downloaded weights path after the first download? or a simple flag.
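The suggestion above (resolve once, on the first predict call) could be sketched roughly as follows. This is only an illustration, not the PR's actual implementation: `DemoConfig`, `LazySnapshotEstimator`, `_get_snapshot_path`, and `fake_resolver` are hypothetical names, and the resolver is a network-free stand-in for a function like `resolve_weights_path`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DemoConfig:
    """Hypothetical stand-in for the estimator config discussed above."""
    pose_snapshot_path: str = ""
    auto_download_finetuned: bool = True


class LazySnapshotEstimator:
    """Resolves the snapshot path at most once, on the first predict() call."""

    def __init__(self, cfg: DemoConfig, resolver: Callable[[str, str], str]):
        self.cfg = cfg
        self._resolver = resolver  # stand-in for a resolve_weights_path-like function
        self._resolved_path: Optional[str] = None  # filled in on first use

    def _get_snapshot_path(self) -> str:
        if self._resolved_path is None:
            path = self.cfg.pose_snapshot_path
            if not path and self.cfg.auto_download_finetuned:
                path = self._resolver("", "sa_finetune_hrnet_w32.pt")
            self._resolved_path = path  # cache, so later calls skip the resolver
        return self._resolved_path

    def predict(self, frame):
        snapshot = self._get_snapshot_path()
        return snapshot, frame  # placeholder for real inference


calls = []


def fake_resolver(local_path: str, filename: str) -> str:
    """Network-free stub that records how often it is invoked."""
    calls.append(filename)
    return f"/tmp/cache/{filename}"


estimator = LazySnapshotEstimator(DemoConfig(), fake_resolver)
for frame in range(3):
    estimator.predict(frame)
```

Construction stays cheap because nothing is resolved in `__init__`, and the resolver runs once no matter how many frames are processed.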
| # Fine-tuned mode: non-empty resolved path swaps the stock 39-joint head | ||
| # for a custom DLC checkpoint that predicts the 26-joint Animal3D layout | ||
| # natively (no _map_keypoints needed). | ||
| is_finetuned = bool(pose_snapshot_path) |
Same here, this can be resolved in __init__. (right now, all information is derived from a static config, which is available at initialization time)
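As a rough illustration of deriving the mode flag at construction time, the sketch below computes it in `__init__` from a static config. The class names are hypothetical, and it assumes that fine-tuned mode applies whenever a snapshot path is given or auto-download is enabled; the 39-joint and 26-joint counts come from the code comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical static config; field names mirror the diff above."""
    pose_snapshot_path: str = ""
    auto_download_finetuned: bool = False


class ModeAwareEstimator:
    def __init__(self, cfg: DemoConfig):
        self.cfg = cfg
        # Assumption: an explicit path or auto-download both imply fine-tuned mode.
        self.is_finetuned = bool(cfg.pose_snapshot_path) or cfg.auto_download_finetuned
        # Stock head predicts 39 joints; the fine-tuned head predicts 26.
        self.num_joints = 26 if self.is_finetuned else 39


stock = ModeAwareEstimator(DemoConfig())
auto = ModeAwareEstimator(DemoConfig(auto_download_finetuned=True))
override = ModeAwareEstimator(DemoConfig(pose_snapshot_path="/w/custom.pt"))
```

Since the flag depends only on the config, no network access or file lookup is needed to compute it.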
| def resolve_weights_path(model_weights_path: str, model_type: str) -> str: | ||
| def resolve_weights_path(local_path: str, filename: str) -> str: |
I think it's fine right now (since probably nobody is using this function yet), but we should be careful with renaming keyword arguments, as that can break people's scripts.
I.e. this is not backward compatible for people who used to handle the weights in their own scripts:
from fmpose3d.utils import resolve_weights_path
configured_path = ""
my_weights_path = resolve_weights_path(model_weights_path=configured_path) # <- breaks now!
or more concerning:
from fmpose3d.utils import resolve_weights_path
my_weights_path = resolve_weights_path(model_type="fmpose3d_humans") # <- breaks now!
TL;DR: I think it's fine for now, as you updated all the call sites internally, but be aware that people might use these public functions in their own scripts as well. We should try to keep all public functions backward compatible whenever possible.
In case this happens in the future, we could add a deprecation warning for cases that are more impactful than this minor change.
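One possible shape for such a deprecation path is sketched below. It is an assumption-laden illustration rather than the project's code: `resolve_weights_path_compat` is a hypothetical name, the mapping from the old `model_type` to a `<model_type>.pth` filename is a guess, and the download step is replaced by a stub string.

```python
import warnings

_UNSET = object()  # sentinel: distinguishes "not passed" from an empty string


def resolve_weights_path_compat(local_path="", filename="", *,
                                model_weights_path=_UNSET, model_type=_UNSET):
    """Hypothetical shim that accepts both the new and the old keyword names."""
    if model_weights_path is not _UNSET:
        warnings.warn(
            "'model_weights_path' is deprecated; use 'local_path'",
            DeprecationWarning, stacklevel=2)
        local_path = model_weights_path
    if model_type is not _UNSET:
        warnings.warn(
            "'model_type' is deprecated; use 'filename'",
            DeprecationWarning, stacklevel=2)
        # Assumption: the old model type maps to "<model_type>.pth".
        filename = f"{model_type}.pth"
    if local_path:
        return local_path
    # Stub: a real implementation would download `filename` here.
    return f"<would download {filename}>"


# Old-style keyword call still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = resolve_weights_path_compat(model_type="fmpose3d_humans")

# New-style call is silent and yields the same result.
new_style = resolve_weights_path_compat("", "fmpose3d_humans.pth")
```

The sentinel lets the shim tell an omitted argument apart from an explicitly passed empty string, which matters here because an empty path is a meaningful value.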
Good point! I’ll keep this in mind for future changes.
| # Default to fine-tuned + lazy HF auto-download so the animal API | ||
| # works out-of-the-box. Construction stays cheap (no network); | ||
| # the download fires on the first predict() call. | ||
| return ( | ||
| SuperAnimalEstimator(SuperAnimalConfig(auto_download_finetuned=True)), | ||
| AnimalPostProcessor(), | ||
| ) | ||
| return HRNetEstimator(), HumanPostProcessor() |
This seems to be inconsistent with how vis_animals.py resolves the path.
- Here, it is allowed to be handled lazily in the `predict()` method.
- In `build_2d_estimator()` the weights are downloaded directly and passed as `pose_snapshot_path`.
See my other comments in vis_animals.py. I think you intended the lazy handling in both, and I agree that it is probably better!
Fixed! Yes, I intended to use lazy handling.
Co-authored-by: Jaap de Ruyter van Steveninck <32810691+deruyter92@users.noreply.github.com>
…and improve snapshot loading logic
…napshot path handling
…ptured kwargs handling
…ation and path resolution
Co-authored-by: Jaap de Ruyter van Steveninck <32810691+deruyter92@users.noreply.github.com>
…FMPose3DInference tests
Pull request overview
This PR updates the animal pipeline to support a fine-tuned SuperAnimal-Quadruped 2D model that natively predicts the 26-joint Animal3D layout, adds lazy Hugging Face checkpoint auto-downloads, improves demo ergonomics/performance, and cleans up legacy assets/code while fixing HRNet CPU loading behavior.
Changes:
- Add fine-tuned SuperAnimal support (26-joint native output) with lazy Hugging Face auto-download and new `SuperAnimalConfig` controls.
- Refactor animal demo to construct/reuse 2D estimator + 3D lifter once; update scripts/docs toward Animal3D defaults.
- Improve CPU-only HRNet loading via device-aware `map_location`, pin PyTorch/torchvision versions, and remove unused legacy YOLO/animal modules/assets.
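For readers unfamiliar with the device-aware `map_location` pattern mentioned in the last item, a minimal sketch follows. The helper names are hypothetical and this is not the code from `hrnet.py`; it only assumes the standard `torch.load(..., map_location=...)` and `torch.cuda.is_available()` calls.

```python
def select_device_name(cuda_available: bool) -> str:
    """Pick the device string; pure helper so it can be checked without PyTorch."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint_on_available_device(checkpoint_path: str):
    """Load a checkpoint onto whichever device is actually present."""
    import torch  # imported lazily so this module stays importable without PyTorch

    device = torch.device(select_device_name(torch.cuda.is_available()))
    # map_location remaps tensors saved on a GPU so they can load on a CPU-only machine.
    state = torch.load(checkpoint_path, map_location=device)
    return state, device
```

Inputs would then be moved with `.to(device)` before the forward pass, so the model and its inputs live on the same device.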
Reviewed changes
Copilot reviewed 33 out of 34 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/test_config.py | Update expected dataset defaults from Rat7M to Animal3D. |
| tests/fmpose3d_api/test_fmpose3d.py | Extend API tests for fine-tuned SuperAnimal behavior + lazy HF resolution. |
| scripts/FMPose3D_main.py | Update weight resolution callsite to pass explicit filename. |
| README.md | Document Python 3.10 recommendation and PyTorch/CUDA install behavior; clarify animal demo auto-download. |
| pyproject.toml | Restrict Python versions; pin torch/torchvision ranges; update codespell ignore list. |
| fmpose3d/utils/weights.py | Refactor weights resolver to accept explicit filename for HF download. |
| fmpose3d/lib/yolov3/data/voc.names | Remove unused legacy YOLO asset. |
| fmpose3d/lib/yolov3/cfg/yolo.cfg | Remove unused legacy YOLO config. |
| fmpose3d/lib/yolov3/cfg/yolo-voc.cfg | Remove unused legacy YOLO config. |
| fmpose3d/lib/yolov3/cfg/tiny-yolo-voc.cfg | Remove unused legacy YOLO config. |
| fmpose3d/lib/hrnet/lib/utils/utilitys.py | Remove unused import tied to deleted COCO↔H36M helper module. |
| fmpose3d/lib/hrnet/lib/utils/coco_h36m.py | Remove unused legacy keypoint conversion helpers. |
| fmpose3d/lib/hrnet/hrnet.py | Fix HRNet checkpoint loading and inference input device placement for CPU-only runs. |
| fmpose3d/lib/hrnet/gen_kpts.py | Fix HRNet checkpoint loading and inference input device placement for CPU-only runs. |
| fmpose3d/inference_api/README.md | Update estimator docs to describe fine-tuned vs stock SuperAnimal modes. |
| fmpose3d/inference_api/fmpose3d.py | Implement fine-tuned SuperAnimal path, lazy HF resolution, and default animal components using it. |
| fmpose3d/common/config.py | Expand SuperAnimalConfig with fine-tuned snapshot/config options and behavior docs. |
| fmpose3d/animals/models/graph_frames.py | Remove Rat7M references/test code and align docs with Animal3D focus. |
| fmpose3d/animals/configs/sa_finetune_hrnet_w32.yaml | Add bundled DLC pytorch_config.yaml for the fine-tuned 26-joint model. |
| fmpose3d/animals/configs/__init__.py | Export packaged config path constants for fine-tuned DLC model configs. |
| fmpose3d/animals/common/mocap_dataset.py | Remove unused legacy dataset base class. |
| fmpose3d/animals/common/lifter3d.py | Remove unused legacy triangulation/visualization utilities. |
| fmpose3d/animals/common/graph_utils.py | Remove unused legacy graph adjacency helpers. |
| fmpose3d/animals/common/arguments.py | Default to Animal3D; add CLI overrides for 2D fine-tuned snapshot/config; remove Rat7M branch. |
| fmpose3d/animals/common/arber_dataset.py | Remove unused legacy dataset implementation. |
| fmpose3d/animals/common/animal_visualization.py | Remove unused legacy visualization helpers. |
| demo/vis_in_the_wild.py | Update weight resolution callsite to pass explicit filename. |
| animals/scripts/test_animal3d.sh | Default to HF auto-download (empty saved_model_path) and improve quoting/comments. |
| animals/scripts/main_animal3d.py | Use shared HF weights resolver; improve test output directory behavior. |
| animals/README.md | Update animal docs for HF auto-downloads and fix a typo. |
| animals/demo/vis_animals.sh | Add 2D snapshot override flag and default to HF auto-download for both 2D/3D. |
| animals/demo/vis_animals.py | Refactor to reuse estimator/lifter and integrate fine-tuned SuperAnimal estimator in the demo. |
| .github/workflows/codespell.yml | Add `mot` to codespell ignore words. |
| # predict() returns (kpts (1, N, 26, 2), scores (1, N, 26), valid_mask (N,)). | ||
| kpts, _scores, _mask = estimator.predict(img_bgr[None]) | ||
| # Pack into the {img_path: (1, 26, 2)} format expected by the save/vis code below. | ||
| mapped_keypoints = {path: kpts[:, 0, :, :]} |
| def build_3d_lifter(): | ||
| """Build the 3D lifter once and return (model, device). | ||
|
|
||
| Empty --saved_model_path -> auto-download fmpose3d_animals.pth from HF. | ||
| Non-empty path is used as a local override. | ||
| """ |
Fixed! I have updated the docstring
| """ | ||
| FMPose3D: monocular 3D Pose Estimation via Flow Matching | ||
|
|
||
| Official implementation of the paper: | ||
| "FMPose3D: monocular 3D Pose Estimation via Flow Matching" | ||
| by Ti Wang, Xiaohang Yu, and Mackenzie Weygandt Mathis | ||
| Licensed under Apache 2.0 | ||
| """ | ||
|
|
||
| """Shared helpers for resolving / downloading FMPose3D model weights.""" | ||
| """Shared helper for resolving / downloading FMPose3D model weights.""" | ||
|
|
||
| HF_REPO_ID: str = "MLAdaptiveIntelligence/FMPose3D" |
…ecks in get_pose2D and get_3D_pose_from_image functions
deruyter92
left a comment
Thanks for addressing all the points. Good work on the _get_pose_snapshot_path() implementation. Looks all good now!
Summary
This PR adds first-class support for the fine-tuned SuperAnimal-Quadruped 2D pose model used by the animal pipeline, enabling direct 26-joint Animal3D keypoint prediction and automatic checkpoint download from Hugging Face. It also improves the out-of-the-box demo/install path, fixes CPU fallback for the human HRNet demo, and removes unused legacy code/assets.
Changes
- `sa_finetune_hrnet_w32.pt` for 2D animal pose
- `fmpose3d_animals.pth` for the 3D lifter
- … `animals/demo/vis_animals.py` to build the 2D estimator and 3D lifter once, then reuse them across images.
- … `SuperAnimalConfig` options for fine-tuned checkpoints, detector overrides, and lazy Hugging Face resolution.
- … `map_location` and moving inputs to the model device.
- … `torch>=2.4.1,<2.5` and `torchvision>=0.19.1,<0.20`, and document the PyTorch/CUDA behavior in the README.
- … `>=3.10,<3.13`; README recommends Python 3.10 because install/demo paths were tested there.
- … `mot` to the codespell ignore list.
Validation
Ran install, test, and demo checks locally:
python3 -m pip install -e '.[animals,viz]' --dry-run
python3 -m pytest tests/test_demo_human.py tests/fmpose3d_api/test_fmpose3d.py -q
python3 -m pytest tests/test_model.py tests/test_training_pipeline.py -q
bash demo/vis_in_the_wild.sh
bash animals/demo/vis_animals.sh
Results
`78 passed` for human demo/API tests, `8 passed` for model/training smoke tests.