
Synthetic_Philly Optimization and Rework #326

Open
santosl529 wants to merge 15 commits into main from lorenzo-philly-data



@santosl529 santosl529 commented Jun 23, 2026


Summary

  • Add a reworked synthetic Philadelphia pipeline with cached spatial/raster city data, batched joblib generation, and Parquet batch outputs instead of DataFrames, to reduce overhead.
  • Optimize destination diary generation with NumPy-backed EPR bookkeeping, candidate-only gravity lookup, and restored baked gravity callables from the city cache.
  • Reduce dense trajectory overhead by avoiding unused Shapely route-boundary construction, reusing path-relative position while an agent follows the same route, and validating street coordinates with a precomputed set.
  • Update the active Philadelphia bounding box to include more of West Philly.
  • Increase the number of agents from 200 to 1000.
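The street-coordinate check mentioned above can be illustrated with a minimal sketch. The PR does not show how street coordinates are represented, so the integer `(x, y)` cells and the names `STREET_CELLS`, `STREET_SET`, and `first_off_street` are assumptions made for illustration only. The idea is to build the lookup set once and then test each trajectory point with an O(1) membership check instead of rescanning the street list.

```python
# Hypothetical street cells as (x, y) integer block coordinates.
# The real pipeline's representation is not shown in this PR.
STREET_CELLS = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)]

# Build the lookup once, up front, rather than per trajectory point.
STREET_SET = frozenset(STREET_CELLS)


def first_off_street(path):
    """Return the index of the first point not on a street, or None."""
    for index, coord in enumerate(path):
        if coord not in STREET_SET:  # average O(1) hash lookup
            return index
    return None


valid_path = [(0, 0), (0, 1), (1, 1)]
broken_path = [(0, 0), (1, 0), (1, 1)]

print(first_off_street(valid_path))   # None
print(first_off_street(broken_path))  # 1
```

A `frozenset` is used here because the street layout is fixed for a given city cache, so the set never needs to change after it is built.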

Performance

Before batched joblib generation was introduced, generating agents, trajectories, and diaries took 1709.27s for a 200-agent, dt=0.5 run. Parallelization cut generation time to about 291s, and the diary/gravity/cache optimizations cut it further to about 104s.

A single-agent dense trajectory profile improved from about 7.43s to 5.11s after the route-overhead cleanup.
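For reference, the speedups implied by the timings above can be worked out directly. This sketch uses only the numbers quoted in this description; the 291s and 104s figures are approximate, so the ratios are approximate too. The helper names are illustrative.

```python
# Timings quoted in the PR description, in seconds.
baseline_s = 1709.27   # 200 agents, dt=0.5, before batched joblib generation
parallel_s = 291.0     # approximate, after parallelization
optimized_s = 104.0    # approximate, after diary/gravity/cache optimizations
dense_before_s = 7.43  # single-agent dense trajectory profile, before
dense_after_s = 5.11   # single-agent dense trajectory profile, after


def speedup(before, after):
    """How many times faster the 'after' run is."""
    return before / after


def reduction_pct(before, after):
    """Percentage of the original runtime that was removed."""
    return 100.0 * (before - after) / before


parallel_speedup = speedup(baseline_s, parallel_s)
overall_speedup = speedup(baseline_s, optimized_s)
optimization_speedup = speedup(parallel_s, optimized_s)
dense_speedup = speedup(dense_before_s, dense_after_s)
dense_reduction = reduction_pct(dense_before_s, dense_after_s)

print(f"parallelization alone: {parallel_speedup:.1f}x")      # 5.9x
print(f"overall generation:    {overall_speedup:.1f}x")       # 16.4x
print(f"optimizations alone:   {optimization_speedup:.1f}x")  # 2.8x
print(f"dense trajectory:      {dense_speedup:.2f}x, "
      f"{dense_reduction:.1f}% less time")                    # 1.45x, 31.2%
```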

Notes

Because the bounding box changed, the existing cached city is stale. To rebuild it, rerun once with:

REGENERATE_SPATIAL_DATA = True
REBUILD_CITY_CACHE = True

Then set both flags back to False after the first run.
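To show why a one-time rebuild is needed, here is a minimal sketch of a cache gated by a rebuild flag. It is not the pipeline's actual code: `build_city`, `load_city`, the pickle file, and the placeholder bounding boxes are all hypothetical. Only the flag name `REBUILD_CITY_CACHE` comes from this PR.

```python
import pickle
import tempfile
from pathlib import Path


def build_city(bbox):
    """Hypothetical stand-in for the expensive city build."""
    return {"bbox": bbox}


def load_city(cache_path, bbox, rebuild=False):
    """Load the cached city, or rebuild and re-cache it when asked to."""
    if cache_path.exists() and not rebuild:
        with cache_path.open("rb") as fh:
            return pickle.load(fh)  # may be stale if bbox has changed
    city = build_city(bbox)
    with cache_path.open("wb") as fh:
        pickle.dump(city, fh)
    return city


# Placeholder bounding boxes, not real Philadelphia coordinates.
old_bbox = (0, 0, 1, 1)
new_bbox = (-1, 0, 1, 1)

with tempfile.TemporaryDirectory() as tmp:
    cache_path = Path(tmp) / "city.pkl"
    load_city(cache_path, old_bbox)                      # first run populates the cache
    stale = load_city(cache_path, new_bbox)              # REBUILD_CITY_CACHE = False
    fresh = load_city(cache_path, new_bbox, rebuild=True)  # REBUILD_CITY_CACHE = True
    later = load_city(cache_path, new_bbox)              # flag reverted to False

print(stale["bbox"])  # (0, 0, 1, 1)
print(fresh["bbox"])  # (-1, 0, 1, 1)
print(later["bbox"])  # (-1, 0, 1, 1)
```

In this sketch the flag only needs to be True for one run, because that run overwrites the cache file and later runs load the rebuilt city from it.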

Closes #298

@santosl529 santosl529 requested a review from paco-barreras June 23, 2026 16:37
@santosl529 santosl529 self-assigned this Jun 23, 2026

Development

Successfully merging this pull request may close these issues.

Implement Philly Dataset Changes
