Feat/catalog content sections by ydankner · Pull Request #14 · Tue-StudyOS/StudyPlanner

ydankner · 2026-06-26T08:19:34Z

No description provided.

The catalog Contents field only rendered for courses whose ALMA "Inhalte" tab was a single unlabelled blob (stored as one section literally titled "Inhalte"). Courses with structured, labelled sub-boxes (Lernziele, Qualifikationsziel, ...), which are the majority, produced no "Inhalte"-titled section, so Contents was empty even though the scraped data was present.

Replace the exact-title `_extract_contents` with `_build_content_sections`, which returns every content block (title + text). Blocks are de-duplicated against the Description (shown verbatim) and the Prerequisites (shown in their own section), and the heading that the scraper duplicates into the body is stripped.

Contents becomes a list of {title, text}; the frontend renders each block with a sub-heading and suppresses the generic "Inhalte" wrapper title for unstructured courses.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
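The behaviour described in this commit message can be illustrated with a minimal sketch. It is not the PR's actual `_build_content_sections`: the function name `build_content_sections_sketch`, its parameters, and the whitespace-insensitive comparison are assumptions made for illustration only.

```python
def _normalize(text):
    """Collapse whitespace and case so near-identical blocks compare equal."""
    return " ".join(text.split()).casefold()


def build_content_sections_sketch(sections, description="", prerequisites=""):
    """Return every non-duplicate content block as {"title", "text"}.

    Hypothetical stand-in for the PR's `_build_content_sections`:
    `sections` is assumed to be a list of dicts with "title" and "text".
    """
    # Text already shown elsewhere (Description, Prerequisites) is skipped.
    seen = {_normalize(description), _normalize(prerequisites)}
    seen.discard("")

    result = []
    for section in sections:
        title = section.get("title", "").strip()
        text = section.get("text", "").strip()
        # Strip the heading when the scraper duplicated it into the body.
        if title and text.startswith(title):
            text = text[len(title):].lstrip(" :\n\t")
        key = _normalize(text)
        if not key or key in seen:
            continue
        seen.add(key)
        result.append({"title": title, "text": text})
    return result
```

With this shape, an unstructured course yields a single block titled "Inhalte", and a structured course yields one block per labelled field, which matches the list of {title, text} the frontend is said to render.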

parse_content_page skipped every content box titled "Inhalte", assuming it was tab-navigation chrome. For structured courses (the majority) that box is the actual syllabus, so the real contents (e.g. INF4151's "Aufbauend auf ...") were never scraped; only the labelled sibling fields (Lernziele, ...) were.

Stop dropping "Inhalte" boxes: they have the same shape as the other labelled fields and flow through the existing title-strip and dedup unchanged. Only the genuine sibling tab panes ("Semesterplanung", "Weitere Funktionen") stay skipped, and empty or chrome-only "Inhalte" boxes still collapse to the fallback.

Add a regression test driven by INF4151's real contents-tab HTML (saved as a fixture): it fails before the change (no "Inhalte" section) and passes after, while the labelled fields keep being captured.

Note: production D1 only reflects this after a re-scrape + re-import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
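The filtering rule described here could look roughly like the sketch below. It assumes the boxes have already been extracted from the HTML as (title, text) pairs; `keep_content_boxes` and `SKIPPED_TAB_PANES` are hypothetical names, and the real parse_content_page fallback for empty pages is not modelled.

```python
# Sibling tab panes that are navigation chrome, not course content.
# "Inhalte" is deliberately NOT in this set any more.
SKIPPED_TAB_PANES = {"Semesterplanung", "Weitere Funktionen"}


def keep_content_boxes(boxes):
    """Filter (title, text) pairs down to real content boxes.

    Hypothetical helper: the actual parser works on HTML, and its
    fallback behaviour for pages with no usable boxes is not shown.
    """
    kept = []
    for title, text in boxes:
        title, text = title.strip(), text.strip()
        if title in SKIPPED_TAB_PANES:
            continue  # genuine sibling tab pane
        if not text:
            continue  # empty or chrome-only box carries no content
        kept.append({"title": title, "text": text})
    return kept
```

A regression check in the spirit of the one described would assert that an "Inhalte" box with real text survives alongside the labelled fields.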

Capture the non-obvious facts a future agent needs: the catalog is a snapshot, so scraper changes require a re-scrape + re-import to affect live data; the ALMA catalog is public and parse_content_page is the pure, testable seam (with fixtures under data_collection/alma/tests/); and the local-dev API base URL trap, plus the wrangler --remote preview-token refresh failure mode.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-06-26T08:22:49Z

Deploying studyplaner with Cloudflare Pages

Latest commit:	`bbefd40`
Status:	✅ Deploy successful!
Preview URL:	https://d0c8c9a4.studyplaner.pages.dev
Branch Preview URL:	https://feat-catalog-content-section.studyplaner.pages.dev

ydankner and others added 3 commits June 24, 2026 18:19

ydankner merged commit 4dcb6f4 into main Jun 26, 2026
3 checks passed

ydankner merged 3 commits into main from feat/catalog-content-sections
