-The course site is built with Jekyll and hosted on GitHub Pages. The landing page is [`index.md`](index.md). The course manual is [`course_manual.md`](course_manual.md), project guidelines are in [`project_guidelines.md`](project_guidelines.md), and the six group projects live in [`projects/`](projects/).
+Lecturer: [Javier Garcia-Bernardo](https://javier.science/), Assistant Professor of Social Data Science, Department of Methodology & Statistics, Utrecht University.
+
+The course site is built with Jekyll and hosted on GitHub Pages. The landing page is [index.md](index.md). The course manual is [course_manual.md](course_manual.md), project guidelines are in [project_guidelines.md](project_guidelines.md), and the six group projects live in [projects/](projects/).
_config.yml (5 additions, 1 deletion)
@@ -1,5 +1,9 @@
 title: PCD
-description: Materials for the Methodology and Statistics master course <i>Processing Complex Data</i>.
+description: Materials for the Methodology and Statistics master course Processing Complex Data.
+sidebar: >-
+  <span class="sidebar-summary">This webpage contains all materials for the Methodology and Statistics master course <i>Processing Complex Data</i> (PCD). The materials on this website are <a href="https://creativecommons.org/licenses/by/4.0/">CC-BY-4.0 licensed</a>.</span>
+  <span class="sidebar-block"><strong>Lecturer</strong><br><a href="https://javier.science/">Javier Garcia-Bernardo</a><br>Assistant Professor of Social Data Science<br>Department of Methodology & Statistics<br>Utrecht University</span>
-This webpage contains all materials for the Methodology and Statistics master course **Processing Complex Data (PCD)**. The materials on this website are [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) licensed.
-**Lecturer:** [Javier Garcia-Bernardo](https://javier.science/), Assistant Professor of Social Data Science, Department of Methodology & Statistics, Utrecht University.

 ## About the course
Contrary to what most introductory data science and statistics courses teach, real-world scientific data come in an enormous variety of formats, sizes, structures, and procedures — from simple tables to spatiotemporal arrays, normalized relational schemas, nested API responses, raw scraped web pages, networks, and domain-specific scientific standards. This course gives students hands-on experience with handling, processing, and modelling six families of complex data, in a hackathon-style format where each group goes deep on one data type and teaches the rest of the class.
 The narrative spine of the course is *from raw traces to defensible claims*. Each group works through a single pipeline: raw source → operationalized clean object → baseline model with one sensitivity check → presentation.

-## Course materials
-
-[Course manual](course_manual.md) — official course description, learning goals, assessment, and materials.
-| Geospatial | [`projects/geospatial.md`](projects/geospatial.md) | What is the relation between municipal land use and population composition? |
-| Networks | [`projects/networks.md`](projects/networks.md) | What is the relationship between gender and cross-program relations in high school? |
-| Messy web text | [`projects/messy_web_text.md`](projects/messy_web_text.md) | Do company sustainability pages differ linguistically from public-interest climate information pages? |
-| Relational database | [`projects/relational_database.md`](projects/relational_database.md) | Which driver, constructor, grid, circuit, and season characteristics are associated with F1 finishing points? |
-| Time series | [`projects/time_series.md`](projects/time_series.md) | How does an fMRI signal change across NSD scan sessions? |
-| API data | [`projects/api_data.md`](projects/api_data.md) | Which study attributes are associated with completed versus ongoing clinical trials? |
+| Geospatial | [projects/geospatial.md](projects/geospatial.md) | What is the relation between municipal land use and population composition? |
+| Networks | [projects/networks.md](projects/networks.md) | What is the relationship between gender and cross-program relations in high school? |
+| Messy web text | [projects/messy_web_text.md](projects/messy_web_text.md) | Do company sustainability pages differ linguistically from public-interest climate information pages? |
+| Relational database | [projects/relational_database.md](projects/relational_database.md) | Which driver, constructor, grid, circuit, and season characteristics are associated with F1 finishing points? |
+| Time series | [projects/time_series.md](projects/time_series.md) | How does an fMRI signal change across NSD scan sessions? |
+| API data | [projects/api_data.md](projects/api_data.md) | Which study attributes are associated with completed versus ongoing clinical trials? |
projects/api_data.md (1 addition, 1 deletion)
@@ -5,7 +5,7 @@
 - Programming language: `R` (suggested) or `python` (allowed)
 - Expert contact: `Javier Garcia-Bernardo`

-> **Canonical course conventions live in [`project_guidelines.md`](../project_guidelines.md).** That file is the source of truth for the four required workflow files (`week1_explore.qmd`, `week2_operationalize_clean.qmd`, `week3_model.qmd`, `week4_storytelling.qmd`), the `data/model_data.rds` -> `data/model_results.rds` pipeline, the raw-data policy, quality-check requirements, decision logs, and contribution tracking. Read it before starting and treat anything below as project-specific guidance on top of those conventions.
+> **Canonical course conventions live in [project_guidelines.md](../project_guidelines.md).** That file is the source of truth for the four required workflow files (`week1_explore.qmd`, `week2_operationalize_clean.qmd`, `week3_model.qmd`, `week4_storytelling.qmd`), the `data/model_data.rds` -> `data/model_results.rds` pipeline, the raw-data policy, quality-check requirements, decision logs, and contribution tracking. Read it before starting and treat anything below as project-specific guidance on top of those conventions.
projects/geospatial.md (1 addition, 1 deletion)
@@ -5,7 +5,7 @@
 - Programming language: `R` (suggested) or `python` (allowed)
 - Expert contact: TBD, Marco Helvich

-> **Canonical course conventions live in [`project_guidelines.md`](../project_guidelines.md).** That file is the source of truth for the four required workflow files (`week1_explore.qmd`, `week2_operationalize_clean.qmd`, `week3_model.qmd`, `week4_storytelling.qmd`), the `data/model_data.rds` -> `data/model_results.rds` pipeline, the raw-data policy, quality-check requirements, decision logs, and contribution tracking. Read it before starting and treat anything below as project-specific guidance on top of those conventions.
+> **Canonical course conventions live in [project_guidelines.md](../project_guidelines.md).** That file is the source of truth for the four required workflow files (`week1_explore.qmd`, `week2_operationalize_clean.qmd`, `week3_model.qmd`, `week4_storytelling.qmd`), the `data/model_data.rds` -> `data/model_results.rds` pipeline, the raw-data policy, quality-check requirements, decision logs, and contribution tracking. Read it before starting and treat anything below as project-specific guidance on top of those conventions.
projects/messy_web_text.md (1 addition, 1 deletion)
@@ -5,7 +5,7 @@
 - Programming language: `python` (suggested) or `R` (allowed)
 - Expert contact: TBD, Anastasia?

-> **Canonical course conventions live in [`project_guidelines.md`](../project_guidelines.md).** That file is the source of truth for the four required workflow files (`week1_explore.qmd`, `week2_operationalize_clean.qmd`, `week3_model.qmd`, `week4_storytelling.qmd`), the `data/model_data.rds` -> `data/model_results.rds` pipeline, the raw-data policy, quality-check requirements, decision logs, and contribution tracking. Read it before starting and treat anything below as project-specific guidance on top of those conventions.
+> **Canonical course conventions live in [project_guidelines.md](../project_guidelines.md).** That file is the source of truth for the four required workflow files (`week1_explore.qmd`, `week2_operationalize_clean.qmd`, `week3_model.qmd`, `week4_storytelling.qmd`), the `data/model_data.rds` -> `data/model_results.rds` pipeline, the raw-data policy, quality-check requirements, decision logs, and contribution tracking. Read it before starting and treat anything below as project-specific guidance on top of those conventions.
projects/networks.md (1 addition, 1 deletion)
@@ -6,7 +6,7 @@
 - Programming language: `python` (suggested) or `R` (allowed)

-> **Canonical course conventions live in [`project_guidelines.md`](../project_guidelines.md).** That file is the source of truth for the four required workflow files (`week1_explore.qmd`, `week2_operationalize_clean.qmd`, `week3_model.qmd`, `week4_storytelling.qmd`), the `data/model_data.rds` -> `data/model_results.rds` pipeline, the raw-data policy, quality-check requirements, decision logs, and contribution tracking. Read it before starting and treat anything below as project-specific guidance on top of those conventions.
+> **Canonical course conventions live in [project_guidelines.md](../project_guidelines.md).** That file is the source of truth for the four required workflow files (`week1_explore.qmd`, `week2_operationalize_clean.qmd`, `week3_model.qmd`, `week4_storytelling.qmd`), the `data/model_data.rds` -> `data/model_results.rds` pipeline, the raw-data policy, quality-check requirements, decision logs, and contribution tracking. Read it before starting and treat anything below as project-specific guidance on top of those conventions.