Notus is a collection of LLMs fine-tuned with SFT, DPO, SFT+DPO, and/or other RLHF techniques, always following a data-first approach
Updated Jan 15, 2024 - Python
Generate conversational, tool-calling, structured-output, and preference datasets — easily and at scale
MCP server for human-in-the-loop surveys, A/B preference tests, ratings, and rankings. Get real human feedback inside Claude Code, Claude Desktop, Cursor, Windsurf, and any MCP client — powered by Datapoint AI.
Curated tools, papers, datasets, and practices for LLM training data engineering.
Pairwise rating CLI for AI responses — per-axis scoring (helpfulness/harmlessness/accuracy/instruction-following), JSONL in/out, inter-rater Cohen's kappa
A forkable example of the human-in-the-loop model-improvement loop: AI generates, humans judge via the Terac MCP, you improve the model. Built as an SVG illustration arena.
This repository contains all artifacts produced during my bachelor's thesis on data modeling for collective decision-making.
RLHF preference data curation pipeline: HH-RLHF + UltraFeedback + OASST1 → quality filter → MinHash dedup → DPO-ready JSONL
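
The curation flow named in the last entry (quality filter, then MinHash dedup, then DPO-ready JSONL) can be illustrated with a minimal sketch. This is not code from any of the listed repositories. The `prompt`/`chosen`/`rejected` field names are a commonly used layout for DPO data and are assumed here, and the helper names and the `min_chars` and `dup_threshold` values are hypothetical choices for illustration only.

```python
import hashlib
import json


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(text, num_perm=64):
    """For each of num_perm salted hashes, keep the minimum value over all shingles."""
    items = [s.encode("utf-8") for s in shingles(text)]
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "big")  # a different salt acts as a different hash function
        signature.append(min(
            int.from_bytes(hashlib.blake2b(item, digest_size=8, salt=salt).digest(), "big")
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def curate(records, min_chars=20, dup_threshold=0.8):
    """Quality-filter and near-deduplicate preference records (hypothetical thresholds)."""
    kept, signatures = [], []
    for rec in records:
        # Quality filter: drop very short or non-contrastive pairs.
        if len(rec["chosen"]) < min_chars or rec["chosen"] == rec["rejected"]:
            continue
        sig = minhash_signature(rec["prompt"] + " " + rec["chosen"])
        # Dedup: skip records whose signature is too close to one already kept.
        if any(estimated_jaccard(sig, s) >= dup_threshold for s in signatures):
            continue
        signatures.append(sig)
        kept.append({k: rec[k] for k in ("prompt", "chosen", "rejected")})
    return kept


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

The pairwise comparison against every kept signature is quadratic in the number of records. A pipeline at real scale would more likely bucket signatures with locality-sensitive hashing, which this sketch leaves out for brevity.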