feat: implement AI-based matching engine with scoring and inference c… #23

Open
dientruonggg wants to merge 2 commits into dev from feat/DS22

Conversation

dientruonggg (Collaborator) commented May 20, 2026

This pull request introduces a new AI-powered job matching feature to the backend, leveraging vector search (pgvector), dynamic scoring, and Google Gemini AI for job-profile fit explanations. The implementation includes new API endpoints, database migrations, AI logic, and comprehensive tests. It also adds new dependencies to support AI and validation workflows.

AI Matching Feature Implementation

  • API & Routing

    • Added a new REST endpoint GET /api/ai/match/:profileId that returns a list of jobs matched to a user profile, using the new AI matching logic. (ai.controller.js, ai.resolver.js, index.js) [1] [2] [3]
  • AI Matching Logic

    • Implemented matchJobsForProfile agent: fetches the user profile and CV, performs vector similarity search on jobs using pgvector, dynamically infers scoring weights for each job using Google Gemini AI, calculates a hybrid score, and returns the top matches with AI-generated explanations. (match.agent.js)
    • Added weight inference logic using Google Gemini AI, with fallback to defaults and caching. Weights are validated with Zod and persisted to the database. (match.inference.js)
    • Implemented scoring functions for skills, experience, accessibility, geography, and hybrid score aggregation. (match.scoring.js)
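
The hybrid score aggregation listed above can be pictured as a weighted average of the component scores. The following is a minimal sketch only: the function names `hybridScore` and `topMatches`, the component keys, and the renormalization by the weight total are assumptions made for illustration, not the contents of match.scoring.js.

```javascript
// Hypothetical component keys; the PR names the components but not their field names.
const COMPONENTS = ['skills', 'experience', 'accessibility', 'geography'];

// Weighted average of component scores. Dividing by the weight total keeps
// the result in [0, 1] even if the inferred weights do not sum to exactly 1.
const hybridScore = (scores, weights) => {
  let total = 0;
  let weightSum = 0;
  for (const key of COMPONENTS) {
    const w = Number(weights[key]) || 0;
    const s = Math.min(Math.max(Number(scores[key]) || 0, 0), 1); // clamp to [0, 1]
    total += w * s;
    weightSum += w;
  }
  return weightSum > 0 ? total / weightSum : 0;
};

// Rank jobs by descending hybrid score and keep the top `limit` entries.
const topMatches = (scoredJobs, limit = 10) =>
  [...scoredJobs].sort((a, b) => b.score - a.score).slice(0, limit);
```

Missing components or weights count as 0 here, so a job with no inferred weights scores 0 rather than throwing; whether the real implementation prefers default weights in that case is not stated in the PR.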

Database & Dependencies

  • Database Migration

    • Added a migration that enables the vector extension and adds embedding and weights_json columns to jobs and a narrative_embedding column to profiles, supporting vector search and AI metadata. (20260424130019_add_pgvector_and_ai_fields.js)
  • Dependencies

    • Added @google/genai for Gemini API, zod for schema validation, jest and babel-jest for testing. (package.json) [1] [2] [3]
    • Updated .gitignore to exclude yarn.lock.

Testing

  • Comprehensive Unit Tests
    • Added tests for the agent orchestrator, inference logic, and scoring logic, ensuring correct error handling, weight inference, and scoring calculations. (match.agent.test.js, match.inference.test.js, match.scoring.test.js) [1] [2] [3]

References:
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14]

…apabilities, and perform comprehensive codebase refactoring and dependency updates.
dientruonggg requested a review from Copilot May 20, 2026 02:39

vercel Bot commented May 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| d-shiftify | Ready | Preview, Comment | May 20, 2026 3:05am |

dientruonggg marked this pull request as ready for review May 20, 2026 02:43

Copilot AI left a comment

Pull request overview

Adds an AI-assisted job matching capability to the backend, combining vector similarity (pgvector) with rule-based scoring and optional Gemini-based inference/explanations, plus an HTTP endpoint to fetch matches for a profile.

Changes:

  • Introduces matching engine components (scoring, weight inference, orchestration agent) and Jest tests for them.
  • Adds DB migration for pgvector and new embedding/weights fields.
  • Exposes a new /ai/match/:profileId API route and adds new dependencies (Gemini SDK + Zod) and test tooling.

Reviewed changes

Copilot reviewed 12 out of 16 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| package-lock.json | Adds a root lockfile (currently minimal/empty). |
| backend/src/core/modules/ai/agents/match/match.scoring.js | Implements geo/accessibility/skill/experience scoring and a hybrid score combiner. |
| backend/src/core/modules/ai/agents/match/match.inference.js | Infers and caches per-job scoring weights via Gemini; persists weights to DB. |
| backend/src/core/modules/ai/agents/match/match.agent.js | Orchestrates profile/job retrieval, vector search, scoring, and explanation generation. |
| backend/src/core/modules/ai/agents/match/tests/match.scoring.test.js | Unit tests for scoring functions. |
| backend/src/core/modules/ai/agents/match/tests/match.inference.test.js | Unit tests for weight inference behavior and fallbacks. |
| backend/src/core/modules/ai/agents/match/tests/match.agent.test.js | Unit tests for the orchestrator/agent behavior with mocks. |
| backend/src/core/database/migrations/20260424130019_add_pgvector_and_ai_fields.js | Adds pgvector extension + embedding/weights columns. |
| backend/src/core/api/index.js | Registers the new AI module in the API module resolver list. |
| backend/src/core/api/ai/ai.resolver.js | Adds an AI route resolver (currently not aligned with the repo’s Module system). |
| backend/src/core/api/ai/ai.controller.js | Adds controller to serve matching results. |
| backend/package.json | Adds Jest test script and new dependencies (@google/genai, zod, jest, babel-jest). |
| backend/.gitignore | Ignores yarn.lock (and normalizes Dockerfile entry). |
| .gitignore | Adds root ignores including node_modules and backend/yarn.lock. |

Comments suppressed due to low confidence (1)

backend/src/core/database/migrations/20260424130019_add_pgvector_and_ai_fields.js:21

  • exports.down chains .raw('DROP EXTENSION...') off a schema builder, which is not a supported Knex pattern in this codebase (and likely won’t execute). Use sequential await knex.schema.alterTable(...) calls, then await knex.raw('DROP EXTENSION IF EXISTS vector;').
exports.down = function(knex) {
  return knex.schema
    .alterTable('profiles', table => {
      table.dropColumn('narrative_embedding');
    })
    .alterTable('jobs', table => {
      table.dropColumn('embedding');
      table.dropColumn('weights_json');
    })
    .raw('DROP EXTENSION IF EXISTS vector;');

Comment on lines +1 to +21
exports.up = function(knex) {
  return knex.schema.raw('CREATE EXTENSION IF NOT EXISTS vector;')
    .alterTable('jobs', table => {
      table.specificType('embedding', 'vector(768)');
      table.jsonb('weights_json');
    })
    .alterTable('profiles', table => {
      table.specificType('narrative_embedding', 'vector(768)');
    });
};

exports.down = function(knex) {
  return knex.schema
    .alterTable('profiles', table => {
      table.dropColumn('narrative_embedding');
    })
    .alterTable('jobs', table => {
      table.dropColumn('embedding');
      table.dropColumn('weights_json');
    })
    .raw('DROP EXTENSION IF EXISTS vector;');

Fixed in 7d6696a. I split the migration into sequential awaited steps (knex.raw + separate knex.schema.alterTable calls) in both up and down.
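
For readers without access to the commit, the sequential form described in this reply might look roughly like the sketch below. It is a reconstruction from the description and the quoted diff, not the actual contents of 7d6696a; the functions are written as plain `up` and `down` constants, whereas a Knex migration file would assign them to `exports.up` and `exports.down`.

```javascript
// Sketch of the migration with each step awaited in order.
const up = async (knex) => {
  // The extension must exist before any vector(768) column is created.
  await knex.raw('CREATE EXTENSION IF NOT EXISTS vector;');
  await knex.schema.alterTable('jobs', (table) => {
    table.specificType('embedding', 'vector(768)');
    table.jsonb('weights_json');
  });
  await knex.schema.alterTable('profiles', (table) => {
    table.specificType('narrative_embedding', 'vector(768)');
  });
};

const down = async (knex) => {
  // Drop the dependent columns first, then the extension.
  await knex.schema.alterTable('profiles', (table) => {
    table.dropColumn('narrative_embedding');
  });
  await knex.schema.alterTable('jobs', (table) => {
    table.dropColumn('embedding');
    table.dropColumn('weights_json');
  });
  await knex.raw('DROP EXTENSION IF EXISTS vector;');
};
```

Awaiting each call makes the ordering explicit: the extension exists before the vector columns are added, and the columns are gone before the extension is dropped.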

Comment thread backend/src/core/api/ai/ai.resolver.js Outdated
Comment on lines +1 to +31
import { Router } from 'express';
import { AIController } from './ai.controller.js';

export const AIResolver = {
  /**
   * @param {Router} router
   */
  resolve: (router) => {
    const aiRouter = Router();

    /**
     * @swagger
     * /ai/match/{profileId}:
     *   get:
     *     tags: [AI]
     *     summary: Get the list of jobs matching the profile
     *     parameters:
     *       - in: path
     *         name: profileId
     *         required: true
     *         schema:
     *           type: string
     *     responses:
     *       200:
     *         description: Success
     */
    aiRouter.get('/match/:profileId', AIController.matchForProfile);

    router.use('/ai', aiRouter);
  }
};

Fixed in 7d6696a. AIResolver now uses the Module.builder().addPrefix(...).register([...]) pattern instead of a plain object resolver.

Comment on lines +3 to +18
export const AIController = {
  /**
   * GET /api/ai/match/:profileId
   */
  matchForProfile: async (req, res, next) => {
    try {
      const { profileId } = req.params;
      const matches = await matchJobsForProfile(profileId);

      return res.status(200).json({
        success: true,
        data: matches,
        message: 'Tìm việc làm phù hợp thành công.' // "Matching jobs found successfully."
      });
    } catch (error) {
      next(error);

Fixed in 7d6696a. AIController.matchForProfile now follows project conventions (accepts req only and returns ValidHttpResponse.toOkResponse(...)).

const deg2rad = (deg) => deg * (Math.PI / 180);

export const geoScore = (lat1, lon1, lat2, lon2) => {
  if (!lat1 || !lon1 || !lat2 || !lon2) return 0;

Fixed in 7d6696a. geoScore now uses explicit null/undefined checks and numeric validation, so 0 coordinates are treated as valid values.
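
The quoted guard `!lat1 || !lon1 || ...` rejects any coordinate equal to 0 (the equator or the prime meridian), because 0 is falsy in JavaScript. A hedged sketch of the kind of check this reply describes follows. The helper names `isValidCoord` and `haversineKm`, the `maxKm` parameter, and the linear distance decay are assumptions for illustration; the PR does not show how distance is mapped to a score.

```javascript
const EARTH_RADIUS_KM = 6371;
const deg2rad = (deg) => deg * (Math.PI / 180);

// A coordinate is usable if it is present and converts to a finite number; 0 is valid.
const isValidCoord = (v) =>
  v !== null && v !== undefined && v !== '' && Number.isFinite(Number(v));

// Great-circle distance between two points using the haversine formula.
const haversineKm = (lat1, lon1, lat2, lon2) => {
  const dLat = deg2rad(lat2 - lat1);
  const dLon = deg2rad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(deg2rad(lat1)) * Math.cos(deg2rad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
};

// maxKm and the linear decay are assumptions made for this sketch.
const geoScore = (lat1, lon1, lat2, lon2, maxKm = 50) => {
  if (![lat1, lon1, lat2, lon2].every(isValidCoord)) return 0;
  const km = haversineKm(Number(lat1), Number(lon1), Number(lat2), Number(lon2));
  return Math.max(0, 1 - km / maxKm);
};
```

With this guard, two points at latitude 0 and longitude 0 are compared normally instead of being treated as missing data.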

Comment on lines +26 to +33
export const inferWeights = async (job) => {
  if (weightCache.has(job.id)) return weightCache.get(job.id);

  if (job.weights_json) {
    const w = typeof job.weights_json === 'string' ? JSON.parse(job.weights_json) : job.weights_json;
    weightCache.set(job.id, w);
    return w;
  }

Fixed in 7d6696a. weights_json parsing is now guarded, validated with Zod, normalized, and safely falls back to defaults when invalid.
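
The guarded flow described in this reply (parse, validate, normalize, fall back) can be sketched without any dependency. Note the substitution: the PR validates with Zod, whereas this sketch uses a hand-written check so that it runs standalone. The weight keys, the default values, and the name `parseWeights` are hypothetical.

```javascript
// Hypothetical keys and defaults; the PR does not list the actual values.
const DEFAULT_WEIGHTS = { skills: 0.25, experience: 0.25, accessibility: 0.25, geography: 0.25 };
const KEYS = Object.keys(DEFAULT_WEIGHTS);

const parseWeights = (raw) => {
  let value = raw;
  if (typeof raw === 'string') {
    try {
      value = JSON.parse(raw);
    } catch {
      return { ...DEFAULT_WEIGHTS }; // malformed JSON falls back to defaults
    }
  }
  if (value === null || typeof value !== 'object' || Array.isArray(value)) {
    return { ...DEFAULT_WEIGHTS };
  }
  // Every key must be a finite, non-negative number.
  const valid = KEYS.every((k) => Number.isFinite(value[k]) && value[k] >= 0);
  const sum = valid ? KEYS.reduce((acc, k) => acc + value[k], 0) : 0;
  if (!valid || sum <= 0) return { ...DEFAULT_WEIGHTS };
  // Normalize so the weights sum to 1.
  return Object.fromEntries(KEYS.map((k) => [k, value[k] / sum]));
};
```

Returning a fresh copy of the defaults on every failure path means a caller that caches or mutates the result cannot corrupt the shared default object.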

Comment thread backend/.gitignore Outdated
Comment on lines +55 to +57
Dockerfile

yarn.lock No newline at end of file

Fixed in 7d6696a. I removed yarn.lock from .gitignore so the existing committed lockfile remains tracked.

Comment on lines +49 to +53
const profile = {
  ...profileRecord,
  skills: cvRecord.skills || [],
  experiences: cvRecord.experiences || []
};

Fixed in 7d6696a. I now normalize/load additional profile data from CV JSON (conditions) including accessibility_needs and optional coordinates before scoring.

Comment on lines +21 to +33
export const atMatchScore = (needs = [], job) => {
  let s = 0;
  const env = (job.work_environment || '').toLowerCase();

  if (needs.includes('screen_reader')) {
    if (['nvda', 'jaws', 'screen_reader'].some((k) => env.includes(k))) s += 0.5;
    else if (job.accessibility_level === 'AAA') s += 0.3;
    else if (job.accessibility_level === 'AA') s += 0.15;
  }
  if (needs.includes('voice_control') && job.is_remote) s += 0.3;
  if (job.accessibility_level === 'AAA') s += 0.2;
  else if (job.accessibility_level === 'AA') s += 0.1;

Fixed in 7d6696a. atMatchScore now maps accessibility scoring to existing schema data (work_mode) while preserving existing compatibility checks.

Comment on lines +39 to +43
export const skillScore = (candidate_skills = [], job_skills = [], semantic_score = 0) => {
  if (!job_skills || !job_skills.length) return semantic_score;
  const jLower = job_skills.map((s) => s.toLowerCase());
  const exact = candidate_skills.filter((s) => jLower.includes(s.toLowerCase())).length;
  return Math.min(exact / job_skills.length + semantic_score * 0.3, 1);

Fixed in 7d6696a. skillScore now normalizes both string and object skill shapes (e.g. { name: ... }) before matching.
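
The quoted version calls `s.toLowerCase()` on every entry, which throws if a skill is stored as an object rather than a string. One way the normalization described in this reply could look is sketched below. The helper `skillName` is hypothetical, and only the `{ name: ... }` shape mentioned in the reply is handled; any other shape is ignored.

```javascript
// Accepts 'React' or { name: 'React' }; anything else becomes an empty string.
const skillName = (skill) => {
  if (typeof skill === 'string') return skill.trim().toLowerCase();
  if (skill && typeof skill.name === 'string') return skill.name.trim().toLowerCase();
  return '';
};

const skillScore = (candidateSkills = [], jobSkills = [], semanticScore = 0) => {
  const required = (jobSkills || []).map(skillName).filter(Boolean);
  // With no usable job skills, fall back to the semantic score as in the quoted code.
  if (!required.length) return semanticScore;
  const owned = new Set((candidateSkills || []).map(skillName).filter(Boolean));
  const exact = required.filter((name) => owned.has(name)).length;
  return Math.min(exact / required.length + semanticScore * 0.3, 1);
};
```

One deliberate difference from the quoted code: this sketch counts how many required skills the candidate covers, rather than how many candidate entries appear in the job list, so duplicate candidate entries cannot inflate the ratio.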

Comment on lines +47 to +53
const cvRecord = await knex('cvs').where({ profile_id: profileId }).first() || {};

const profile = {
  ...profileRecord,
  skills: cvRecord.skills || [],
  experiences: cvRecord.experiences || []
};

Fixed in 7d6696a. In match.agent I now parse/normalize JSONB-backed CV/job fields (skills, experiences, conditions) before scoring and explanation generation.
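
JSONB columns can reach application code either already parsed or as raw JSON strings, depending on the driver and how the row was written, so `cvRecord.skills || []` may hand a string to code that expects an array. A rough sketch of the normalization this reply describes follows. The helpers `toArray`, `toObject`, and `buildProfile` are hypothetical names, and the assumption that `accessibility_needs` lives under `conditions` is taken from the earlier reply in this thread.

```javascript
// Returns an array whether the column arrives parsed or as a JSON string.
const toArray = (value) => {
  if (Array.isArray(value)) return value;
  if (typeof value === 'string') {
    try {
      const parsed = JSON.parse(value);
      return Array.isArray(parsed) ? parsed : [];
    } catch {
      return [];
    }
  }
  return [];
};

// Same idea for plain objects such as the `conditions` field.
const toObject = (value) => {
  let v = value;
  if (typeof v === 'string') {
    try {
      v = JSON.parse(v);
    } catch {
      return {};
    }
  }
  return v !== null && typeof v === 'object' && !Array.isArray(v) ? v : {};
};

const buildProfile = (profileRecord, cvRecord) => {
  const cv = cvRecord || {};
  const conditions = toObject(cv.conditions);
  return {
    ...profileRecord,
    skills: toArray(cv.skills),
    experiences: toArray(cv.experiences),
    accessibility_needs: toArray(conditions.accessibility_needs),
  };
};
```

Malformed or missing fields collapse to empty arrays, so the scoring functions downstream can assume array inputs.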

Agent-Logs-Url: https://github.com/dscdut/D-Shiftify/sessions/9dbcbe07-c588-4bf2-bad3-afc62c3e0db1

Co-authored-by: dientruonggg <109971402+dientruonggg@users.noreply.github.com>