Skip to content

feat: implement extended nogood propagation and CPIP nogood learning#454

Open
ImkoMarijnissen wants to merge 61 commits into
mainfrom
feat/generalising-conflict-analysis
Open

feat: implement extended nogood propagation and CPIP nogood learning#454
ImkoMarijnissen wants to merge 61 commits into
mainfrom
feat/generalising-conflict-analysis

Conversation

@ImkoMarijnissen

@ImkoMarijnissen ImkoMarijnissen commented May 21, 2026

Copy link
Copy Markdown
Contributor

Based on the paper "From Literals to Atomic Constraints: Generalising Conflict-Driven Clause Learning for Constraint Programming - Imko Marijnissen, Maarten Flippo, and Emir Demirović" (to appear at CP'26), I have implemented the extended nogood propagation and CPIP nogood learning.

Overview

The PR consists of the following:

  • Adjusting the ResolutionResolver to be able to produce CPIP nogoods; this involves updating the stopping criterion for resolving and the structure LearnedNogood to ensure certain invariants.
  • Adjusting the NogoodPropagator in several ways:
  • When adding a(n) (asserting) nogood, the watchers need to be placed on atomic constraints over different variables, and the detection of propagation, which can take place at the root level, is adjusted.
  • When propagating, the watcher structure is different between the two approaches so this has been adjusted
  • I implemented extended nogood propagation based on the algorithm described in the paper.

This supersedes the branch feat/extended-conflict-analysis in several ways:

  • It now uses lazy explanations for the propagation
  • It also performs nogood database management; this is based on the original (SAT-based) strategy. This could be adjusted in the future since LBD could be uninformative.

Feedback

The main point I would like feedback on is whether it makes sense to keep the 1UIP + unit propagation and CPIP + extended nogood propagation approaches merged or whether it would be preferable to be separated into their own structs. I have kept it this way to be able to clearly see the differences between the two, but I can imagine that this does not make the code clearer.

Additionally, do we want to include testing this feature in the CI or is that unnecessary?

@maartenflippo I ran into the issue that retrieving from unit_nogood_inference_codes when using CPIP + extended nogood propagation led to some issues. I have resolved this in a way but it would be good to hear your opinion on this!

Experimentation

I tested the extended nogood propagation + CPIP learning using different priority schemes. The (Updated) instances are the ones where incremental stopping condition calculation is included.

Overall Results

1UIP + High Priority:

{'ERROR': 42,
 'OPTIMAL': 100,
 'SATISFIABLE': 153,
 'UNKNOWN': 93,
 'UNSATISFIABLE': 5}

CPIP with Extended Nogood Propagation + High Priority:

{'ERROR': 43,
 'OPTIMAL': 95,
 'SATISFIABLE': 156,
 'UNKNOWN': 94,
 'UNSATISFIABLE': 5}

CPIP with Extended Nogood Propagation + High Priority (Updated):

{'ERROR': 48,
 'OPTIMAL': 101,
 'SATISFIABLE': 146,
 'UNKNOWN': 93,
 'UNSATISFIABLE': 5}

1UIP + Very Low Priority:

{'ERROR': 43,
 'OPTIMAL': 101,
 'SATISFIABLE': 151,
 'UNKNOWN': 93,
 'UNSATISFIABLE': 5}

CPIP with Extended Nogood Propagation + Very Low Priority:

{'ERROR': 43,
 'OPTIMAL': 97,
 'SATISFIABLE': 154,
 'UNKNOWN': 94,
 'UNSATISFIABLE': 5}

CPIP with Extended Nogood Propagation + Very Low Priority (Updated):

{'ERROR': 32,
 'OPTIMAL': 99,
 'SATISFIABLE': 167,
 'UNKNOWN': 90,
 'UNSATISFIABLE': 5}

MiniZinc Scoring

{
    '1UIP + High Priority': 641.0,
    'CPIP + High Priority': 670.71,
    'CPIP + High Priority (Updated)': 653.61,
    '1UIP + Very Low Priority': 652.35,
    'CPIP + Very Low Priority': 698.09,
    'CPIP + Very Low Priority (Updated)': 708.24,
}

Average Primal Integral

{
    '1UIP + High Priority': 63.18,
    'CPIP + High Priority': 64.09,
    'CPIP + High Priority (Updated)': 69.13,
    '1UIP + Very Low Priority': 68.44,
    'CPIP + Very Low Priority': 63.37,
    'CPIP + Very Low Priority (Updated)': 46.45,
}

Overall Conclusions

In terms of the number of instances solved, 1UIP with a Very Low Priority appears to be best. However, looking at the MiniZinc scoring, both of the CPIP approaches outperform their 1UIP counterparts, with CPIP + Very Low priority performing the best. For the primal integral, the results are more mixed, with CPIP + Very Low Priority having the best anytime performance. This appears to indicate that CPIP has some better anytime performance compared to 1UIP learning (when using a very low priority of the nogood propagator). I think it would be better to keep the 1UIP as the default since I would expect 1UIP to outperform when using free search.

After Update: It appears that while 1UIP + Very Low Priority is generally the best at proving optimality, CPIP + Very Low Priority (Updated) proves optimality on a similar number of instances while providing solutions on 16 more. Additionally, the MiniZinc score of this approach is significantly higher, and the primal integral is significantly lower.

TODO

  • Rerun the experimentation to determine the impact of the changes
  • Calculate the stopping condition for learning CPIP nogoods incrementally rather than recalculating from scratch in every iteration

@maartenflippo maartenflippo left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started doing an in-depth review but got lost, so I am pausing here for a moment.

My overall impression of this PR so far is that it is incredibly hard to understand what is going on. Both in the propagator and the resolver. This is because two very different analysis and propagation routines are merged into one. I understand that there would be a lot of duplication, but this is almost impossible to understand in a code review.

The main point I would like feedback on is whether it makes sense to keep the 1UIP + unit propagation and CPIP + extended nogood propagation approaches merged or whether it would be preferable to be separated into their own structs. I have kept it this way to be able to clearly see the differences between the two, but I can imagine that this does not make the code clearer.

Depending on the results of the experiment, I think we may simply want to remove the one we do not use... However, if we want to keep everything, we definitely want to find a different way to structure it. Perhaps we can do a zoom meeting at some point to talk about how we can do this?

Additionally, do we want to include testing this feature in the CI or is that unnecessary?

Yes, we do. Now we are not running it so I cannot see whether this works or not. I expect it to fail at least on test cases where we run the solver + processor + checker, as the new propagation rule is not covered by our unit propagation inference rule in the proof.

@maartenflippo I ran into the issue that retrieving from unit_nogood_inference_codes when using CPIP + extended nogood propagation led to some issues. I have resolved this in a way but it would be good to hear your opinion on this!

I think I found the place you talk about here. I left a comment, there as well, but I think it is not right. Yet, running with proof logging should clarify a lot.

Comment on lines -91 to -102
#[derive(Debug, Clone, Copy, Default)]
/// Determines which type of learning is performed by the [`ResolutionResolver`].
pub enum AnalysisMode {
#[default]
/// Standard conflict analysis which returns as soon as the first unit implication point is
/// found (i.e. when a nogood is created which only contains a single predicate from the
/// current decision level)
OneUIP,
/// An alternative to 1-UIP which stops as soon as the learned nogood only creates decision
/// predicates.
AllDecision,
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This I don't want to remove. The ConflictResolverType may have a bunch of other variants that have nothing to do with the resolution resolver (e.g. no-learning or the hypercube linear resolver).

@@ -179,8 +177,87 @@ impl ResolutionResolver {
// level, and both will be decisions. This is accounted for below.
while {
match self.mode {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... can we not? 😅 Please, extract this into its own function or something...

On a serious note: Keeping the AnalysisMode here would be a good idea. We can give it an associated function keep_resolving(...) -> bool.

| ConflictResolverType::BoundsExtendedCPIP => dec_level == context.get_checkpoint(),
ConflictResolverType::AllDecision => !context.is_decision_predicate(predicate),
ConflictResolverType::NoLearning => unreachable!(),
} {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be a function on AnalysisMode

#[allow(unused, reason = "Will be reintroduced with database management")]
handle: PropagatorHandle<NogoodPropagator>,

analysis_mode: ConflictResolverType,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this should be a separate type here. Make it specific to what the variants are that this propagator supports, and remove any unreachable branches.

Comment on lines +180 to +186
impl Eq for Watcher {}

impl PartialEq for Watcher {
fn eq(&self, other: &Self) -> bool {
self.nogood_id.eq(&other.nogood_id)
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid this could shoot us in the foot later on. In case we want to write e.g. tests around this, the equality behavior is not what we expect. Instead perhaps we can introduce a method on the Wathcer that tests for this type of equality instead?

code: u64,
mut context: ExplanationContext,
) -> LazyExplanation<'_> {
if (code >> 32) > 0 {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is going on here? Can we make use of a bitstruct (like e.g. the element propagator) if we are packing fields into a u64?

Comment on lines +804 to +832
for predicate in nogood.iter().filter_map(|&predicate_id| {
// First, we filter out all of the predicates which are currently satisfied and
// which are not concerning the propagated domain id
let predicate = context.get_predicate(predicate_id);

is_falsified |= context.is_predicate_id_falsified(predicate_id);

pumpkin_assert_moderate!(
predicate.get_domain() == propagated_domain
|| context.is_predicate_id_satisfied(predicate_id),
"Expected {predicate} to be unassigned with {propagated_domain:?}; nogood: {:?}",
nogood
.iter()
.map(|predicate_id| {
let predicate = context.get_predicate(*predicate_id);

(
predicate,
context.lower_bound(&predicate.get_domain()),
context.upper_bound(&predicate.get_domain()),
)
})
.collect::<Vec<_>>()
);

(!context.is_predicate_id_satisfied(predicate_id)
&& predicate.get_domain() == propagated_domain)
.then_some(predicate)
}) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please pull this out into a named binding

Comment on lines +361 to +390
let unit_ic = unit_nogood_inference_codes.get(&predicate).or_else(|| {
// It could be the case that we attempt to get the reason for the predicate
// [x >= v] but that the corresponding unit nogood idea is the one for the
// predicate [x == v]
let domain_id = predicate.get_domain();
let right_hand_side = predicate.get_right_hand_side();

unit_nogood_inference_codes.get(&predicate!(domain_id == right_hand_side))
});

if let Some(unit_ic) = unit_ic {
let _ = proof_log.log_inference(
&mut state.constraint_tags,
unit_ic.clone(),
[],
Some(predicate),
&state.variable_names,
&state.assignments,
);
} else {
// Otherwise we log the inference which was used to derive the nogood
let _ = proof_log.log_inference(
&mut state.constraint_tags,
ic.clone(),
reason_buffer.as_ref().iter().copied(),
Some(predicate),
&state.variable_names,
&state.assignments,
);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt whether this is correct, but running with proof checking on should make that obvious.

@ImkoMarijnissen ImkoMarijnissen May 26, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, running the test suite with the extended nogood propagation and CPIP passes all test cases after this change; any idea what could be going wrong? Should I try to create the proof and process it manually?

@ImkoMarijnissen

Copy link
Copy Markdown
Contributor Author

After discussion with @maartenflippo:

  • Remove ExtendedOneUIP
  • Separate propagation and learning enum (keep original enum in solver flags)

@ImkoMarijnissen ImkoMarijnissen marked this pull request as ready for review May 28, 2026 06:22
@EmirDe

EmirDe commented May 28, 2026

Copy link
Copy Markdown
Contributor

What is the status of this, is it ready for review/finalised?

@ImkoMarijnissen

Copy link
Copy Markdown
Contributor Author

What is the status of this, is it ready for review/finalised?

@EmirDe It should indeed be ready to be reviewed. I need to discuss with Maarten how this interacts with proof logging, but the functionality should be there.

@EmirDe

EmirDe commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

I had a look. This is quite a complicated PR, so I think we need to discuss how to approach it.

The nogood propagator is heavily parameterised (PropagationMode x AnalysisMode) and we are handling all these cases in one file, so looking at the code can be a bit difficult. CPIP has additional requirements that standard unit propagation does not have, so it makes things complicated for both.

Another point is that in my experience, the nogood propagator is quite delicate, in the sense that anything we add to it can impact performance. Say if we run 1UIP with this code, I would expect that it may be slower than the previous version.

I am not sure what the best way forward is, but we can discuss. My current impression is that it may be too much to pack everything into one propagator (but maybe it is the best, we can see). On the other hand, some functionality is the same amongst the nogood propagators, so it makes sense to reuse.

I am wondering if we can separate the code by use case, but handle things at compile time to avoid performance loss and make it look nicer.

Overall, let us discuss in person!

Some other comments (a bit messy, sorry!):

Comments in resolution_resolver.rs need to be updated throughout the file -> part of it seems to refer to only the "old" version (without CPIP), so it can be a bit confusing (it simply misses CPIP).

We sometimes use "1UIP" and sometimes "OneUIP", could standardise.

We refer to the unique implication point, but for CPIPs, I think we are doing "first propagation point".

(perhaps unrelated, but very surprising) I noticed we renamed "decision_level" as "checkpoint". Why? The trail has a "checkpoint" and predicates have "checkpoints". It seems odd given that we have not used that terminology in our papers.

to discuss: to_process_heap -> totally not clear in the "should_continue_resolving"

There is a todo:
// TODO: compute this incrementally -> how important is this? Looks intense. A new data structure that is like a map with a counter would work.

// We wait until there are only elements over a single variable left. -> what is element, predicate?

// Firstly, there should be only elements over a single element. -> what is element, predicate?

todo discuss (I do not fully follow what is going on) -> AnalysisMode::BoundsCPIP => {
todo discuss (I do not fully follow the comments) -> WatcherProcessingStatus

let inference_code = &self.inference_codes[self.nogood_predicates.get_nogood_index(&watcher.nogood_id)]; -> why is this being fetched at this point, whereas it might not even be used, and in the code, its usage appears 150 lines later?

We need to state the invariants for extended nogoods, that we are maintaining, in the code.

// If there is a falsified predicate over the same variable as the 0th predicate, then we need to replace it. -> do we need to? Could we simply take a new domain id?

// Similarly, we need to keep the invariant for lazy unit propagation explanations that the propagated predicate is at the 0th position. -> for CPIP, no, right, since multiple predicates can be propagated?

Above we use "variable", later we use "domain" in "process_potential_watcher". Probably should use just domain id.

@EmirDe

EmirDe commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

I have done a new pass! It would be best to discuss in person (will reach out separately for scheduling). Until then, a few high-level comments.

[1]

I now understand what my problem was. Even though I understand the methods individually, when presented with code that uses enums to switch between variants, it becomes hard to parse, since you need to consider all variants at once. This is both for the propagator and conflict analysis. And AnalysisMode is an enum will quite some elaborate methods.

It is true that many of the variants share the same code so that also makes sense to make it this way. But I am still wondering if we can do something to make this clearer. This is best in person to discuss also with Maarten.

Concretely, I am curious if we can define traits, and then have every variant implement their version of the functions for each trait. This way we can keep the high-level structure of the analysis and propagation procedure, and have a clearer way to fill in the differences. This would effectively remove the match cases for the difference mode (AnalysisMode and PropagationMode). Some code might still be duplicated, but not much, and maybe there are also solutions there.

[2]

The other issue is that when we are finding new watchers, for CPIP, we seem to be doing some replacements on the fly. However, some replacements may be redundant. I think we need to do the linear scan, and only do one replacement at the very end. For example, WatcherProcessingStatus::FoundNewWatchButContinue keeps replacing, but if we end up finding a predicate with a different variable, then we would use that predicate and stop with finding other watchers, and at this point, any changes done by WatcherProcessingStatus::FoundNewWatchButContinue will be irrelevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants