fix: exclude wiped dataclips from work order body search #4889
Open
mvanhorn wants to merge 2 commits into
When a dataclip is wiped (per-run wipe or the project data-retention job), its body/request are cleared and wiped_at is stamped, but its full-text search_vector is left intact, so wiped dataclips stayed searchable by the exact erased content via work order body search. Guard the body full-text match on is_nil(input_dataclip.wiped_at) so erased dataclip content is no longer discoverable, regardless of any stale search_vector. The guard targets the input_dataclip binding the body search already uses, so it does not affect other search fields. Fixes OpenFn#4824
2 tasks
The raw Repo.query! binding passed the dataclip id as a 36-char string to a $1::uuid parameter, which Postgrex rejects (expects a 16-byte binary). Wrap it in Ecto.UUID.dump!/1, matching the existing pattern in runs_test.exs.
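The mismatch described in this commit can be illustrated with a minimal sketch. It assumes Ecto can be fetched with Mix.install (the version constraint is an assumption, not taken from this PR) and only compares the text and binary forms of a UUID; the actual Repo.query! call in the test is not reproduced here.

```elixir
# Standalone script; the Ecto version constraint is an assumption.
Mix.install([{:ecto, "~> 3.10"}])

# Canonical text form: 32 hex digits plus 4 hyphens.
uuid_string = Ecto.UUID.generate()
36 = byte_size(uuid_string)

# Raw 16-byte form, which is what a $1::uuid parameter in a raw query expects.
uuid_binary = Ecto.UUID.dump!(uuid_string)
16 = byte_size(uuid_binary)

# Round trip back to the text form.
^uuid_string = Ecto.UUID.load!(uuid_binary)
```

Schema-based queries cast UUID fields automatically, which is why the explicit dump only matters for raw SQL bindings like this one.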
Author
|
Fixed the Heads-up on the |
Description
This PR fixes a data-retention hole where wiped dataclips remained searchable by their erased content.
When a dataclip is wiped (per-run wipe with save_dataclips: false, or the project data-retention job), its body/request are cleared to NULL and wiped_at is stamped, but its full-text search_vector column is left intact (the indexing trigger/worker only runs on insert, never on the wipe UPDATE). The work order body search matches that stale vector with no wiped_at guard, so a wiped dataclip stays searchable by the exact content that was meant to be erased.
The fix adds an is_nil(input_dataclip.wiped_at) guard to the :body branch of Invocation.build_search_fields_where/2, ANDed onto the full-text match. This targets the :input_dataclip binding that body search already uses, so erased dataclip content is no longer discoverable regardless of any stale search_vector, and other search fields (id / log / dataclip_name / status) are unaffected.
This mirrors the read-side guard already present in search_workorders_for_retry/2 via exclude_wiped_dataclips/1, but applied at the precise binding the body full-text predicate matches rather than the work order's own dataclip.
Closes #4824
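For readers without the diff open, the shape of the guard described above can be sketched as follows. This is not the code from the PR: the table names, join keys, select, tsquery function, and the BodySearchSketch module are all hypothetical placeholders, and only the :input_dataclip binding name and the is_nil(wiped_at) guard come from the description. The Ecto version constraint is also an assumption.

```elixir
# Standalone script; the Ecto version constraint is an assumption.
Mix.install([{:ecto, "~> 3.10"}])

defmodule BodySearchSketch do
  import Ecto.Query

  # Hypothetical base query. Table names, join keys, and the select are
  # placeholders; only the :input_dataclip binding name comes from the PR.
  def base_query do
    from(w in "work_orders",
      join: r in "runs",
      on: r.work_order_id == w.id,
      join: d in "dataclips",
      as: :input_dataclip,
      on: d.id == r.dataclip_id,
      select: w.id
    )
  end

  # Body filter: the wiped_at guard is ANDed onto the full-text match, so a
  # stale search_vector on a wiped dataclip can no longer produce a hit.
  # The tsquery function and language are assumptions.
  def body_filter(term) do
    dynamic(
      [input_dataclip: d],
      is_nil(d.wiped_at) and
        fragment("? @@ websearch_to_tsquery('english', ?)", d.search_vector, ^term)
    )
  end

  # Applies the body filter to the base query without executing it.
  def search(term), do: where(base_query(), ^body_filter(term))
end

query = BodySearchSketch.search("secret payload")
```

Because the guard lives inside the same dynamic expression as the full-text predicate, it only constrains the body branch; filters built for other search fields would be composed separately and are not touched.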
Validation steps
search_fields: ["body"]) returns it.
Lightning.Runs.wipe_dataclips/1, or the project data-retention job).
search_vector is still populated.
The two new tests in test/lightning/invocation_test.exs (describe "search_workorders/3") cover the regression and the non-wiped match case, including a positive control proving the body vector is populated before the negative assertion.
Additional notes for the reviewer
search_vector workers, or any migration.
search_vector in Query.wipe_dataclips/1 was dropped: search_vector is not a declared schema field (it is managed by raw-SQL workers), so a schema update_all referencing it would not be valid.
AI Usage
Please disclose whether you've used AI anywhere in this PR (it's cool, we just want to know!):
You can read more details in our Responsible AI Policy
Pre-submission checklist
/review with Claude Code)
(e.g., :owner, :admin, :editor, :viewer): n/a, this is a read-side search filter with no authorization surface.