add(load): EP approval record migration stream#541
| self.ep_approval_metadata["resource_type"], | ||
| ) | ||
| # Add the APPRN identifier to the public record | ||
| self._sync_public_apprn_identifier(public_recid, report_number) |
Is there a reason we do it after publishing the record?
I think I have the same question as @0einstein0 :) Why not apply the same order here as would happen in real life? Create restricted -> Create/Approve request -> Create public record?
@0einstein0 No, we can add this before publishing, thanks :)
@zzacharo We can apply the same order, but we can't use the create public record method from the RDM implementation, since we'll have different metadata.
I'll update the PR :)
|
|
||
| current_version_files[key] = deepcopy(file_data) | ||
|
|
||
| if not current_version_files: |
If a version has metadata changes but no new files, do we skip it?
It was already implemented like this: we create a version only when there is a file version change. Since I don't change the version creation (from files) during transform, and I split the files here, I added this check to avoid duplicated versions.
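The check described in this reply can be sketched roughly as follows. This is a minimal illustration and not the loader's actual code: `split_version_files` and its `wanted_keys` argument are hypothetical names, and the shape of `split` (a `versions` mapping whose values carry a `files` mapping) is assumed from the diff hunk above.

```python
from collections import OrderedDict
from copy import deepcopy


def split_version_files(split, wanted_keys):
    """Return per-version file maps, skipping versions left with no files.

    Hypothetical helper: `wanted_keys` stands in for whatever rule decides
    which files belong to this side of the split.
    """
    versions = []
    for _, version_data in split.get("versions", {}).items():
        current_version_files = OrderedDict()
        for key, file_data in version_data.get("files", {}).items():
            if key in wanted_keys:
                current_version_files[key] = deepcopy(file_data)
        if not current_version_files:
            # Nothing left for this version on this side of the split:
            # skip it so no empty (duplicated) version is created.
            continue
        versions.append(current_version_files)
    return versions
```

A version whose files all land on the other side of the split is dropped here, which is the duplicate-version case the reply mentions.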
| "mint_legacy_recid": True, | ||
| "save_original_dump": True, | ||
| "clc_sync": True, | ||
| "record_state": True, |
What does that mean? Whether or not to generate the record state?
It means we won't add the restricted record to record_state_logger. I can change the variable name if it's confusing.
|
|
||
| # TODO: What if there are multiple experiments? | ||
| experiments = record_json.get("custom_fields", {}).get( | ||
| "cern:experiments", [] |
Shall we raise here, so we can identify whether any such case exists?
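One way to act on this suggestion is sketched below. It assumes the lookup shown in the hunk above; the function name `get_single_experiment` is hypothetical, and `ValueError` is only a stand-in for whatever migration error class the project actually uses.

```python
def get_single_experiment(record_json):
    """Return the only experiment of a record, or None if there is none.

    Raises ValueError when more than one experiment is present, so such
    records surface during migration instead of being handled silently.
    """
    experiments = record_json.get("custom_fields", {}).get(
        "cern:experiments", []
    )
    if len(experiments) > 1:
        raise ValueError(
            f"Expected at most one experiment, found {len(experiments)}: "
            f"{experiments!r}"
        )
    return experiments[0] if experiments else None
```

Raising makes any multi-experiment record show up as a failed migration, which would answer the TODO with real data.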
| ) | ||
|
|
||
| # Load the EP approval request | ||
| self._load_ep_approval(restricted_state, public_state, legacy_recid=recid) |
| self._load_ep_approval(restricted_state, public_state, legacy_recid=recid) | |
| self._create_ep_approval(restricted_state, public_record_state, legacy_recid=recid) |
| for _, version_data in split.get("versions", {}).items(): | ||
| current_version_files = OrderedDict() | ||
|
|
||
| for key, file_data in version_data.get("files", {}).items(): |
Thinking... what if we just copy all files into the restricted record unconditionally and keep only the public ones for the public record? If an EPPHAPP_FILE_TYPE exists, we could use that restriction for all files of the restricted record; otherwise just restricted files, i.e. members of the community. That would also solve the record you found that didn't have any restricted file. wdyt @kpsherva?
| "8dfea666-5758-4614-bbc1-56209565c78a": { | ||
| "label": "EP approval", # shown in UI buttons/headings | ||
| "referee_group": "cds-ph-ep-publication", # CERN e-group slug | ||
| "report_number_pattern": "CERN-EP-{year}-{seq:03d}", |
This one has changed in the latest implementation on https://github.com/CERNDocumentServer/cds-rdm/tree/feature/ep-approval
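For reference, this is how the pattern shown in this hunk expands with Python's `str.format`. It only illustrates the value as it appears in this diff; the pattern on the feature branch may differ, and `format_report_number` is a hypothetical helper name.

```python
# Pattern as it appears in this diff; the feature branch may use another value.
REPORT_NUMBER_PATTERN = "CERN-EP-{year}-{seq:03d}"


def format_report_number(year, seq):
    """Fill the pattern; `seq` is zero-padded to at least three digits."""
    return REPORT_NUMBER_PATTERN.format(year=year, seq=seq)


print(format_report_number(2024, 7))     # CERN-EP-2024-007
print(format_report_number(2024, 1234))  # CERN-EP-2024-1234
```

The `03d` format spec pads short sequence numbers with zeros but does not truncate longer ones.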
needs https://github.com/CERNDocumentServer/cds-rdm/tree/feature/ep-approval
This PR covers the migration of EP approved records.
We'll have a new loader (CDSEPApprovalRecordServiceLoad) for records that went through EP approval. It splits one legacy record into two RDM records, creates the EP approval request with the original history (submitter, approver, dates, report number), and links everything together with related identifiers and an APPRN.

Metadata split
CERN-EP-YYYY-*): removed from both splits before load. After the request is created, the APPRN is minted on the restricted record and added to the public record metadata.
CERN-EP-*): removed from the public record before load.

Files split
EPPHAPP_FILE (restricted file for EP) files only.

Loader
The loader creates 2 records and links them via:
isversionof/isvariantformof)

How to run
Records with ep_approval cannot use the standard record stream; it will raise a migration error. They must be migrated with --ep-approval.

Result
For now this only handles records with an approved EP approval history (exactly one waiting and one approved entry).

Questions
waiting or rejected state, we would need a different split/load mechanism. Is this a case?
EPPHAPP_FILE https://cds.cern.ch/record/2864686/files/
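
The restriction stated under Result (exactly one waiting and one approved entry) could be checked with something like the sketch below. It assumes each history entry is a dict with a "status" key, which may not match the real legacy structure; `has_supported_history` is a hypothetical name.

```python
from collections import Counter


def has_supported_history(entries):
    """True if the history holds exactly one waiting and one approved entry.

    Assumes each entry is a dict with a "status" key; the real legacy
    structure may differ.
    """
    counts = Counter(entry.get("status") for entry in entries)
    return counts == Counter({"waiting": 1, "approved": 1})
```

Any extra entry (a second waiting, a rejected one, or one with no status) makes the check fail, so such records would be left for a different split/load mechanism.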