-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete Files From Dataset #11230
Delete Files From Dataset #11230
Conversation
asadmin set configs.config.default-config.network-config.protocols.protocol.http-listener-1.http.allow-payload-for-undefined-http-methods=true
if ((pid == null) && (dvObject instanceof DataFile df)) { | ||
pid = df.getOwner().getGlobalId(); | ||
} | ||
pidProvider = PidUtil.getPidProvider(pid.getProviderId()); | ||
XMLStreamWriter xmlw = XMLOutputFactory.newInstance().createXMLStreamWriter(outputStream); | ||
xmlw.writeStartElement("resource"); | ||
boolean deaccessioned=false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is already in a PR that is in QA. Once it's merged please remove this and merge from develop
fileIdList.add(((JsonNumber) value).longValue()); | ||
} | ||
// Find the files to be deleted | ||
List<FileMetadata> filesToDelete = dataset.getOrCreateEditVersion().getFileMetadatas().stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qqmyers : maybe require all the file ids listed in the request to be in the dataset, instead of just ignoring the ones not found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They aren't just ignored. Line 5400 below will return a 400 if all the files aren't in the dataset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but still, it could fail faster and return a list of missing IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could it fail faster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I propose is compiling the list of file IDs in the request that are not in the dataset and returning the 400 status at line 5395 with in the body a message like "Files not found in dataset: [..ids..]". If I understand the code correctly, the failure will currently be thrown out of the commandEngine.submit()
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is currently thrown out in the second of the next two if clauses. The first checks to see if there were no files that match, and the second checks to see if there are fewer files to delete than were in the original list. (Basically unless all files are in the latest dataset version, you get a 400 error, but you'll get different messages depending on whether there are none or some.)
In general, I'm hesitant to do extra work to help an API caller correct their input - any script using the call should be getting the right info to send, and the list of file ids in the dataset is available through other calls. That said, if there's a use case where it is hard for a script to figure out what to send, we could return more details. Perhaps even send the ones that were found instead since that's really the useful list?
I see continuous integration keeps failing for this PR. |
Testing Passed in internal - no issues found. Merging PR |
What this PR does / why we need it:
This PR adds a deleteFiles method to the native API for datasets, using PUT as the HTTP verb to match the existing delete metadata calls.
Which issue(s) this PR closes:
Special notes for your reviewer:
Note for posterity: To be able to use DELETE with a payload, Payara has to have an option set:
The consensus in slack was to not use this approach, but if we decide to in the future, we'd need to include instructions to make this change in Payara.
Suggestions on how to test this:
There's an IT test that creates a dataset with files, try to delete some, publishes the dataset and then tries to delete others. It also tests using a bad fileId or datasetId or using a user w/o permissions. Those or other cases could be manually tested as well. The documentation (in the native api section) gives a curl example.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?: included.
Additional documentation: