[FEATURE] Published datapackage accessible via iRODS for anonymous users #467

Open
jelletreep opened this issue Sep 2, 2024 · 2 comments

Comments

@jelletreep
Member

Is your feature request related to a problem? Please describe.

I published a data package via Yoda: https://public.yoda.uu.nl/science/UU01/ABVQR4.html
Now I would like to update my machine learning software so that it can automatically download or read the files using, e.g., python-irodsclient or iBridges. Any user of the software should be able to access the data without needing an account, so that the public data can be read directly from within Python. I see broad applicability: publishing the code and data of a research project in such a way that the code runs out of the box and gathers its own input data is very elegant, I think, and promotes reproducibility.

Describe the solution you'd like

Being able to connect to a public Yoda data package (read/get) using iRODS (e.g. python-irodsclient) without needing a user account or a password.

Describe alternatives you've considered

I have considered the WebDAV option, but I think it is slower, less elegant and less reliable. Although not relevant to my project, it would also be good to be able to use iRODS metadata.

Additional context

@stsnel
Member

stsnel commented Oct 2, 2024

Thank you for your feedback.

As of Yoda 1.9.3, we have decided to restrict access to the anonymous account in Yoda so that it can no longer be used for the traditional purpose of anonymous access to public data via iRODS. The reason is that the anonymous user is treated as a regular user for most purposes in iRODS. Considering that it has access to just about any non-admin API endpoint, allowing everyone to use this account exposes a very large attack surface that can't really be secured effectively. We've decided that the security risk outweighs the legitimate use cases in this situation.

We can reconsider this decision for UU environments once iRODS has security restrictions on the anonymous account, such as an API call whitelist (see ticket irods/irods#4044).

For now, the alternative would be to add a read-only account for external researchers who need to work with public data via iRODS. Once implemented, zip downloads of public data (#388) could also provide an easier way to work with public data.

@jelletreep
Member Author

Thank you @stsnel for your clear response!

In the meantime I found rclone via WebDAV as another workaround for working with the data. It is not ideal, as it doesn't work with the iRODS data objects and needs additional installation and configuration, but it works.
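
For illustration, reading a file over WebDAV is a plain HTTP GET, so the same kind of workaround can be sketched with only the Python standard library. This is a rough sketch, not a confirmed recipe: `WEBDAV_BASE`, `build_url` and `download` are hypothetical names, and the base URL is a placeholder rather than an actual Yoda endpoint. It assumes the public WebDAV endpoint serves files without authentication.

```python
import shutil
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical WebDAV root; replace with the public endpoint of your Yoda instance.
WEBDAV_BASE = "https://webdav.example.org/public/"


def build_url(base: str, relative_path: str) -> str:
    """Join a base URL and a file path, percent-encoding each path segment."""
    segments = [
        urllib.parse.quote(segment, safe="")
        for segment in relative_path.split("/")
        if segment  # skip empty segments from leading or doubled slashes
    ]
    return base.rstrip("/") + "/" + "/".join(segments)


def download(base: str, relative_path: str, target_dir: Path) -> Path:
    """Fetch one file with an unauthenticated HTTP GET and store it locally."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(relative_path).name
    url = build_url(base, relative_path)
    with urllib.request.urlopen(url, timeout=30) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)  # stream to disk in chunks
    return target
```

A call such as `download(WEBDAV_BASE, "some/package/file.csv", Path("data"))` (with a made-up path) would then place the file under `data/`. Like rclone, this only sees files, not iRODS data objects or their metadata.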
