Improve local dataset loading #289

Merged
merged 2 commits into main from datasets/local-data-file-caption-file-format
Mar 4, 2025

a-r-r-o-w
Owner

Fixes #286.

cc @mkturkcan Could you give this branch a try with your local dataset format? I haven't tested a full run, but it worked on a dummy run.

@mkturkcan

find_files seems to need a new remote argument:

  File "finetrainers/finetrainers/data/dataset.py", line 793, in _has_data_caption_file_pairs
    caption_files = utils.find_files(root.as_posix(), "*.txt", depth=0, remote=remote)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: find_files() got an unexpected keyword argument 'remote'
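
For readers following along: the traceback above is the standard Python error raised when a call site passes a keyword that the function definition does not yet accept. The sketch below reproduces that mismatch and shows one way a file finder could take a `remote` flag. Both `find_files_old` and `find_files` here are hypothetical stand-ins written for illustration, not the finetrainers implementation; the depth semantics (0 means only the root directory) and the choice to reject `remote=True` are assumptions.

```python
import fnmatch
import os
from typing import List


def find_files_old(root: str, pattern: str, depth: int = 0) -> List[str]:
    """Hypothetical older signature without `remote`.

    Calling it with remote=... raises the TypeError seen in the traceback.
    """
    return find_files(root, pattern, depth=depth)


def find_files(root: str, pattern: str, depth: int = 0, remote: bool = False) -> List[str]:
    """Hypothetical signature that accepts `remote` (local listing only)."""
    if remote:
        # Assumption: remote listing is out of scope for this sketch.
        raise NotImplementedError("remote listing is not covered by this sketch")

    root = os.path.abspath(root)
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        level = 0 if rel == "." else rel.count(os.sep) + 1
        if level >= depth:
            dirnames[:] = []  # do not descend below the requested depth
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

With these definitions, `find_files_old(".", "*.txt", depth=0, remote=False)` fails with a `TypeError` about the unexpected keyword, while the same call on `find_files` returns the matching `.txt` files in the root directory.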

@a-r-r-o-w
Owner Author

Sorry, I forgot to push changes due to working on lots of branches in parallel. It should work now 🤞

@mkturkcan

Can confirm that it does!

@a-r-r-o-w merged commit ea69aaf into main on Mar 4, 2025
1 check passed
@a-r-r-o-w deleted the datasets/local-data-file-caption-file-format branch on March 4, 2025, 20:04
Successfully merging this pull request may close these issues.

Using Custom Local Datasets
2 participants