Need to move object UUID creation into the indexer, rather than the ingester #47

Open
fsgeek opened this issue Feb 7, 2024 · 2 comments
fsgeek commented Feb 7, 2024

The rationale is that we want to support multiple ingesters. If the UUID is defined by one of the ingesters, the others become dependent on that ingester.

The solution is to move UUID generation out of the local ingester and into the local indexer. An ingester then depends only on the indexer's file to generate additional ingest information.
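
As a rough illustration of the proposal, here is a minimal sketch in which the indexer assigns the identifier when it records an object. The class name `BaseIndexer`, the method `build_stat_entry`, and the field name `ObjectIdentifier` are all hypothetical, chosen for the example rather than taken from the project source.

```python
import os
import uuid


class BaseIndexer:
    """Hypothetical base indexer: assigns the object UUID once, at index time."""

    def build_stat_entry(self, path: str) -> dict:
        # lstat so that a symlink is described as itself, not its target.
        stat_data = os.lstat(path)
        return {
            "Path": os.path.abspath(path),
            "st_size": stat_data.st_size,
            # The identifier is created here, so every ingester that reads
            # this record sees the same UUID for the object.
            # "ObjectIdentifier" is an assumed field name.
            "ObjectIdentifier": str(uuid.uuid4()),
        }
```

With the identifier stored in the indexer record, no ingester has to wait on another ingester to learn an object's UUID.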

fsgeek commented Feb 24, 2024

Added commit db9779b, which generates the UUID in the base indexer class. The ingester still needs to be changed to use that UUID when it is present.
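
The remaining ingester change could look roughly like the sketch below: reuse the identifier from the indexer record when it exists, and generate one only for records that lack it. The function name `object_key` and the field name `ObjectIdentifier` are assumptions made for illustration, not the project's actual names.

```python
import uuid


def object_key(indexer_entry: dict) -> str:
    """Return the UUID from the indexer record, or a new one if it is absent."""
    existing = indexer_entry.get("ObjectIdentifier")  # assumed field name
    if existing:
        # Parse and re-serialize so a malformed value fails loudly
        # and the key is always in canonical form.
        return str(uuid.UUID(existing))
    # Record without an identifier: fall back to generating one here.
    return str(uuid.uuid4())
```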

fsgeek commented Feb 25, 2024

Commit 8bafb75 fixes a problem found during testing where the st_birthtime field wasn't present. It now handles the case where any time field is missing. The earlier change (for Windows) that uses the UUID is working as expected. Hopefully it will work as easily on the other platforms.
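
One way to tolerate missing time fields, sketched here under the assumption that the code reads attributes from an `os.stat_result`-like object (the helper name `extract_timestamps` is hypothetical, and the real commit may work differently):

```python
import os

# st_birthtime is not provided on every platform (Linux, for example),
# so none of these fields is treated as guaranteed.
TIME_FIELDS = ("st_atime", "st_mtime", "st_ctime", "st_birthtime")


def extract_timestamps(stat_result) -> dict:
    """Collect whichever time fields exist on a stat result, skipping the rest."""
    timestamps = {}
    for field in TIME_FIELDS:
        value = getattr(stat_result, field, None)
        if value is not None:
            timestamps[field] = value
    return timestamps
```

Calling `extract_timestamps(os.stat(path))` returns only the fields the platform supplies, so later code can test for a key instead of assuming it exists.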

fsgeek pushed a commit that referenced this issue Feb 25, 2024
hadisinaee added a commit that referenced this issue Apr 1, 2024
* This is work to add support for the Linux Indexer.

As part of this I added the logic for the IndalekoLinuxMachineConfig.py so it now creates the config file and adds that to the database.

This also provides a preliminary version of the IndalekoLinuxLocalIndexer.py, though that remains a work in progress.

Some refactoring along the way as I pull some code into the generic layers.

* Make the Linux local file system indexer work.  This has changed some of the common behavior, so before I merge this I'll verify that I can still index my Windows files as well.

In addition, I made changes to the generic layer to handle Issue #37, which relates to handling symlinks versus files.  I think this issue may require additional work but it should handle broken links now by ignoring them.

Finally, I did not create an issue but while working through the Linux local indexer I found a peculiar case of file names that could not be UTF-8 encoded.  For now I log the file name and keep going, but I'll open a new issue for this.  See Issue #42.

* Sync local and remote changes. Nothing material.

* This is further work on Issue #23 and Issue #24.
At this point the Linux indexer and ingester do seem to be gathering data, so it is a reasonable time to capture
the current state. Before merging this change in I'd like to make sure it doesn't break other platforms.

* Further cleanup, fixed issue with counts in Linux ingester, add logic to track good and bad symlinks in indexer.  See Issue #23 #24 #37

* More cleanup for Issues #23 #24

* Add counters to allow checking indexer output against ingester input/output

* Add uuid generation into Indexer body.  Still requires changing ingester(s) to use the UUID as the primary key.

* Handle situation where there is no st_birthtime field in the stat data.

* Issue #47.  These changes are prospective, but are identical to what was done on Windows (where it worked).

* Use UUID for new data ingester.

* Create python-package.yml

---------

Co-authored-by: Tony Mason <[email protected]>