Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: finish support for path conflicts with 'prefer' #204

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

letFunny
Copy link
Collaborator

  • Have you signed the CLA?

@letFunny letFunny added the Priority Look at me first label Feb 12, 2025
Copy link

github-actions bot commented Feb 12, 2025

Command Mean [s] Min [s] Max [s] Relative
BASE 8.168 ± 0.030 8.144 8.240 1.07 ± 0.00
HEAD 7.663 ± 0.019 7.634 7.695 1.00

Copy link
Collaborator Author

@letFunny letFunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments for the offline review.

if oldInfo.Kind == GlobPath && (newInfo.Kind == GlobPath || newInfo.Kind == CopyPath) {
if new.Package == old.Package {
continue
if !strdist.GlobPath(newPath, oldPath) {
Copy link
Collaborator Author

@letFunny letFunny Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Note for reviewer]: The 10% speedup you can see is because of moving strdist.GlobPath to execute before looking at the package. I have changed main to do this check before checking if the package is the same and I can reproduce the speedup.

}
toCheck := []*Slice{new}
_, hasPrefers := prefers[preferKey{preferSource, newPath, ""}]
if hasPrefers {
Copy link
Collaborator Author

@letFunny letFunny Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Note for reviewer]: The change here is that we now have the "head" of the chain in paths but we need to check each glob against all of the members. For example: if pkgA -> pkgB -> pkgC for /foo then paths has pkgA. If the glob is /fo? we need to check all three packages.

In reality we can shortcircuit this because with the current logic if there are two packages then it is impossible for the glob not to conflict. This is because if the files are extracted from the packages then the glob either matches one package or the other; and if the files are not extracted then the glob will also conflict naturally. However, we thought it was better to have a more general case here that will survive changes in the logic and the extra cost is only two if statements executed.

/path:
`,
},
relerror: `slices mypkg1_myslice and mypkg2_myslice conflict on /\*\* and /path`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to be missing the underlying issue here, or at least the test case isn't very clear about what's the real problem. If we select both mypkg1 and mypkg2, won't we end up with the content of /** from mypkg1, but /path from mypkg2? Isn't that exactly what one is trying to achieve in this scenario? It's actually nice to have this test case, because it's a tricky scenario, but the fact it works as intended would be a nice bonus at this stage.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that is the expected outcome. /** conflicts with /path on the other package and there is no prefer relationship between them (there could never be as we don't allow prefer with wildcards). This test case is ensuring that conflict with globs work, having the extra /path is just to make sure the prefer resolution does interact properly with glob conflicts.

}

// SelectPackage returns true if path should be extracted from pkg.
func (s *Selection) SelectPackage(path, pkg string) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These new API changes seem to be corrupting what used to be quite a nice public interface in these modules. Even reading the comment for the Selection type right above already hints at how nice it was: Release gives you an entire Ubuntu release. Then you can make a Selection, which holds Packages and Slices, hand it to a Slicer, and it's all nicely sorted out. That's it.

With this change we're instead introducing multiple hidden caches, with an obscure "resetting" function (reset what? when?) and an interface to "select packages"... but... we literally just selected the slices in the Slices field? Why would we manually select individuals paths now? Do we have to know what the resolution algorithm is, and where should we use it?

This all feels quite haphazard. It's probably doing what it should, in the end, but without a principle.

As a suggested approach to try to get out of this corner, maybe we can try this single function:

func (selection *Selection) Prefers() (map[string]*Package, error) { ... }

This function would return a map from path to package, but only when the given path is indeed being affected by the prefers option used in slices inside the selection itself. In other words, even if a path inside the selection has a prefer, if no other slice refers to the given path, we don't return it in the map. This is really a conflict solving map, not a data reporting one. It also does the right job every time, for the current version of whatever the Selection is.

For the implementation, I'm hoping this can be trivially done with the existing (unchanged) prefers() function, and the preferredPathPackage function to make the choice between the two packages.

Let's please try that out, and make the path minimal so we're only introducing this specific mutation (no caching, etc), so we can try to make progress on this issue quickly.

Copy link
Collaborator Author

@letFunny letFunny Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean, the hidden caches were probably an early optimization and it does make the code more complex because now it is stateful. I also agree with

and an interface to "select packages"... but... we literally just selected the slices in the Slices field?

Probably that would have been fixed with better naming but I also see the point in the rest of your comment so proceeding with that suggestion instead.

I have changed the code to follow your suggestion and remove all the extra changes to prefers and all the new data structures. It is now indeed much simpler and uses preferredPathPackage and prefers() exclusively to both, fix the failing test cases, and to enable prefer ordering in slicer.go. The only downside is the performance is a little bit worse than with the previous implementation, possibly both because we have removed the caching, and also because we are traversing the graph more times. However, we can enable caching later on and be more clever about it if it is needed and backed by proper performance numbers.

Please tell me what you think. I am happy with the new approach and I am also happy with a hybrid that doesn't require traversing the graph as much.

@letFunny letFunny requested a review from niemeyer March 11, 2025 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority Look at me first
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants