Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: Use FormatOnly on imports #277

Merged
merged 3 commits into from
Sep 3, 2024

Conversation

jpbetz
Copy link
Contributor

@jpbetz jpbetz commented Aug 30, 2024

Revive kubernetes/kubernetes@f63343f.

This has a net improvement of about 26% for hack/update-codegen.sh: https://gist.github.com/jpbetz/227c38304fd168aa27505d5407800a74

This is dramatically faster for gengo executions with many small targets.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 30, 2024
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Aug 30, 2024
@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 30, 2024

/assign @thockin @skitt

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but one comment from the k/k PR:

If we added a check that the import alias cannot equal the current pkg name, could we avoid this whole commit? It's weird to be in pkg "v1" and have an import named "v1".

NewImportTrackerForPackage() -> NewDefaultImportTracker(); DefaultImportTracker.ImportLines() could check this?

@thockin
Copy link
Member

thockin commented Aug 31, 2024

diff --git a/vendor/k8s.io/gengo/v2/generator/import_tracker.go b/vendor/k8s.io/gengo/v2/generator/import_tracker.go
index 70b86cf56b2..60f58e89c95 100644
--- a/vendor/k8s.io/gengo/v2/generator/import_tracker.go
+++ b/vendor/k8s.io/gengo/v2/generator/import_tracker.go
@@ -18,6 +18,7 @@ package generator
 
 import (
 	"go/token"
+	"path/filepath"
 	"strings"
 
 	"k8s.io/klog/v2"
@@ -45,7 +46,7 @@ import (
 func NewImportTrackerForPackage(local string, typesToAdd ...*types.Type) *namer.DefaultImportTracker {
 	tracker := namer.NewDefaultImportTracker(types.Name{Package: local})
 	tracker.IsInvalidType = func(*types.Type) bool { return false }
-	tracker.LocalName = func(name types.Name) string { return goTrackerLocalName(&tracker, name) }
+	tracker.LocalName = func(name types.Name) string { return goTrackerLocalName(&tracker, local, name) }
 	tracker.PrintImport = func(path, name string) string { return name + " \"" + path + "\"" }
 
 	tracker.AddTypes(typesToAdd...)
@@ -56,7 +57,7 @@ func NewImportTracker(typesToAdd ...*types.Type) *namer.DefaultImportTracker {
 	return NewImportTrackerForPackage("", typesToAdd...)
 }
 
-func goTrackerLocalName(tracker namer.ImportTracker, t types.Name) string {
+func goTrackerLocalName(tracker namer.ImportTracker, localPkg string, t types.Name) string {
 	path := t.Package
 
 	// Using backslashes in package names causes gengo to produce Go code which
@@ -65,6 +66,7 @@ func goTrackerLocalName(tracker namer.ImportTracker, t types.Name) string {
 		klog.Warningf("Warning: backslash used in import path '%v', this is unsupported.\n", path)
 	}
 
+	localLeaf := filepath.Base(localPkg)
 	dirs := strings.Split(path, namer.GoSeperator)
 	for n := len(dirs) - 1; n >= 0; n-- {
 		// follow kube convention of not having anything between directory names
@@ -74,7 +76,7 @@ func goTrackerLocalName(tracker namer.ImportTracker, t types.Name) string {
 		// packages, but aren't legal go names. So we'll sanitize.
 		name = strings.ReplaceAll(name, ".", "")
 		name = strings.ReplaceAll(name, "-", "")
-		if _, found := tracker.PathOf(name); found {
+		if _, found := tracker.PathOf(name); found || name == localLeaf {
 			// This name collides with some other package
 			continue
 		}

That seems to remove 99% of the k/k diff.

This primarily benefits applyconfiguration-gen today, which accidentaly
got this behavior by including the local package name in imports, which
caused goimports to deconflict with it before removing it since it was
unused.  This commit effictively preserves that behavior now that we
will not be relying on goimports to add/rename/remove imports.

Co-authored-by: Tim Hockin <[email protected]>
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 2, 2024
@jpbetz
Copy link
Contributor Author

jpbetz commented Sep 2, 2024

That seems to remove 99% of the k/k diff.

Oh nice. I had been hoping to avoid a big diff, but hadn't realized that avoiding the local package name was justified.

@jpbetz
Copy link
Contributor Author

jpbetz commented Sep 2, 2024

Change applied. Synchronizing the k/k PR as well.

Comment on lines +81 to +83
// Or, this name is tne same name as the local package,
// which we avoid because it can be confusing. For example,
// if the local package is v1, we to avoid importing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor:

Suggested change
// Or, this name is tne same name as the local package,
// which we avoid because it can be confusing. For example,
// if the local package is v1, we to avoid importing
// Or, this name is the same name as the local package,
// which we avoid because it can be confusing. For example,
// if the local package is v1, we try to avoid importing

@skitt
Copy link
Member

skitt commented Sep 3, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 3, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz, skitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit fb7743f into kubernetes:master Sep 3, 2024
3 of 4 checks passed
@skitt
Copy link
Member

skitt commented Sep 3, 2024

Oops, I hadn’t realised gengo allowed self-approval!

@jpbetz
Copy link
Contributor Author

jpbetz commented Sep 3, 2024

Oops, I hadn’t realised gengo allowed self-approval!

All good. I think this is what we want.

cc @thockin just in case. I can revert or fix forward as needed.

@skitt
Copy link
Member

skitt commented Sep 3, 2024

Yup, as I understand it this is indeed what we want, I just didn’t notice the automatic self-approval and thought Tim would get a say again before the merge.

@thockin
Copy link
Member

thockin commented Sep 3, 2024

LGTM

@brandond
Copy link

brandond commented Dec 20, 2024

Hey so this appears to now generate some pretty weird imports for stdlib packages. For example I'm now seeing changes like:

--- a/pkg/generated/clientset/versioned/clientset.go
+++ b/pkg/generated/clientset/versioned/clientset.go
@@ -19,8 +19,8 @@ limitations under the License.
 package versioned

 import (
-       "fmt"
-       "net/http"
+       fmt "fmt"
+       http "net/http"

        k3sv1 "github.com/k3s-io/k3s/pkg/generated/clientset/versioned/typed/k3s.cattle.io/v1"
        discovery "k8s.io/client-go/discovery"
diff --git a/pkg/generated/clientset/versioned/typed/k3s.cattle.io/v1/addon.go b/pkg/generated/clientset/versioned/typed/k3s.cattle.io/v1/addon.go
index 7eaa80a9f6..3afc29ece3 100644
--- a/pkg/generated/clientset/versioned/typed/k3s.cattle.io/v1/addon.go
+++ b/pkg/generated/clientset/versioned/typed/k3s.cattle.io/v1/addon.go
@@ -19,9 +19,9 @@ limitations under the License.
 package v1

 import (
-       "context"
+       context "context"

-       v1 "github.com/k3s-io/k3s/pkg/apis/k3s.cattle.io/v1"
+       k3scattleiov1 "github.com/k3s-io/k3s/pkg/apis/k3s.cattle.io/v1"
        scheme "github.com/k3s-io/k3s/pkg/generated/clientset/versioned/scheme"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        types "k8s.io/apimachinery/pkg/types"
@@ -37,31 +37,32 @@ type AddonsGetter interface {

Was this intentional? I'm not sure if it was the change to name generation, or the addition of FormatOnly: true to the goimports wrapper options that is causing this. Perhaps the latter?

brandond added a commit to brandond/gengo that referenced this pull request Dec 20, 2024
…orts"

This reverts commit fb7743f, reversing
changes made to a7b603a.

Signed-off-by: Brad Davidson <[email protected]>
@thockin
Copy link
Member

thockin commented Dec 20, 2024

The way gengo handles imports overall is not very clever. It's not that this was intended, but it was a side-effect of other changes we really did want, and wasn't worth fixing because it was still "correct".

Yes it's weird, and if someone wanted to fix the import handling, I am softly supportive, but I suspect it would be another breaking change (not such a big deal since we call v2 "internal" anyway).

@thockin
Copy link
Member

thockin commented Dec 20, 2024

Does this represent a real problem that I am not seeing?

@brandond
Copy link

brandond commented Dec 20, 2024

We use gengo/v2 both directly and by way of code-generator in rancher/wrangler, to handle codegen for our controllers. Unfortunately, the previous behavior around import pruning meant that we could take a kitchen-sink approach to imports and just pull in everything we might need regardless of which groups of functionality are enabled - knowing that gengo would clean up the unused ones.

With this change we're seeing both the weird import aliasing (fmt "fmt" for example), and also that the unused imports aren't being pruned any longer - so we have to do a bunch of extra work on our side to actually track imports and only add the ones that are needed.

@jpbetz
Copy link
Contributor Author

jpbetz commented Dec 20, 2024

@brandond Would having an option to re-enable the old import handling for code generators you've authored be sufficient? That is, if generator.NewContext() provide an option to re-enable the old behavior, would that solve your problem?

@brandond
Copy link

At the moment we're just calling gengo.Execute, which looks to me like it doesn't give us the ability to add additional options to the context... unless that is something we can modify in the getTargets callback? This is my first exploration into this codebase so I'm not super familiar with the flow here.

https://github.com/rancher/wrangler/blob/1345a309299c3e4aaf0f12188795b15978d5b9a2/pkg/controller-gen/main.go#L66-L72

@thockin
Copy link
Member

thockin commented Dec 20, 2024

A couple things:

gengo/v2 is specifically for k8s use -- we literally considered moving it to an "internal/" sub-pkg. While we are happy to make it keep working for projects like rancher, v2 is significantly different than v1 and you may need to adapt. In particular, the kitchen-sink approach to imports is different (among many others).

Running goimports on every file is noticeably slower, which is why we took the effort to excise it. I'm not a huge fan of adding it back - wouldn't we need some way of testing the difference in behavior? I'm not hard AGAINST it, but I don't have time to do the work in the near future. I don't control what Joe spends time on :)

I did look at the stupid identity-aliasing of imports but it seemed complicated, and I had run out of steam to fix a "cosmetic" issue.

As ever, PRs are welcome, but if you're going to retool something which has impact into kubernetes codegen tools, let's talk about it before you go too far :)

@thockin
Copy link
Member

thockin commented Dec 20, 2024

looks to me like it doesn't give us the ability to add additional options to the context

Yeah, it might need to become ExecuteWithOpts() and take some new options struct that includes things like "slower but more complete goimports".

Alternately, you could do a sync.Once from your getTargets() to change the Context.FileTypes[GoFileType] to something bespoke -- just steal NewGoFile() and importsWrapper() and change as needed. Or even add a imports.Options arg to NewGoFile() like:

func goimportsFormatter(opts *imports.Options) func([]byte)([]byte, error) {
    if  opts == nil {
        opts = &imports.Options{
            Comments:   true,
            TabIndent:  true,
            TabWidth:   8,
            FormatOnly: true, // Disable the insertion and deletion of imports
        }
    }
    return func(src []byte)([]byte, error) {
        return imports.Process("", src, opt)
    }
}

func NewGoFileWithOptions(opts *imports.Options) *DefaultFileType {
    return &DefaultFileType{
        Format:   goimportsFormatter(opts),
        Assemble: assembleGoFile,
    }
}

@brandond
Copy link

brandond commented Dec 20, 2024

gengo/v2 is specifically for k8s use -- we literally considered moving it to an "internal/" sub-pkg.

Understood. Unfortunately the move to go 1.23 and the use of generic type aliases has caused us to need to go to gengo/v2 as the alias support was not added to v1.

I'm not hard AGAINST it, but I don't have time to do the work in the near future. I don't control what Joe spends time on :)

I think we've got some cycles on our side, if that's helpful. I think long term we will probably read the writing on the wall and try to move off direct use of gengo but that's going to be a longer path, if we had a way to get the previous import behavior now, that'd save us a fork and give us some runway while we migrate off.

@thockin
Copy link
Member

thockin commented Dec 20, 2024

I think the path I laid out above gives you what you need -- but test and verify it, please. Joe independently came up with 90% the same solution (we talked).

If you sent us a PR to do what I outlined, the total impact on your code (for this) would be less than 10 LOC, I suspect. getTargets() is not the most obvious hook, but it's the thing called closest after NewContext() and it's (IMO) not a horrible hack.

@thockin
Copy link
Member

thockin commented Dec 20, 2024

I think long term we will probably read the writing on the wall and try to move off direct use of gengo

As long as the things you are doing are directionally aligned with k8s codegen, it's probably "safe enough". Gengo has some weird opinions about things, and maybe one day we'll do a v3 and "fix" those, but it is not this day.

@brandond
Copy link

brandond commented Dec 20, 2024

We were able to fix the imports stuff by just replacing the FileType in the context - thanks for that hint!

For the identity-aliasing stuff I will open a PR with a possible naive fix, we can take the discussion there:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants