Skip to content

match imports and variables against their canonical name#21

Merged
ashanbrown merged 4 commits intoashanbrown:masterfrom
pohly:import-aliases
Jan 29, 2023
Merged

match imports and variables against their canonical name#21
ashanbrown merged 4 commits intoashanbrown:masterfrom
pohly:import-aliases

Conversation

@pohly
Copy link
Copy Markdown
Contributor

@pohly pohly commented Oct 25, 2022

It was possible to use fmt.Print by importing "fmt" under a different name. Now a pattern can also get matched against the full package path. This is also useful to ban items from packages that may get imported under various different names.

For variables, the match can be against the type of the variable instead of the (usually) arbitrary variable name.

Fixes #20

@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Oct 25, 2022

The corresponding change in in golangci-lint is small:

diff --git a/pkg/golinters/forbidigo.go b/pkg/golinters/forbidigo.go
index a8c256b9..b7c467f7 100644
--- a/pkg/golinters/forbidigo.go
+++ b/pkg/golinters/forbidigo.go
@@ -64,7 +64,7 @@ func runForbidigo(pass *analysis.Pass, settings *config.ForbidigoSettings) ([]go
 
        var issues []goanalysis.Issue
        for _, file := range pass.Files {
-               hints, err := forbid.Run(pass.Fset, file)
+               hints, err := forbid.Run(pass, file)
                if err != nil {
                        return nil, errors.Wrapf(err, "forbidigo linter failed on file %q", file.Name.String())
                }

However, I have only tested that it builds. When running "go test ./..." in golangci-lint I got several unrelated test failures.

@pohly pohly force-pushed the import-aliases branch 2 times, most recently from 97821fb to bcd9be6 Compare October 26, 2022 10:01
@pohly pohly mentioned this pull request Oct 26, 2022
@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Oct 26, 2022

I tried the modified golangci-lint in Kubernetes and it worked as intended:

  forbidigo:
    forbid:
    - '^k8s.io/kubernetes/test/e2e.*:k8s.io/apimachinery/pkg/util/wait\.Until(# Gomega Eventually or Consistently should be used instead)?'

->

test/e2e/framework/kubelet/stats.go:378:5: use of `wait.Until` forbidden because "Gomega Eventually or Consistently should be used instead" (forbidigo)
	go wait.Until(func() { r.collectStats(oldStats) }, r.pollingInterval, r.stopCh)
	   ^

That was with #22, otherwise it would have complained also in non-test code.

@ashanbrown
Copy link
Copy Markdown
Owner

ashanbrown commented Oct 30, 2022

Thanks for the proposal. My understanding is that the way this works is to find matches for both the post-alias name of an identifier and the full unaliased package name of the identifier.

I think you saw when integrating this with golangci lint that the proposal as it stands may be a breaking change for existing patterns. I think this is a real cause for concern and we should strive not to break existing patterns.

Just for reference, it seems to me that there might be three cases we are concerned about:

import (
  "fmt"
  fmt2 "fmt"
  fmt3 "example.com/fmt"
)

If someone has a pattern like ^fmt.Print\.*, that currently wouldn't match usages for fmt2 but with these changes, it will. That said, I think that may be an acceptable breaking change since it probably still matches the intent of the user. I'm a bit more concerned about matches like ^example that might now start to match package paths.

My inclination would be to initially (when we create the linter) split up the patterns into legacy "naive" patterns (without package names or imports considered) and fully-qualified identifiers (i.e. those with package names) and then match each type of pattern against a different set of text values. The proposal to use <package path>: as a prefix does seem appealing and I think we could use the presence of the : symbol in the pattern to determine if each we should be matching fully-qualified identifier names. The one place this could cause backwards incompatibility might be if someone had used a : in the embedded comment syntex that had been proposed (e.g. ^fmt (? # never use fmt: it's a bad idea)). Perhaps we could exclude usages inside of comment-like constructs.

We might also need a special way to identify standard packages and perhaps that could be done with a leading :. So ^:fmt\,Print.* might be a new default pattern.

I am wondering how this with play with the proposal at #22, which was to be able to restrict patterns to specific files. This is an orthogonal concern with package names and something I've typically deferred to meta-linters. I'm wondering if we could use something like angle brackets <> to define a glob list of includes and excludes, perhaps similar to what Intellij uses in their search patterns (e.g. <blah/*.go,!file.go>).

@pohly I know this is a bit of a change from what you've proposed here and I appreciate all the effort you've put in already. Please let me know what you think.

@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Oct 30, 2022

If someone has a pattern like ^fmt.Print.*, that currently wouldn't match usages for fmt2 but with these changes, it will. That said, I think that may be an acceptable breaking change since it probably still matches the intent of the user.

Agreed, that was also my thinking.

I'm a bit more concerned about matches like ^example that might now start to match package paths.

Is that something that you expect to happen in practice, or is this just a theoretic concern? I agree that it can happen, but it seemed unlikely to me and (when it happens) easy to fix by changing the pattern.

My inclination would be to initially (when we create the linter) split up the patterns into legacy "naive" patterns (without package names or imports considered) and fully-qualified identifiers (i.e. those with package names) and then match each type of pattern against a different set of text values. The proposal to use : as a prefix does seem appealing and I think we could use the presence of the : symbol in the pattern to determine if each we should be matching fully-qualified identifier names.

As you noted, just checking for : is fragile and not 100% reliable. If we go down this route, then we should let the user configure two different kinds of patterns, traditional "text" patterns which literally match what appears in the source code, and "package" patterns that match package symbols after expanding the import name.

I am wondering how this with play with the proposal at #22, which was to be able to restrict patterns to specific files.

Then we'll have four kinds of patterns?! That doesn't sound appealing. Let's try something else:

  • A pattern gets matched against the actual package name if and only if the pattern contains a pkg subgroup (^(?P<pkg>example.com/some/pkg).Print).
  • A pattern gets matched against text with the file prefix (PR per-file patterns #22) if and only if the pattern contains a file subgroup (^(?P<file>example.com/some/pkg/pkg.go:)spew.Log$).

Then we don't need to change how patterns get configured, it's extremely unlikely that someone has written patterns with such subgroup names (i.e. existing patterns work as before), and each pattern gets matched only once (better performance).

This is an orthogonal concern with package names and something I've typically deferred to meta-linters.

True, except that golangci-lint doesn't seem to be on a path towards offering the necessary configuration options.

@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Oct 30, 2022

I like this approach more than the initial one and it works in practice, too - even better! 😄

I've force-pushed an update. The bit set will make more sense with the also updated PR #22.

@ashanbrown
Copy link
Copy Markdown
Owner

Thanks for iterating on this, @pohly. The use of named subgroups to prevent breaking changes sounds good to me. I still am fond of the idea of using globs as a way of being able to exclude files, but I don't see that as being incompatible with named subgroups. We could still find a way to add glob-based file filters later.

I'll try to finish reviewing this PR this weekend.

@ashanbrown
Copy link
Copy Markdown
Owner

One thing I'm not clear on is what should happen if we don't have the type info for an object. I think at the moment we silently don't match it. Are there cases we might want to raise an error? If your pattern expects a package and we don't have type info at all, that feels to me like it should just fail. If we do have type info but the left side of a selector doesn't appear to be a package (maybe it's a variable shadowing a package name), then that does seem like a case where we should just silently not match it. The latter case also seems like a case we could write a test for.

@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Jan 17, 2023

I found that golangci-lint uses viper+mapstructure for parsing its config file, therefore my previous plan for using UnmarshalYAML fell apart. I came up with a different solution.

Here's how this will look in golangci-lint (no PR yet, just a diff, include the per-file support):
https://github.com/golangci/golangci-lint/compare/master...pohly:forbidigo-update?expand=1

@ashanbrown
Copy link
Copy Markdown
Owner

Thanks for the latest updates. I've created an MR at pohly#2 with a few suggestions. I've also got CI running again and it looks like golangci-lint is failing.

Regarding the configuration, my inclination would not be to make our linter config compatible with golangci-lint (i.e. don't add tags for mapstructure). I'd be fine with duplicating code in golangci-lint to populate our structure. I'd prefer that mapstructure tags not be part of our API (since we don't otherwise use mapstructure).

I'd also still favor an alternative name for expand-expressions. I had proposed match_methods_and_attributes with the thought that a "method", in golang parlance, is function with a receiver. That said, I think this could be too subtle, and maybe it can be claimed that packages have attributes (and maybe even "methods"). Maybe something like expand_receiver_type would be more explicit, although it doesn't obviously apply to attributes in addition to methods.

@ashanbrown
Copy link
Copy Markdown
Owner

BTW, regarding golangci/golangci-lint@master...pohly:forbidigo-update, I don't see that this handles backwards-compatibility with the old string types. I have to admit that I'm not sure if it's even possible to unmarshal into a value that can be either a string or a yaml object. I imagine we could just unmarshal the entire config into an arbitrary object and then parse it from there.

@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Jan 23, 2023

Regarding the configuration, my inclination would not be to make our linter config compatible with golangci-lint

Okay, then I'll handle this entirely in golangci-lint.

I'd also still favor an alternative name for expand-expressions. I had proposed match_methods_and_attributes with the thought that a "method", in golang parlance, is function with a receiver.

My concern with the original match_fields_and_methods from #21 (comment) was that Printf in fmt.Printf is neither a method nor a field. match_methods_and_attributes is a bit better because one could call Printf a "package attribute", although that seems a bit unusual. I'm fine with whatever name you want for this, so just let me know and I'll change it.

BTW, regarding golangci/golangci-lint@master...pohly:forbidigo-update, I don't see that this handles backwards-compatibility with the old string types.

This is handled by adding viper.DecodeHook(mapstructure.TextUnmarshallerHookFunc() to pkg/config/reader.go and then implementing the TextUnmarshaller interface for the Pattern struct. If mapstructure encounters a string, it will use that unmarshaller to decode it (backwards compatibility). If it encounters a map, it will populate the individually fields of the struct (new config API).

@ashanbrown
Copy link
Copy Markdown
Owner

Thanks for the additional changes and the flexibility on the flag name. Let's use analyze_types as the boolean to enable expanding the variable type.

So I can't remember if I raised this point before, but, if I understand this current state correctly, analyze_types still doesn't keep us from doing something like returnThing().CallForbiddenMethod(). You'd actually have to stuff the value in a variable for it to be forbidden.

@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Jan 29, 2023

Thanks for the additional changes and the flexibility on the flag name. Let's use analyze_types as the boolean to enable expanding the variable type.

That sounds good.

So I can't remember if I raised this point before, but, if I understand this current state correctly, analyze_types still doesn't keep us from doing something like returnThing().CallForbiddenMethod().

That should work. It's covered by the "If we are lucky, the entire selector expression has a known type" part. There's a test case for it in expandtext.go:

// Selector expression with result of function call in package.
somepkg.NewCustom().AlsoForbidden() // want "somepkg.NewCustom...AlsoForbidden.*forbidden by pattern.*\\^pkg..CustomType.*Forbidden"

It came out of my observation that trying to deduce the type of the selector entirely myself would have led down a rathole of basically supporting arbitrarily deep AST expressions. I then looked for something simpler and found that TypesInfo.Types is indexed by expression. It doesn't cover everything, therefore there's still support for resolving identifiers.

It sounds like we are almost done. I'll squash the commits and then do the remaining changes (analyze_types, some code cleanups).

@pohly pohly force-pushed the import-aliases branch 2 times, most recently from 0db12ac to 7ca3c2e Compare January 29, 2023 09:35
@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Jan 29, 2023

No code changes in the first force-push. The second one contains the actual changes.

ashanbrown and others added 2 commits January 29, 2023 10:41
It was possible to use fmt.Print by importing "fmt" under a different name. Now
a pattern can get matched against the full package path instead of the import
name. This is also useful to ban items from packages that may get imported
under various different names.

It was possible to forbid a function in a package, but not a method or field of
a certain type in a package. The same approach for replacing the match text
can now also be used for those.

The only way of debugging was to single-step with a debugger and manually
printing variables. This gets tiresome because in particular for the analyzer
test, inner functions get called a lot. Debug log output during unit testing
helps. Informing the user about text expansion may also be useful, but is not
addressed yet.
Otherwise "go install" fails:

    # github.com/golangci/golangci-lint/pkg/golinters/goanalysis
    pkg/golinters/goanalysis/runner_loadingpackage.go:126:3: unknown field 'Instances' in struct literal of type types.Info
    note: module requires Go 1.19
@pohly pohly force-pushed the import-aliases branch 2 times, most recently from 7a0344e to 2621c02 Compare January 29, 2023 10:32
Python 2 is no longer available:

   W: https://download.docker.com/linux/ubuntu/dists/jammy/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.
   E: Package 'python' has no installation candidate
@pohly
Copy link
Copy Markdown
Contributor Author

pohly commented Jan 29, 2023

The CI is also happy now.

@ashanbrown ashanbrown merged commit 1396000 into ashanbrown:master Jan 29, 2023
@ashanbrown
Copy link
Copy Markdown
Owner

I've merged your MR! Thanks for keeping at it. These seem like exciting changes. I'll turn my attention to #22 later today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

support import aliasing

2 participants