Define TR caching comms in ATD #353
To make things more concrete about the interface we want with the backend. test plan: make
semgrep_output_v1.atd
Outdated
type tr_cache_key = {
  rule_id: rule_id;
  (* ex: http://some-website/hello-world.0.1.2.tgz like in found_dependency *)
  resolved_url: string;
I don't know if this will be easy to find in many cases, though I agree that if it's possible it makes the best key. I guess it is probably fine to start with this, and add another key later if needed.
Yes, we can always refine. This is just defining the interface; once we start the implementation, we will probably discover that we need to refine it.
Thoughts on or experience with PackageURL?
Also curious how we determine the url to fetch the code for a package, in the event that our dependency scanning logic doesn't yield a resolved_url?
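To make the key discussion concrete, here is a minimal Python sketch of a lookup keyed on the two fields from the diff. `TrCacheKey`, `lookup`, the in-memory dict, and the `example-rule` ID are hypothetical names used only for illustration; the real types would be generated from the `.atd` file. It also assumes that a dependency without a `resolved_url` is simply a cache miss, which is one possible answer to the question above and not something decided in this thread.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)  # frozen makes instances hashable, so usable as dict keys
class TrCacheKey:
    """Hypothetical mirror of the tr_cache_key record in the diff."""
    rule_id: str
    resolved_url: str


def lookup(cache: Dict[TrCacheKey, str], rule_id: str,
           resolved_url: Optional[str]) -> Optional[str]:
    """Return the cached result, or None on a miss.

    Assumption: without a resolved_url no key can be built, so the
    dependency is always treated as a miss.
    """
    if resolved_url is None:
        return None
    return cache.get(TrCacheKey(rule_id, resolved_url))


url = "http://some-website/hello-world.0.1.2.tgz"
cache = {TrCacheKey("example-rule", url): "cached-result"}
hit = lookup(cache, "example-rule", url)
miss = lookup(cache, "example-rule", None)
```

If another identifier such as a PackageURL were added later, it could become a second optional field on the key without changing the lookup shape.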
semgrep_output_v1.atd
Outdated
   * and [transitive_unreachable] records?
   * TODO? make it a list? match_results: ... list; ?
   *)
  match_result: sca_match_kind;
This is a bit odd, since when scanning a package with a rule it will result in direct code matches, not sca matches. Maybe we should just return the match here? Then, the CLI could treat those matches the same as matches that it receives from a call to Semgrep locally.
We could. That is a bigger data structure to store, though, and for TR what we really need is just the sca_transitive_match_kind; that's the thing we are trying to cache, so that we can avoid downloading the dependency and running Semgrep on it.
I think we should at least store the match locations. I don't think it makes sense to store sca_match_kind, because if there are matches in multiple packages, we will somehow need to combine those into a single finding in the CLI.
lgtm
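One way to picture the "store match locations, combine in the CLI" option from this thread is the Python sketch below. All names here (`MatchLocation`, `CachedPackageResult`, `CombinedFinding`, `combine`) are hypothetical and are not part of the ATD file. The rule "reachable if any package has at least one match" is an assumption made for the example, not a decision taken in this review.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MatchLocation:
    """Hypothetical location of a code match inside a package."""
    path: str
    line: int


@dataclass
class CachedPackageResult:
    """Hypothetical cache value: the match locations found in one package."""
    resolved_url: str
    locations: List[MatchLocation] = field(default_factory=list)


@dataclass
class CombinedFinding:
    """Hypothetical single finding built by the CLI from several packages."""
    rule_id: str
    reachable: bool
    locations: List[MatchLocation]


def combine(rule_id: str, results: List[CachedPackageResult]) -> CombinedFinding:
    """Merge per-package results into one finding.

    Assumption: the finding is reachable if any package has a match.
    """
    locations = [loc for result in results for loc in result.locations]
    return CombinedFinding(rule_id, bool(locations), locations)
```

Storing locations costs more space than a single match kind, but the match kind can still be derived from them at combine time, whereas the reverse is not possible.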
To make things more concrete about the interface we want with the backend.

test plan: make

- [x] I ran `make setup && make` to update the generated code after editing a `.atd` file (TODO: have a CI check)
- [x] I made sure we're still backward compatible with old versions of the CLI. For example, the Semgrep backend needs to still be able to *consume* data generated by Semgrep 1.50.0. See https://atd.readthedocs.io/en/latest/atdgen-tutorial.html#smooth-protocol-upgrades

Note that the types related to the semgrep-core JSON output or the semgrep-core RPC do not need to be backward compatible!
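The backward-compatibility requirement can be illustrated with a small Python sketch of a reader that still accepts data from an older producer. `ScanPayload`, `parse_payload`, and the `tr_cache_hint` field are hypothetical; the real readers are generated from the `.atd` definitions. The idea shown is only the general one: a field added later is optional on the consuming side, so older JSON that lacks it still parses.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanPayload:
    """Hypothetical payload sent by the CLI to the backend."""
    version: str
    # Hypothetical field added in a newer CLI; optional so old data still loads.
    tr_cache_hint: Optional[str] = None


def parse_payload(raw: str) -> ScanPayload:
    """Parse JSON from any CLI version, defaulting fields that are absent."""
    data = json.loads(raw)
    return ScanPayload(
        version=data["version"],
        tr_cache_hint=data.get("tr_cache_hint"),
    )


# Data shaped like what an older CLI would send: no tr_cache_hint field.
old = parse_payload('{"version": "1.50.0"}')
# Data from a hypothetical newer CLI that includes the new field.
new = parse_payload('{"version": "newer", "tr_cache_hint": "abc"}')
```

A new required field would make the first call fail with a `KeyError`, which is the kind of break the checklist item is meant to prevent.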