-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Secure Excecution of Groovy Code #14197
Conversation
… specific method names
…er for groovy executor
…urity configuration
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #14197 +/- ##
============================================
+ Coverage 61.75% 63.79% +2.04%
- Complexity 207 1535 +1328
============================================
Files 2436 2624 +188
Lines 133233 144560 +11327
Branches 20636 22109 +1473
============================================
+ Hits 82274 92224 +9950
- Misses 44911 45541 +630
- Partials 6048 6795 +747
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
import org.codehaus.groovy.classgen.BytecodeExpression; | ||
|
||
|
||
public class GroovyMethodSanitizer implements GroovyCodeVisitor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class can be significantly simplified if extending CodeVisitorSupport
.
Why do we want to block on method names? Is this mostly for flexibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll replace this with this feature: https://docs.groovy-lang.org/latest/html/api/org/codehaus/groovy/control/customizers/SecureASTCustomizer.ExpressionChecker.html
There are methods like execute
(which let's a user execute a string as a shell command) and invoke
which need to be blocked but can't be blocked with the other rule sets. We want people to be able to use Strings and string operations but want to block execute
, so adding this Visitor or adding an ExpressionChecker rule to block any call to methods in the disallow list is what I went with here.
"Groovy({return [\"bash\", \"-c\", \"env\"].execute().text})" | ||
); | ||
|
||
final GroovyStaticAnalyzerConfig config = new GroovyStaticAnalyzerConfig( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor, convention) We don't usually use final
in local variable, same for other places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(format) This doesn't follow Pinot Style
, please reformat the changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just working shopping how an immutable by default approach to Java would work, I'll get rid of it.
|
||
public class GroovyStaticAnalyzerConfig { | ||
final boolean _enabled; | ||
final private List<String> _allowedReceivers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(convention) Put private
before final
|
||
|
||
public class GroovyStaticAnalyzerConfig { | ||
final boolean _enabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enabled
is not really useful. Let's remove it and it should be enabled whenever the config exists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
manager.setConfig(config); | ||
} | ||
} catch (Exception _ex) { | ||
LOGGER.error("Failed to read config from ZK. Loading Default configuration."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directly throw exception instead of logging error. We want to fail the server start if security is not guaranteed
.expireAfterWrite(5, TimeUnit.MINUTES) | ||
.build(new CacheLoader<>() { | ||
@Override | ||
@Nonnull |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(convention) We don't usually annotate @Nonnull
, but only annotate @Nullable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was added to match the CacheLoader interface, but will remove.
} | ||
|
||
public ZNRecord toZNRecord() throws JsonProcessingException { | ||
ZNRecord record = new ZNRecord("groovySecurityConfiguration"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't want this config to be a separate ZNode. We can consider putting it as a field in the cluster config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing
I find this post very similar to what we wanted: https://melix.github.io/blog/2011/05/12/customizing_groovy_compilation_process.html |
Not quite sure what is happening with the failing tests, looks like it's some CLP tests that are failing. |
this covers both ingestion time and query time groovy i assume? having it start with "pinot.server.query" would be confusing |
Is this the right convention to use for such an API? Only until cluster/config/groovy are we doing a hierarchy, so perhaps something like cluster/configs/groovystaticConfig is a more apt name? |
Good catch. Will fix.
…On Sat, Oct 12, 2024 at 8:36 PM Neha Pawar ***@***.***> wrote:
pinot.server.query.groovy.analyzer.static
this covers both ingestion time and query time groovy i assume? having it
start with "pinot.server.query" would be confusing
—
Reply to this email directly, view it on GitHub
<#14197 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/BAASDJ7SYAMR7XZOTGADAZTZ3G57LAVCNFSM6AAAAABPVKS7F2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMBYG43DGNZUGQ>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
I initially had that, but then I thought we might add additional Groovy security features (such as executing in a sandbox) in the future, and grouping Groovy configs under perhaps |
… and ingestion, so "query" is too restrictive)
@@ -168,4 +170,74 @@ public SuccessResponse deleteClusterConfig( | |||
throw new ControllerApplicationException(LOGGER, errStr, Response.Status.INTERNAL_SERVER_ERROR, e); | |||
} | |||
} | |||
|
|||
@GET | |||
@Path("/cluster/groovy/analyzer/static/default") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path looks long, shall we use a shorter path? /cluster/config/groovy...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a separate API to the default config? IMO, https://github.com/apache/pinot/pull/14197/files#diff-510a3030bf6c520bbe87caa12b6eff67938c4299feef0c1433f913da810081a3R189 API should get the current active config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current design is that if the feature is disabled if there is no config and enabled if there is a config (per a comment on an older version of this PR). Hence adding a method to easily get a default config that can then be used to turn on this feature.
Alternatively, a query param could be added to the Set method which, when true, will set the configuration to the Default configuration and if false will use the config provided in the request body.
} | ||
|
||
@GET | ||
@Path("/cluster/groovy/analyzer/static/config") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, use shorter path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we can assume that anything under /cluster
is a config and use cluster/groovy/staticAnalysis
.
@ApiResponse(code = 500, message = "Internal server error") | ||
}) | ||
public GroovyStaticAnalyzerConfig getDefaultGroovyStaticAnalysisConfig() { | ||
return GroovyStaticAnalyzerConfig.createDefault(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the default security config, and is it backward compatible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting here: https://github.com/apache/pinot/pull/14197/files#diff-106af3bbf7f8d5f13d3ccb902ed2cfc28edb30931dfeb0ac8d64a7a64c373cfeR97
Assuming backwards compatible means that it won't break exist (safe) scripts, then testing so far looks good, but I'm working with Minesh to see what functions current customers are running in Groovy scripts.
Close this one and prefer #14844 |
Description
Adds security and safety checks for Groovy scripts by performing static analysis on the AST. When enabled this feature will check every
Groovy script sent to Pinot against an Allowed list of operations and classes. If a script uses anything which is not in the Allowed list then
it will be rejected by Pinot.
The allowed constructs can be managed via Cluster Config APIs. The administrator can define allow lists for: imports, static imports,
receivers, and set a list of disallowed method names. The imports lists define what constructs can be imported and used within a
Groovy script, this includes referencing a construct through it's full canonical name (e.g.
System.out.println("hello")
). The disallowedmethods list is provided to block calls to methods like
invoke
andexecute
which would allow a user to execute some malicious operationssuch as shell commands.
The static analysis is performed by the
org.codehaus.groovy.control.customizers.SecureASTCustomizer
, which traverses the AST of aGroovy script and applies configuration validation rules to make sure a Groovy script satisfies the security restrictions.
The configuration for the analyzer is stored in the Cluster Config under the
pinot.server.query.groovy.analyzer.static
field:To enable the Analyzer, add a configuration to the
pinot.server.query.groovy.analyzer.static
field. If this field has a value then Pinot willenable the static analyzer. This property can be easily configured by POSTing to the
/cluster/groovy/analyzer/static/config
API, with abody like the following example:
The current config can be retrieved by GETting
/cluster/groovy/analyzer/static/config
. A recommended default configuration can be gottenby GETting
/cluster/groovy/analyzer/static/default
.Testing
Unit tests were added that check the analyzer with legal scripts, illegal scripts, and changing the configuration. Unit tests were also added
to validate the serialization and deserialization for configurations to and from JSON.
A Pinot image with this feature was deployed to a test cluster and verified that when enabled Groovy scripts with illegal operations were blocked
and legal operations were allowed, and, when disabled, all Groovy scripts were allowed. In addition, and independent engineer went through
and tested the analyzer against a set of malicious scripts they had compiled.