Rust: More metrics for tracking taint. #18501

Merged
merged 12 commits into from
Jan 17, 2025
9 changes: 7 additions & 2 deletions rust/ql/integration-tests/hello-project/summary.expected
@@ -14,6 +14,11 @@
| Macro calls - resolved | 2 |
| Macro calls - total | 2 |
| Macro calls - unresolved | 0 |
| Sensitive data | 0 |
| Taint edges - number of edges | 2 |
| Taint reach - nodes tainted | 0 |
| Taint reach - per million nodes | 0 |
| Taint sinks - cryptographic operations | 0 |
| Taint sinks - query sinks | 0 |
| Taint sources - active | 0 |
| Taint sources - total | 0 |
| Taint sources - disabled | 0 |
| Taint sources - sensitive data | 0 |
59 changes: 59 additions & 0 deletions rust/ql/src/queries/summary/CryptographicOperations.ql
@@ -0,0 +1,59 @@
/**
* @name Cryptographic Operations
* @description List all cryptographic operations found in the database.
* @kind problem
* @problem.severity info
* @id rust/summary/cryptographic-operations
* @tags summary
*/

import rust
import codeql.rust.Concepts
import codeql.rust.security.WeakSensitiveDataHashingExtensions

/**
* Gets the type of cryptographic algorithm `alg`.
*/
string getAlgorithmType(Cryptography::CryptographicAlgorithm alg) {
alg instanceof Cryptography::EncryptionAlgorithm and result = "EncryptionAlgorithm"
or
alg instanceof Cryptography::HashingAlgorithm and result = "HashingAlgorithm"
or
alg instanceof Cryptography::PasswordHashingAlgorithm and result = "PasswordHashingAlgorithm"
}

/**
* Gets a feature of cryptographic algorithm `alg`.
*/
string getAlgorithmFeature(Cryptography::CryptographicAlgorithm alg) {
alg.isWeak() and result = "WEAK"
}

/**
* Gets a description of cryptographic algorithm `alg`.
*/
string describeAlgorithm(Cryptography::CryptographicAlgorithm alg) {
result =
getAlgorithmType(alg) + " " + alg.getName() + " " + concat(getAlgorithmFeature(alg), ", ")
}

/**
* Gets a feature of cryptographic operation `op`.
*/
string getOperationFeature(Cryptography::CryptographicOperation op) {
result = "inputs:" + strictcount(op.getAnInput()).toString() or
result = "blockmodes:" + strictcount(op.getBlockMode()).toString()
}

/**
* Gets a description of cryptographic operation `op`.
*/
string describeOperation(Cryptography::CryptographicOperation op) {
result = describeAlgorithm(op.getAlgorithm()) + " " + concat(getOperationFeature(op), ", ")
or
not exists(op.getAlgorithm()) and
result = "(unknown) " + concat(getOperationFeature(op), ", ")
}

from Cryptography::CryptographicOperation operation
select operation, describeOperation(operation)
17 changes: 17 additions & 0 deletions rust/ql/src/queries/summary/QuerySinkCounts.ql
@@ -0,0 +1,17 @@
/**
* @name Query Sink Counts
* @description Lists the number of query sinks of each type found in the database. Query sinks are
* flow sinks that are used as possible locations for query results. Cryptographic
* operations are excluded.
* @kind metric
* @id rust/summary/query-sink-counts
* @tags summary
*/

import rust
import codeql.rust.dataflow.DataFlow
import Stats

from string kind, int num
where num = strictcount(DataFlow::Node n | getAQuerySinkKind(n) = kind)
select kind, num
17 changes: 17 additions & 0 deletions rust/ql/src/queries/summary/QuerySinks.ql
@@ -0,0 +1,17 @@
/**
* @name Query Sinks
* @description Lists query sinks that are found in the database. Query sinks are flow sinks that
* are used as possible locations for query results. Cryptographic operations are
* excluded (see `rust/summary/cryptographic-operations` instead).
* @kind problem
* @problem.severity info
* @id rust/summary/query-sinks
* @tags summary
*/

import rust
import codeql.rust.dataflow.DataFlow
import Stats

from DataFlow::Node n
select n, "Sink for " + strictconcat(getAQuerySinkKind(n), ", ") + "."
24 changes: 24 additions & 0 deletions rust/ql/src/queries/summary/Stats.qll
@@ -3,11 +3,13 @@
*/

import rust
private import codeql.rust.dataflow.DataFlow
private import codeql.rust.dataflow.internal.DataFlowImpl
private import codeql.rust.dataflow.internal.TaintTrackingImpl
private import codeql.rust.AstConsistency as AstConsistency
private import codeql.rust.controlflow.internal.CfgConsistency as CfgConsistency
private import codeql.rust.dataflow.internal.DataFlowConsistency as DataFlowConsistency
private import codeql.rust.security.SqlInjectionExtensions

/**
* Gets a count of the total number of lines of code in the database.
@@ -41,3 +43,25 @@ int getTotalCfgInconsistencies() {
int getTotalDataFlowInconsistencies() {
result = sum(string type | | DataFlowConsistency::getInconsistencyCounts(type))
}

/**
* Gets the total number of taint edges in the database.
*/
int getTaintEdgesCount() {
result =
count(DataFlow::Node a, DataFlow::Node b |
RustTaintTracking::defaultAdditionalTaintStep(a, b, _)
)
}

/**
* Gets a kind of query for which `n` is a sink (if any).
*/
string getAQuerySinkKind(DataFlow::Node n) {
(n instanceof SqlInjection::Sink and result = "SqlInjection")
}

/**
* Gets a count of the total number of query sinks in the database.
*/
int getQuerySinksCount() { result = count(DataFlow::Node n | exists(getAQuerySinkKind(n))) }
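
The relationship between the per-kind counts (`QuerySinkCounts.ql`) and the total from `getQuerySinksCount()` can be modelled outside CodeQL. The following Python sketch is only an illustration, not the QL implementation: the node names and the `OtherKind` sink kind are hypothetical (this PR only defines `SqlInjection`). It shows that a node counts once towards the total even if it is a sink for several kinds, so the per-kind counts may sum to more than the total.

```python
from collections import Counter

# Hypothetical model: each data flow node maps to the set of query sink
# kinds it belongs to (the role played by getAQuerySinkKind in the PR).
sink_kinds = {
    "n1": {"SqlInjection"},
    "n2": {"SqlInjection", "OtherKind"},  # "OtherKind" is made up for illustration
    "n3": set(),                          # not a sink for any query
}

# Per-kind counts, analogous to strictcount(n | getAQuerySinkKind(n) = kind).
# Kinds with no nodes simply do not appear, as with strictcount.
counts = Counter(kind for kinds in sink_kinds.values() for kind in kinds)

# Total sinks, analogous to count(n | exists(getAQuerySinkKind(n))).
total_sinks = sum(1 for kinds in sink_kinds.values() if kinds)

print(dict(counts), total_sinks)
```

Here `n2` contributes to both kinds but only once to `total_sinks`.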
20 changes: 17 additions & 3 deletions rust/ql/src/queries/summary/SummaryStats.ql
@@ -9,8 +9,10 @@
import rust
import codeql.rust.Concepts
import codeql.rust.security.SensitiveData
import codeql.rust.security.WeakSensitiveDataHashingExtensions
import codeql.rust.Diagnostics
import Stats
import TaintReach

from string key, int value
where
@@ -54,9 +56,21 @@ where
or
key = "Macro calls - unresolved" and value = count(MacroCall mc | not mc.hasExpanded())
or
key = "Taint sources - total" and value = count(ThreatModelSource s)
or
key = "Taint sources - active" and value = count(ActiveThreatModelSource s)
or
key = "Sensitive data" and value = count(SensitiveData d)
key = "Taint sources - disabled" and
value = count(ThreatModelSource s | not s instanceof ActiveThreatModelSource)
or
key = "Taint sources - sensitive data" and value = count(SensitiveData d)
or
key = "Taint edges - number of edges" and value = getTaintEdgesCount()
or
key = "Taint reach - nodes tainted" and value = getTaintedNodesCount()
or
key = "Taint reach - per million nodes" and value = getTaintReach().floor()
or
key = "Taint sinks - query sinks" and value = getQuerySinksCount()
or
key = "Taint sinks - cryptographic operations" and
value = count(Cryptography::CryptographicOperation o)
select key, value order by key
31 changes: 31 additions & 0 deletions rust/ql/src/queries/summary/TaintReach.qll
@@ -0,0 +1,31 @@
/**
* Taint reach computation. Taint reach is the proportion of all dataflow nodes that can be reached
* via taint flow from any active threat model source. It's usually expressed per million nodes.
*/

import rust
private import codeql.rust.Concepts
private import codeql.rust.dataflow.DataFlow
private import codeql.rust.dataflow.TaintTracking

/**
* A taint configuration for taint reach (flow to any node from any modeled source).
*/
private module TaintReachConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node node) { node instanceof ActiveThreatModelSource }

predicate isSink(DataFlow::Node node) { any() }
Contributor

This certainly looks like something that will not perform very well...

Contributor Author

Yes, though:

  • We don't compute a path graph for this config.
  • We never had any real issues with the same code in Swift.
  • Even on our largest database (windows-rs), with a slightly warmed-up cache, quick-evaluating getTaintReach() takes 2 seconds. With vast numbers of fake sources etc. added to increase reach beyond what I think is ever plausible, it takes 24 seconds.

I'm guessing execution time is roughly linear in the number of nodes reached. I'd be interested to hear your thoughts, concerns and suggestions - even if this means changing what we measure.

Contributor

OK, let's leave it as-is for now then.

}

private module TaintReachFlow = TaintTracking::Global<TaintReachConfig>;

/**
* Gets the total number of data flow nodes that taint reaches (from any source).
*/
int getTaintedNodesCount() { result = count(DataFlow::Node n | TaintReachFlow::flowTo(n)) }

/**
* Gets the proportion of data flow nodes that taint reaches (from any source),
* expressed as a count per million nodes.
*/
float getTaintReach() { result = (getTaintedNodesCount() * 1000000.0) / count(DataFlow::Node n) }
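
As a rough mental model of what `getTaintedNodesCount()` and `getTaintReach()` measure, taint reach can be thought of as graph reachability from the sources, scaled to a count per million nodes. The Python sketch below is an assumption-laden illustration, not how the CodeQL data flow library works internally: the graph, node names and helper functions are hypothetical, and it assumes a source counts as reached by its own taint.

```python
from collections import deque


def tainted_nodes(edges, sources):
    """Return the set of nodes reachable from any source (sources included)."""
    successors = {}
    for a, b in edges:
        successors.setdefault(a, []).append(b)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt not in seen:  # each node is enqueued at most once
                seen.add(nxt)
                queue.append(nxt)
    return seen


def taint_reach_per_million(tainted_count, total_nodes):
    """Mirror of the formula in getTaintReach(): tainted * 1000000.0 / total."""
    return tainted_count * 1_000_000.0 / total_nodes


# Hypothetical graph: taint flows a -> b -> c; d -> e is never reached.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
sources = ["a"]

reached = tainted_nodes(edges, sources)
reach = taint_reach_per_million(len(reached), len(nodes))
print(len(reached), int(reach))
```

In this toy model each node is visited at most once and only edges leaving reached nodes are followed, which is the intuition behind the guess in the review thread that cost grows roughly with the number of nodes reached.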
9 changes: 7 additions & 2 deletions rust/ql/test/query-tests/diagnostics/SummaryStats.expected
@@ -14,6 +14,11 @@
| Macro calls - resolved | 8 |
| Macro calls - total | 9 |
| Macro calls - unresolved | 1 |
| Sensitive data | 0 |
| Taint edges - number of edges | 2 |
| Taint reach - nodes tainted | 0 |
| Taint reach - per million nodes | 0 |
| Taint sinks - cryptographic operations | 0 |
| Taint sinks - query sinks | 0 |
| Taint sources - active | 0 |
| Taint sources - total | 0 |
| Taint sources - disabled | 0 |
| Taint sources - sensitive data | 0 |