[SPARK-51008][SQL] Add ResultStage for AQE #49715

liuzqt · 2025-01-28T19:50:59Z

What changes were proposed in this pull request?

Added ResultQueryStageExec for AQE

What the query plan looks like in the explain string:

AdaptiveSparkPlan isFinalPlan=true
+- == Final Plan ==
   ResultQueryStage 2 ------> newly added
   +- *(5) Project [id#26L]
      +- *(5) SortMergeJoin [id#26L], [id#27L], Inner
         :- *(3) Sort [id#26L ASC NULLS FIRST], false, 0
         :  +- AQEShuffleRead coalesced
         :     +- ShuffleQueryStage 0
         :        +- Exchange hashpartitioning(id#26L, 200), ENSURE_REQUIREMENTS, [plan_id=247]
         :           +- *(1) Range (0, 25600, step=1, splits=10)
         +- *(4) Sort [id#27L ASC NULLS FIRST], false, 0
            +- AQEShuffleRead coalesced
               +- ShuffleQueryStage 1
                  +- Exchange hashpartitioning(id#27L, 200), ENSURE_REQUIREMENTS, [plan_id=257]
                     +- *(2) Ran...

What the query plan looks like in the Spark UI:

Why are the changes needed?

Currently, the AQE framework is not fully self-contained because not all plan segments can be put into a query stage: the final "stage" is basically executed as a non-AQE plan. This PR adds a result query stage for AQE to unify the framework. With this change, we can build more query-stage-level features; one use case is described in #44013 (comment).

Does this PR introduce any user-facing change?

NO

How was this patch tested?

New unit tests.

Also, existing tests that are impacted by this change are updated to keep their original test semantics.

Was this patch authored or co-authored using generative AI tooling?

NO

liuzqt · 2025-02-04T00:12:52Z

@cloud-fan

cloud-fan · 2025-02-04T05:08:49Z

cc @ulysses-you

ulysses-you · 2025-02-05T01:56:22Z

sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala

@@ -588,7 +639,7 @@ case class AdaptiveSparkPlanExec(
      if (plan.children.isEmpty) {
        CreateStageResult(newPlan = plan, allChildStagesMaterialized = true, newStages = Seq.empty)
      } else {
-        val results = plan.children.map(createQueryStages)
+        val results = plan.children.map(createQueryStagesInternal)
        CreateStageResult(
          newPlan = plan.withNewChildren(results.map(_.newPlan)),
          allChildStagesMaterialized = results.forall(_.allChildStagesMaterialized),


It seems the new code is a bit hard to read. Not sure if there is some development context behind it.

Can we create the result query stage here? If the plan is the root query and allChildStagesMaterialized is true, then we wrap it in a ResultQueryStage. Since that is not a materialized stage, AQE will materialize it.

sounds reasonable to me, cc @liuzqt

ulysses-you

So this PR is just one of the stage-level feature PRs?

cloud-fan · 2025-02-05T07:35:37Z

@ulysses-you yes, after this PR, we can implement the proposed idea in #44013 (comment) and keep contexts in the AQE query stage.

cloud-fan · 2025-02-06T02:39:33Z

sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala

@@ -579,23 +592,52 @@ case class AdaptiveSparkPlanExec(
        allChildStagesMaterialized = false,
        newStages = Seq(newStage))

-    case q: QueryStageExec =>
+    case q: QueryStageExec if q ne currentPhysicalPlan =>


what does this condition protect?

We can have plan like this:

ShuffleQueryStage 0
+- Exchange hashpartitioning(key#17, 5), REPARTITION_BY_COL, [plan_id=89]
   +- *(1) SerializeFromObject [invoke(knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key()) AS key#17, static_invoke(UTF8String.fromString(invoke(knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).value()))) AS value#18]
      +- Scan[obj#14]

where the root plan is a ShuffleQueryStageExec and we have to create a ResultQueryStage on top of it.
==>

ResultQueryStage 1
+- AQEShuffleRead coalesced
   +- ShuffleQueryStage 0
      +- Exchange hashpartitioning(key#17, 5), REPARTITION_BY_COL, [plan_id=89]
         +- *(1) SerializeFromObject [invoke(knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key()) AS key#17, static_invoke(UTF8String.fromString(invoke(knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).value()))) AS value#18]
            +- S...

This refactor is equivalent to my previous implementation, which used createQueryStagesInternal and created the result stage in the outer createQueryStages.

ah I see, let's add comments to explain it

// We can skip creating a new query stage if the given plan is already a query stage.
// Note: if this query stage is the root node, we still need to create a result query stage.

actually, the caller always invokes createQueryStages with currentPhysicalPlan, so we know when to deal with the result stage. Now I feel the previous code is clearer. Maybe just name it better? e.g. createQueryStages and createNonResultQueryStages.

I just noticed an even more complicated pattern from a broken test: we create a new non-result query stage as the root node, and that query stage is immediately materialized due to stage reuse, so we have to create the result stage right after. The current implementation cannot handle this case, and fixing it might be hacky...

So yes, I think separating result and non-result query stage creation is the better option. I'll rename it and add some comments to clarify.
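To make the split discussed above concrete, here is a minimal sketch on a toy plan model. Everything in it is an assumption for illustration: `Scan`, `Project`, `Exchange`, `ShuffleStage`, `ResultStage` and the two-field `CreateStageResult` are simplified stand-ins, not the actual classes in AdaptiveSparkPlanExec, and the real logic handles many more cases.

```scala
// Toy model only: hypothetical stand-ins for Spark's plan and stage classes.
object StageCreationSketch {
  sealed trait Plan
  case class Scan(name: String) extends Plan
  case class Project(child: Plan) extends Plan
  case class Exchange(child: Plan) extends Plan
  case class ShuffleStage(child: Plan, materialized: Boolean) extends Plan
  case class ResultStage(child: Plan) extends Plan

  case class CreateStageResult(newPlan: Plan, allChildStagesMaterialized: Boolean)

  // Handles everything below the result stage; never produces a ResultStage.
  def createNonResultQueryStages(plan: Plan): CreateStageResult = plan match {
    case Exchange(child) =>
      val result = createNonResultQueryStages(child)
      if (result.allChildStagesMaterialized) {
        // All inputs are ready, so the exchange becomes a new, not yet materialized stage.
        CreateStageResult(ShuffleStage(result.newPlan, materialized = false), allChildStagesMaterialized = false)
      } else {
        CreateStageResult(Exchange(result.newPlan), allChildStagesMaterialized = false)
      }
    case stage: ShuffleStage =>
      // An existing stage is reused as-is, even when it is the root node.
      CreateStageResult(stage, stage.materialized)
    case Project(child) =>
      val result = createNonResultQueryStages(child)
      CreateStageResult(Project(result.newPlan), result.allChildStagesMaterialized)
    case other =>
      CreateStageResult(other, allChildStagesMaterialized = true)
  }

  // Entry point, always called with the root plan: only this method may add the result stage.
  def createQueryStages(root: Plan): CreateStageResult = root match {
    case existing: ResultStage =>
      CreateStageResult(existing, allChildStagesMaterialized = false)
    case _ =>
      val result = createNonResultQueryStages(root)
      if (result.allChildStagesMaterialized) {
        CreateStageResult(ResultStage(result.newPlan), allChildStagesMaterialized = false)
      } else {
        result
      }
  }
}
```

In this sketch, a root that is already a materialized `ShuffleStage` (the root-is-a-query-stage and stage-reuse patterns mentioned above) still gets wrapped, because the wrapping decision lives only in the outer method.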

cloud-fan · 2025-02-06T02:44:04Z

sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala

+    assert(plan2.isInstanceOf[ResultQueryStageExec])
+    assert(plan1 ne plan2)
+    assert(plan1.asInstanceOf[ResultQueryStageExec].plan
+      .fastEquals(plan2.asInstanceOf[ResultQueryStageExec].plan))


should they be equal? I think these two result stages should have different handler functions?

Yes, they have different handler functions. But the root plan they wrap should be the same (which is the original AQE root plan).
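As a rough illustration of what the assertions check, the sketch below uses hypothetical stand-ins (`PlanNode`, and a `ResultStage` holding a plan plus a handler). It uses plain `==` on case classes in place of Spark's `fastEquals`, so it only mirrors the shape of the test, not the real API.

```scala
// Toy model only: PlanNode and ResultStage are hypothetical, not Spark classes.
object ResultStageEqualitySketch {
  case class PlanNode(name: String, children: Seq[PlanNode] = Nil)

  // A regular class, so two stages are distinct instances even when they wrap the same plan.
  final class ResultStage[T](val plan: PlanNode, val handler: PlanNode => T)

  val root: PlanNode = PlanNode("Project", Seq(PlanNode("Range")))

  // Two executions of the same query with different result handlers.
  val stage1 = new ResultStage(root, (p: PlanNode) => p.children.size)
  val stage2 = new ResultStage(root, (p: PlanNode) => p.name)
}
```

Here `stage1 ne stage2` holds because each call builds a new stage object, while `stage1.plan == stage2.plan` holds because both wrap the same root plan.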

ulysses-you · 2025-02-06T07:52:46Z

sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala

-            currentPhysicalPlan = newPhysicalPlan
-            currentLogicalPlan = newLogicalPlan
-            stagesToReplace = Seq.empty[QueryStageExec]
+        if (!currentPhysicalPlan.isInstanceOf[ResultQueryStageExec]) {


Why do we need to skip ResultQueryStageExec?

The result stage is already the last step, so there is nothing left to reoptimize.

ulysses-you · 2025-02-06T08:06:58Z

sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala

+   * Run `fun` on finalized physical plan
+   */
+  def withFinalPlanUpdate[T](fun: SparkPlan => T): T = lock.synchronized {
+    _isFinalPlan = false


So when we call df.collect multiple times, we will re-optimize the final stage multiple times, because each call needs to wrap a new ResultQueryStageExec.

In this case we construct ResultQueryStageExec directly and won't re-optimize it: https://github.com/apache/spark/pull/49715/files#diff-ec42cd27662f3f528832c298a60fffa1d341feb04aa1d8c80044b70cbe0ebbfcR536

ulysses-you · 2025-02-06T08:17:58Z

sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala

    }
+    _isFinalPlan = true
+    finalPlanUpdate
+    currentPhysicalPlan.asInstanceOf[ResultQueryStageExec].resultOption.get().get.asInstanceOf[T]


Does it mean we would cache the result data? Is that expected?

Good point, this is actually a side effect of all QueryStageExec...

We can implement a "fetch-once" semantic that only fetches the result once at the end of the AQE loop. But we still cannot prevent users from accessing it multiple times as long as they can access the ResultQueryStageExec node from the query plan.

@cloud-fan what do you think

This is a good catch! This stops the result from being GCed if the users throw away the result of df.collect() but still keep the df around.

Maybe the final outcome of a ResultStage should be Unit which is only used to trigger the final plan calculation. The caller side is still responsible for running the function to get the result.

The proposal above can also simplify things: once a result stage is created, we never need to recreate it as the final plan is finalized. It's similar to the def getFinalPhysicalPlan() style before.
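A minimal sketch of that proposal, under stated assumptions: `AdaptivePlan` and `FinalPlan` are hypothetical toy types, sorting stands in for "finalizing the plan", and only the name `withFinalPlanUpdate` is borrowed from the diff above. The point is that the plan object remembers the finalized plan but never the value returned by `fun`.

```scala
// Toy model only: not Spark's AdaptiveSparkPlanExec.
object UnitResultStageSketch {
  final case class FinalPlan(rows: Seq[Int])

  final class AdaptivePlan(input: Seq[Int]) {
    private var finalPlan: Option[FinalPlan] = None
    // Exposed only so the sketch can show that finalization happens once.
    var finalizeCount: Int = 0

    // Finalizes the plan on first use; later calls reuse it without re-optimizing.
    private def getFinalPlan(): FinalPlan = this.synchronized {
      finalPlan.getOrElse {
        finalizeCount += 1
        val plan = FinalPlan(input.sorted)
        finalPlan = Some(plan)
        plan
      }
    }

    // The caller runs `fun` each time; its result is returned and not retained here.
    def withFinalPlanUpdate[T](fun: FinalPlan => T): T = fun(getFinalPlan())
  }
}
```

With this shape, calling `withFinalPlanUpdate` twice finalizes once, and a discarded result can be garbage collected even if the plan object is kept around.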

draft

2f1669e

github-actions bot added the SQL label Jan 28, 2025

fix

5bfb8e5

liuzqt force-pushed the SPARK-51008 branch from 08df46a to 5bfb8e5 Compare January 31, 2025 19:50

liuzqt added 2 commits January 31, 2025 16:10

fix tests

6e1fd83

fix test

4251762

liuzqt changed the title ~~[SPARK-51008][SQL][WIP] Add ResultStage for AQE~~ [SPARK-51008][SQL] Add ResultStage for AQE Feb 3, 2025

cloud-fan reviewed Feb 4, 2025

cloud-fan mentioned this pull request Feb 4, 2025

[SPARK-46090][SQL] Support plan fragment level SQL configs in AQE #44013

Closed

update

f13d11d

liuzqt requested a review from cloud-fan February 4, 2025 19:34

Merge remote-tracking branch 'upstream/master' into SPARK-51008

14f4ba8

ulysses-you reviewed Feb 5, 2025

cloud-fan reviewed Feb 5, 2025

liuzqt and others added 4 commits February 5, 2025 11:05

Update sql/core/src/main/scala/org/apache/spark/sql/execution/adaptiv…

eb2875b

…e/AdaptiveSparkPlanExec.scala Co-authored-by: Wenchen Fan <[email protected]>

Update sql/core/src/main/scala/org/apache/spark/sql/execution/adaptiv…

1ad4061

…e/AdaptiveSparkPlanExec.scala Co-authored-by: Wenchen Fan <[email protected]>

minor

915bf39

refactor createQueryStages

4248e55

liuzqt force-pushed the SPARK-51008 branch from 82a5871 to 4248e55 Compare February 6, 2025 00:32

cloud-fan reviewed Feb 6, 2025

liuzqt added 2 commits February 5, 2025 22:48

update

cc82864

update

d9017c4

liuzqt requested review from cloud-fan and ulysses-you February 6, 2025 06:58

ulysses-you reviewed Feb 6, 2025

cloud-fan reviewed Feb 6, 2025

liuzqt added 2 commits February 6, 2025 17:35

refactor back

7ba69f4

minor

1f376c3

liuzqt requested a review from cloud-fan February 7, 2025 02:41
