-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CALCITE-6145] Function 'TRIM' without parameters throw NullPointerEx… #3869
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should improve the description in reference.md
Do you've something explicit on your mind? I've checked other functions from |
Thanks for reminding me, I didn't notice this. |
@@ -10857,6 +10857,7 @@ void assertSubFunReturns(boolean binary, String s, int start, | |||
f.checkFails("trim('' from 'abcde')", | |||
"Trim error: trim character must be exactly 1 character", | |||
true); | |||
f.checkFails("trim()", "String to trim cannot be null", false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the expected error for this function?
Or should the result be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The creator of the original https://issues.apache.org/jira/browse/CALCITE-6145 suggested to throw an exception with an exact description.
I've checked the behavior of PostgreSQL (v15):
SELECT TRIM() AS trimmed_string;
leads to a syntax error: Query Error: error: syntax error at or near ")"
.
SELECT TRIM("") AS trimmed_string;
leads to an error: Query Error: error: zero-length delimited identifier at or near """"
.
SELECT TRIM(null) AS trimmed_string;
returns null
.
So I think this aligns with the behavior of PostgreSQL throwing an error during query parsing, but I think the message could be a bit better: trim cannot be called without arguments
maybe?
I've added an explicit test for when the value is null
, which indeed returns null
(aligns with PostgreSQL's behavior).
WDYT? @mihaibudiu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the function trim() need to have arguments, then the type checker should have complained about not finding a function with a signature without arguments. I don't expect the parser to complain about it, that is strange. The parser should just accept the function, and the type checker should validate it.
The original message is not good, since clearly there is no 'null' involved. However, if the error comes from the parser, there isn't much you can do about it.
@@ -130,6 +130,9 @@ public SqlTrimFunction(String name, SqlKind kind, | |||
if (operands[1] == null) { | |||
operands[1] = SqlLiteral.createCharString(" ", pos); | |||
} | |||
if (operands[2] == null) { | |||
throw new IllegalArgumentException("String to trim cannot be null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The second argument of trim cannot be null"
It's not clear what "string" you are referring to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's either the first or the third:
You can specify one argument (only the string to trim) or three (where to trim (leading, trailing both sides), what to trim from the string, the string to trim).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The second argument of trim cannot be null"
This is not 100% correct I think as the string to trim from can be null
: trim('a' from cast(null as varchar(1)))
returns null
, which is valid. I think it's more about the absence of the argument, which feels to me like a different thing than a value/field being null
.
Ideally we should correctly fall through to the default case or handle the cases, where the string to trim is absent explicitly.
@@ -130,6 +130,9 @@ public SqlTrimFunction(String name, SqlKind kind, | |||
if (operands[1] == null) { | |||
operands[1] = SqlLiteral.createCharString(" ", pos); | |||
} | |||
if (operands[2] == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something is wrong here. If you reached this point you have one argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've two (flag from which side(s) to trim, string to trim from the 3rd operand) right?
For trim(both 'a' from 'aAa')
you've three operands
:
operand[0]: "BOTH"
operand[1]: "a"
operand[2]: "aAa"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, and the message says that there are "no arguments"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, currently adjusting this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so I've debugged through this:
If you call trim()
the following happens:
- Enters the block for 3 operands (first gets a default value, second and third are
null
, not in the sense of their value, but of their absence) operands[0]
gets set toBOTH
operands[1]
gets so to" "
operands[2]
isnull
(indicating absence of the arg not that its actual value isnull
)
So the actual method call from a user perspective was one without arguments (trim()
).
trim(" a ")
works fine
trim(both " a ")
works fine
trim("a" from)
syntax error (string to trim is absent)
trim(both "a" from)
syntax error (string to trim is absent)
So to handle the no-args case I think it's correct to check for if (operands[2] == null)
in the block with 3
operands. I'll add a comment explaining it, otherwise it's confusing I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the first two arguments have some default values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mihaibudiu @timgrein Perhaps we can refer to SparkSQL's exception information.
SparkSQL:
spark-sql> select trim();
Error in query: Invalid number of arguments for function trim. Expected: one of 1 and 2; Found: 0; line 1 pos 7
Please let us know when this is again ready for review. |
Looking into this again, is there any guidance on where these type of validations/semantic analysis should live? I see there's another method called |
I think your code is in the right place. |
So it this ready for review? |
@mihaibudiu Sorry I was on vacation the last week, I'll give it a read again and let you know, whether it's ready to review! |
Thank you for your patience! I've adjusted the error message to follow what Calcite uses at a lot of other places Edit: grepped through the codebase and it seems like messages starting with |
Quality Gate passedIssues Measures |
Looks like you have some lines that are too long. |
@timgrein this is almost ready, with just a little more work we can merge it |
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions. |
Added an explicit check at parsing time to make sure that the string to trim is not
null