You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Issue: While using registerFunctions to register a function in Presto, it requires a format following catalog.schema.function_name. While this makes sense, it begs the following question when implementing a dynamically loading library in Presto Cpp:
Should we gracefully create a user experience that if a user were to not follow the guidelines in this instance, it does not crash the worker?
Current Behavior
Background: When loading functions (types, or anything else for that matter) dynamically, the DynamicLibraryLoader loads the symbols without knowing what is being loaded in the first place. While this is known behavior for the dlopen library, it can lead to abrupt terminations of the worker. While the burden of responsibility does fall on the user to register functions properly, it can be confusing especially when there seems to be no straightforward way for me to declare a default namespace for these registrations. Moreover, another source of confusion can arise from Velox's function registrations which does not require namespaces at this time.
PR for dynamically loading library in Presto CPP [https://github.com//pull/24330]
as an example, if I attempt to register this function dynamically, my worker crashes and leads to the following error:
terminate called after throwing an instance of 'facebook::velox::VeloxUserError'
what(): Exception: VeloxUserError
Error Source: USER
Error Code: INVALID_ARGUMENT
Reason: Prefix missing for function dynamic_3
Retriable: False
Expression: parts.size() == 3
Function: getFunctionNameParts
Possible Solution
after discussing with @aditi-pandit and @czentgr, there seem to be four potential ways to handle this. Which way, if any would be the best way to move forward?
Create a wrapper function registry in Presto following roughly this format:
This function can perform some checks and then call the velox function registry with the proper Presto function naming.
Create a registerNamespace<> function where each user will be required to register a namespace for every dynamic library they wish to load one time. After that, all function registries will follow the namespace name that was registered for this file.
for the above solution, there is a caveat that the user will have to separate any functions they want to add to a different namespace onto a separate file.
Create a config for a namespace value. this may be unflexible and would lead to having the user restart the server each time they want to make a change to the namespace or use multiple configs somehow.
REST API calls to dynamically load namespaces. This was we can deploy without having to restart the presto server. Not sure how we will be able to connect which namespace correspond to which dynamic function registrations.
This error seems to be coming from a check inside FunctionMetadata.cpp::getFunctionNameParts which gets invoked inside FunctionMetadata.cpp::getFunctionsMetadata(). This function is invoked when a rest call of /v1/functions is made likely by the coordinator.
The catalog.schema prefix is a Presto artifact. How are the Java UDFs registered?
Why does Velox require a prefix for the function? I thought functions in Velox had no concept of namespaces.
Java UDFs may be registered under the default namespace, or with a custom namespace. In C++, I would expect we need the ability to declare a function under a particular namespace, with this namespace reflecting upward to the execution engine through the sidecar process.
@tdcmeehan apologies for the disconnect, i agree that velox does not reqire a prefix to the function name!! I was just trying to mention that as a way to mention the different requirements for velox and presto which IMO means we deal with it differently between the two libraries. this requirement comes from the presto side from FunctionMetadata.cpp
Moreover, another source of confusion can arise from Velox's function registrations which does not require namespaces at this time.
the prefix is checked in FunctionMetadata.cpp in the function getFunctionNameParts
Your Environment
Expected Behavior
Issue: While using registerFunctions to register a function in Presto, it requires a format following catalog.schema.function_name. While this makes sense, it begs the following question when implementing a dynamically loading library in Presto Cpp:
Should we gracefully create a user experience that if a user were to not follow the guidelines in this instance, it does not crash the worker?
Current Behavior
Background: When loading functions (types, or anything else for that matter) dynamically, the DynamicLibraryLoader loads the symbols without knowing what is being loaded in the first place. While this is known behavior for the dlopen library, it can lead to abrupt terminations of the worker. While the burden of responsibility does fall on the user to register functions properly, it can be confusing especially when there seems to be no straightforward way for me to declare a default namespace for these registrations. Moreover, another source of confusion can arise from Velox's function registrations which does not require namespaces at this time.
PR for dynamically loading library in Presto CPP [https://github.com//pull/24330]
as an example, if I attempt to register this function dynamically, my worker crashes and leads to the following error:
Possible Solution
after discussing with @aditi-pandit and @czentgr, there seem to be four potential ways to handle this. Which way, if any would be the best way to move forward?
This function can perform some checks and then call the velox function registry with the proper Presto function naming.
for the above solution, there is a caveat that the user will have to separate any functions they want to add to a different namespace onto a separate file.
Create a config for a namespace value. this may be unflexible and would lead to having the user restart the server each time they want to make a change to the namespace or use multiple configs somehow.
REST API calls to dynamically load namespaces. This was we can deploy without having to restart the presto server. Not sure how we will be able to connect which namespace correspond to which dynamic function registrations.
Steps to Reproduce
The text was updated successfully, but these errors were encountered: