Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal API to resolve a Callable to CallableCustom*, optimizing repeat call performance. #11787

Open
Ivorforce opened this issue Feb 16, 2025 · 0 comments

Comments

@Ivorforce
Copy link

Ivorforce commented Feb 16, 2025

Describe the project you are working on

Godot performance.

Describe the problem or limitation you are having in your project

For a PR attempting to optimize Array.filter, I profiled repeat callp performance.

In particular, it was interesting to see that manual GDScript array filtering was substantially faster than Array.filter. It would be possible to improve Array.filter to close the gap.

Digging into the profiler, considerable time was wasted on repeat callp preparation and validation:

Image
1.17 Gc  	16,9 %	Callable::callp(Variant const**, int, Variant&, Callable::CallError&) const
799.40 Mc	11,5 %	GDScriptLambdaCallable::is_valid() const
136.82 Mc	2,0 %	Callable::is_valid() const
62.54 Mc	0,9 %	Callable::is_null() const
61.47 Mc	0,9 %	Callable::is_custom() const
19.57 Mc	0,3 %	Callable::get_custom() const

This could be optimized.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

It would be possible resolve Callable to a reference-owned CallableCustom.

A function could be added like CallableCustom *Callable::to_custom(CallError &r_call_error) const. It would return a CallableCustom, either new (refcount 1) or existing (refcounted incremented). The caller would hold on to it until finished with all calls. For repeat calls, this should eliminate the overhead for repeat calls, changing from O(n) to O(1).

For single calls, this would be the same speed or slower than callp. Depending on the real performance difference, the callp interface could perhaps be eliminated to compensate for the increase in complexity.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

The function could be implemented easily like:

CallableCustom *Callable::to_custom(CallError &r_call_error) const {
	if (is_null()) {
		r_call_error.error = CallError::CALL_ERROR_INSTANCE_IS_NULL;
		r_call_error.argument = 0;
		r_call_error.expected = 0;
	} else if (is_custom()) {
		if (!is_valid() || !custom->ref_count.ref()) {
			r_call_error.error = CallError::CALL_ERROR_INSTANCE_IS_NULL;
			r_call_error.argument = 0;
			r_call_error.expected = 0;
			return nullptr;
		}

		return custom;
	} else {
		Object *obj = ObjectDB::get_instance(ObjectID(object));
#ifdef DEBUG_ENABLED
		if (!obj) {
			r_call_error.error = CallError::CALL_ERROR_INSTANCE_IS_NULL;
			r_call_error.argument = 0;
			r_call_error.expected = 0;
			return nullptr;
		}
#endif
		return obj->method_to_custom_callable(method, r_call_error);
	}
}

Object would require a similar implementation, to resolve a StringName to a CallableCustom that is pre-injected with the relevant Object*:

  • For RefCounted, it would increase the reference count by 1 and assume the Object persists in memory, gaining similar speeds as other CustomCallable implementations.
  • For non-RefCounted, the CustomCallable would need to verify the ObjectID on each call. This would be slow (similar speed as currently) right now, but may be acceleratable in the future.

GDScript could also make use of this API: If it is known within a function frame that a Callable isn't re-assigned, it could resolve to CustomCallable on first call, and re-use the returned function pointer each repeat call. This should give it a similar boost as other callers.

All in all, the proposed change should speed up repeat callp calls by an amortized 17%.

Future projects could attempt to dig in further and re-use a call frame of GDScriptFunction::call for repeat calls, possibly improving performance further.

If this enhancement will not be used often, can it be worked around with a few lines of script?

No.

Is there a reason why this should be core and not an add-on in the asset library?

It's core.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants