Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lower-level extension object read/write hooks for non-standard property backing stores #17512

Open
dktapps opened this issue Jan 18, 2025 · 10 comments

Comments

@dktapps
Copy link
Contributor

dktapps commented Jan 18, 2025

Description

I maintain a threading extension. One of the features offered by the extension is a ThreadSafe class. Implementing this requires overriding all of the default property read/write handlers to push values into a thread-safe store on write and pull them out on read. (This comes with some limitations of course, but that's not relevant for this issue.)

Unfortunately, because of the implementation of the likes of zend_std_read_property and friends, it's horribly inconvenient to mirror the behaviour of standard objects. In PHP 8.4 in particular, I find myself having to copy-paste large sections of php-src code to deal with property hooks.
This is because lots of logic about visibility, read-only, __get & __set, property hooks etc are baked directly into the std handlers.

This issue is requesting the introduction of more low-level property read/write hooks to allow extension objects to work with a different property backing store than the default, without losing all of the features provided by php-src for normal objects (or needing to copy-paste code).

Alternative solutions

I've spent several days recently working on a way to have the extension wrap around the default property handlers, but because there's so many behaviours (__set, property hooks, etc) I still found myself needing to duplicate various parts of the logic to get the behaviour to match up. There's 250 lines of complex logic in zend_std_write_property alone.

@iluuu1994
Copy link
Member

What are your needs exactly? Why can't you just call the std handlers from your hooks? That's what most extensions do. It's not clear to me in what way those handlers would be different.

@dktapps
Copy link
Contributor Author

dktapps commented Jan 18, 2025

What are your needs exactly? Why can't you just call the std handlers from your hooks? That's what most extensions do. It's not clear to me in what way those handlers would be different.

Because the std handlers only write to local property stores which can't be shared between threads. The extension has to mirror the changes in a thread-safe property store.

I recently attempted to wrap the std handlers to have them do their thing and then mirror changes afterwards, but this turned out to be a giant pain because of stuff like __set() and property hooks.

@iluuu1994
Copy link
Member

Without understanding your problem space, it's very hard to know how we can improve things. You'll either need to explain your problem in much more detail, or propose a solution. What would new handlers look like and how would they make things easier for you, specifically?

@nielsdos
Copy link
Member

As far as I understand, they want to not store the properties inside the properties table, but in a custom backing storage that they can share between different threads in a safe manner (probably by using some more abstractions)

@iluuu1994
Copy link
Member

Ok. I think you'll run into some issues with any approach. Specifically, you'll need to disable all references in properties of such objects, because writing to the references would otherwise circumvent your object handlers (and I'm guessing you cannot just return a pointer to the shared memory from get_property_ptr_ptr because you need to synchronize access to it). Similarly, compound-operations such as ++ and += are usually performed on the zval directly. If you return NULL from get_property_ptr_ptr, they'll instead use read_property and write_property (see ZEND_ASSIGN_OBJ_OP/zend_assign_op_overloaded_property). This is especially problematic because you'll need to lock both of these calls, so that no other thread reads/writes the value in the meantime. I'm not sure what the correct solution is here.

Ofc, we could implement purer read/write handlers that just read/write from the backing store that you can override. But I'm not sure that alone solves your problem.

@dktapps
Copy link
Contributor Author

dktapps commented Jan 18, 2025

Specifically, you'll need to disable all references in properties of such objects, because writing to the references would otherwise circumvent your object handlers

Yep, we already do this.

and I'm guessing you cannot just return a pointer to the shared memory from get_property_ptr_ptr because you need to synchronize access to it

Yup, we also have to nix this handler.

Similarly, compound-operations such as ++ and += are usually performed on the zval directly. If you return NULL from get_property_ptr_ptr, they'll instead use read_property and write_property (see ZEND_ASSIGN_OBJ_OP/zend_assign_op_overloaded_property). This is especially problematic because you'll need to lock both of these calls, so that no other thread reads/writes the value in the meantime. I'm not sure what the correct solution is here.

Yeah, this is an awkward problem that we've kinda just had to accept the existence of. In practice it's never caused issues as far as I know.

Ofc, we could implement purer read/write handlers that just read/write from the backing store that you can override. But I'm not sure that alone solves your problem.

If you want to get into the weeds with this, there are many, many things php-src could do differently to improve the ability to make threading extensions. I don't know if this solution is necessarily the best one for this particular issue.

To be honest I've mostly just accepted that php-src wasn't designed with threading in mind. All threading extensions are basically huge hacks anyway lol. I just got bothered by this particular issue because I've been trying to update to PHP 8.4 recently.

@iluuu1994
Copy link
Member

Ok, but then I suppose that a get_property_slot would not be sufficient, because you can't control the duration of the lock this way, correct? Then you would need new read/write handlers, which also means that you can't let the default handlers handle many things that would require them to know the zval value before doing the actual read/write:

  • If the property slot was explicitly unset(), call __get/__set instead
  • If the property is readonly and already initialized, write_property must fail
  • If the property is uninitialized and the object is lazy, initialize the object

I just looked at a few checks, there are more. So, I still don't know how this would be best solved. Did you have anything specific in mind?

@dktapps
Copy link
Contributor Author

dktapps commented Jan 18, 2025

Ok, but then I suppose that a get_property_slot would not be sufficient, because you can't control the duration of the lock this way, correct?

Right, also need to consider dynamic properties too. Unless the lock is held, the dynamic property table entry might disappear before Zend is done with it.

I guess I was thinking something along the lines of:

  • Extension wraps read_property and write_property to lock, call std_read_property/std_write_property, and unlock
  • Extension also implements low-level get_property_value and set_property_value which would interact with the backing store. These would have to assume they were called with a lock (which would be fine as long as they weren't invoked by any other part of the VM). I can see how that might turn into a problem later on though...

This also doesn't solve the issue of increment/decrement or any other in-place modification operators though. Perhaps a handler that accepts an operation argument and a property name/offset would work for that.

@iluuu1994
Copy link
Member

Extension wraps read_property and write_property to lock, call std_read_property/std_write_property, and unlock

I think this also means that your lock could potentially be held for a really long time, because with magic methods and property hooks they can essentially do anything. That might become a problem too.

which would be fine as long as they weren't invoked by any other part of the VM

I think that should be fine, as long as you disable fast paths in the VM, which you're probably already doing (or you would be missing reads/writes after a warmed property info cache).

@dktapps
Copy link
Contributor Author

dktapps commented Jan 18, 2025

I think this also means that your lock could potentially be held for a really long time, because with magic methods and property hooks they can essentially do anything. That might become a problem too.

True. I hadn't thought much about that.

There doesn't seem to be a great solution to this tbh. Anyone fancy implementing threading into php-src so I can stop spending my brainpower on this? (I'm joking... mostly)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants