Rewrite Expression class? #527
Comments
Thanks for getting this discussion started. For me it would be important to understand the requirements better & I try to do this by going over the code / writing tests. Some random thoughts and questions:
One problem with moving away from a string expression is that the expression becomes harder to save. I think it is important to save the expression in some way: the parameters alone won't get you anywhere if the expression is lost. We could handle that by still setting a name and asking the user to put all relevant information there, then saving the name as an attribute alongside the parameters.
I like that idea, then we could get rid of the
That is possible for the structure I gave above, I'm just not sure how the optimization would handle that...

I see the appeal of this solution, but I see a major problem with it. In summary, it clashes with the whole objective of the Expression class, which was to finally be capable of handling all distributions and all expressions on parameters, to have full flexibility.

With this alternative, users will have to define their functions and pass them as arguments. We have to make sure that the same functions are used for training AND emulation, so we would have to pass the functions to the users, and so maintain a script with all potential functions. That is precisely what I wanted to avoid: being constrained to predefined functions. In my experience, there will always be new ideas and new expressions, so every single time we would have to add new functions to that script. I sincerely fear that it will be messy to have them all.

So we have to choose: do we pass the functions directly and lose the full flexibility, or do we stick to strings, with the fix on the parsing? My personal opinion is in favour of full flexibility, because of how much time I've spent in this code and because of future developments and applications. But again, I see the appeal of your solution.

One extra point: the object returned by the alternative class is the

On the other points raised by @mathause:

Of course, it requires some understanding of the string as input, and the parsing fails with extra commas (but that is potentially an easy fix). It also requires
Thanks for your reply - it's important to understand the requirements and design decisions.
Yes I think the usage is fine - but I still don't like it - in principle (sorry) and because it makes the code more difficult to understand (and is also a bit slower). You do a parsing step which might avoid arbitrary code execution (but that is a difficult problem to solve in general).
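To make the trade-off around a parsing step concrete, here is a minimal sketch of checking an expression string against a whitelist before evaluating it. It is not the parser used in this project: the function `safe_eval`, the whitelist contents, and the variable names `c1`, `c2`, `T` are all assumptions made for illustration. As noted above, this kind of check narrows what a string can do but does not solve the general problem of safe evaluation.

```python
import ast
import math

# Hypothetical whitelist of functions an expression string may call.
ALLOWED_FUNCTIONS = {"exp": math.exp, "log": math.log, "sqrt": math.sqrt}

# Only plain arithmetic, names, constants and direct calls are accepted.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Call, ast.Name, ast.Load,
    ast.Constant, ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub,
)


def safe_eval(expr, variables):
    """Evaluate an arithmetic expression after inspecting its syntax tree."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        # Reject attribute access, subscripts, lambdas, comprehensions, ...
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        # Every name must be a whitelisted function or a supplied variable.
        if isinstance(node, ast.Name):
            if node.id not in ALLOWED_FUNCTIONS and node.id not in variables:
                raise ValueError(f"unknown name: {node.id}")
    # Empty builtins so that nothing beyond the whitelist is reachable by name.
    namespace = {"__builtins__": {}, **ALLOWED_FUNCTIONS, **variables}
    return eval(compile(tree, "<expr>", "eval"), namespace)


value = safe_eval("c1 + c2 * exp(T)", {"c1": 1.0, "c2": 2.0, "T": 0.0})
print(value)
```

The extra tree walk is part of the cost mentioned above: the expression is parsed and checked before it can be evaluated, and the reader of the code has to understand both the whitelist and the string syntax.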
If we need to know that we e.g. call
See my comment above.
This is something we could do for the

I understand the principle of writing a subclass for each distribution... But this is precisely what I want to avoid. Yes, it would take quite some time indeed, and it will slow down future developments. That is why the object returned by

Regarding the covariates, I also understand your point... but there are limits with my version (constrained to math/numpy only), and with your version (constrained to prepared functions). And you want to avoid developing our own syntax... but using prepared functions is also developing our own syntax. Thus, same problem.

As I said, I have already implemented bounds on parameters. To summarize:

Can we not find a middle ground? I understand that you dislike the eval/exec, but I feel like we would lose a lot of flexibility and time, now and in the future.
@yquilcaille we are now working through your code, getting familiar with it, and understanding what it needs to be able to do. To answer one of your questions: yes, it is required to understand the structure of the expression. We need to know which coefficient belongs to which param (e.g. that
mesmer/mesmer/mesmer_x/train_l_distrib_mesmerx.py Lines 910 to 915 in 2b5da4a
mesmer/mesmer/mesmer_x/train_l_distrib_mesmerx.py Line 1093 in 2b5da4a
(as a side note
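As an illustration of what "knowing which coefficient belongs to which param" can mean for a string expression, the following sketch extracts that mapping with Python's `ast` module. The expression syntax `dist(param=..., ...)`, the coefficient naming convention `c1`, `c2`, ... and the helper `coefficients_per_parameter` are assumptions for this example and are not taken from the referenced file.

```python
import ast
import re

# Assumed naming convention for coefficients: c1, c2, c3, ...
COEFF_PATTERN = re.compile(r"c\d+$")


def coefficients_per_parameter(expr):
    """Map each distribution parameter to the coefficients its sub-expression uses."""
    call = ast.parse(expr, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected something like 'dist(param=..., ...)'")
    mapping = {}
    for keyword in call.keywords:
        # Collect every coefficient name appearing in this parameter's sub-expression.
        names = {
            node.id
            for node in ast.walk(keyword.value)
            if isinstance(node, ast.Name) and COEFF_PATTERN.match(node.id)
        }
        mapping[keyword.arg] = sorted(names)
    return mapping


# Hypothetical expression: a normal distribution whose location depends on a covariate T.
mapping = coefficients_per_parameter("norm(loc=c1 + c2 * T, scale=c3)")
print(mapping)  # {'loc': ['c1', 'c2'], 'scale': ['c3']}
```

With a mapping like this, an optimizer could, for example, fit the coefficients of one parameter while holding those of another fixed.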
I was wondering why we do not actually pass functions as arguments instead of strings to the Expression class. I was thinking of something like this:
And then feeding the functions to the optimization routines like we would after parsing? This is not incredibly thought through yet but it might be a way around the parsing.
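A minimal sketch of what passing functions rather than strings could look like. Every name here (`FunctionExpression`, `loc`, `scale`, the coefficients `c1` to `c3`, the covariate `T`) is hypothetical and not taken from the MESMER code base, and the objective is written out by hand for a normal distribution only to keep the example dependency-free.

```python
import math


def loc(coeffs, covariates):
    # Location depends linearly on a covariate (hypothetical name "T").
    return coeffs["c1"] + coeffs["c2"] * covariates["T"]


def scale(coeffs, covariates):
    # Constant scale.
    return coeffs["c3"]


class FunctionExpression:
    """Hypothetical stand-in for an Expression built from callables."""

    def __init__(self, name, parameters):
        self.name = name              # free-text label, e.g. to store next to fitted values
        self.parameters = parameters  # maps parameter name -> callable

    def evaluate(self, coeffs, covariates):
        """Return the distribution parameters for given coefficients and covariates."""
        return {p: f(coeffs, covariates) for p, f in self.parameters.items()}


def normal_neg_log_likelihood(expression, coeffs, covariates, y):
    """Objective that an optimization routine could minimize over the coefficients."""
    params = expression.evaluate(coeffs, covariates)
    z = (y - params["loc"]) / params["scale"]
    return 0.5 * z**2 + math.log(params["scale"]) + 0.5 * math.log(2 * math.pi)


expr = FunctionExpression("normal, linear loc", {"loc": loc, "scale": scale})
coeffs = {"c1": 1.0, "c2": 2.0, "c3": 0.5}
params = expr.evaluate(coeffs, {"T": 3.0})
print(params)  # {'loc': 7.0, 'scale': 0.5}
print(normal_neg_log_likelihood(expr, coeffs, {"T": 3.0}, y=7.0))
```

In this form no parsing is needed, but the callables are opaque: which coefficients each parameter uses would have to be declared separately, and the same function definitions must be available at both training and emulation time.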