Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add integrand interface #497

Merged
merged 28 commits into from
Sep 21, 2023
Merged

add integrand interface #497

merged 28 commits into from
Sep 21, 2023

Conversation

lxvm
Copy link
Contributor

@lxvm lxvm commented Sep 17, 2023

Hi,

As discussed in SciML/Integrals.jl#169 and SciML/Integrals.jl#175, Integrals.jl needs integrand wrapper types to support features such as inplace evaluation, batched evaluation, and inplace and batched evaluation at the same time. This pr adds 3 structs which support these 3 distinct interfaces and are intended as an eventual replacement for the batch and nout keywords to IntegralProblem. (I don't think that those need to be removed just yet, since we can dispatch on these wrapper types to opt into new behaviors.)

  1. InplaceIntegrand(f!, result::AbstractArray): the caller provides an inplace function f!(y, x, p) and an array result which will store the solution (i.e. it has the same shape as the integrand output array y, but could have a different type due to units of the limits).
  2. BatchIntegrand(f!, y::AbstractVector, x::AbstractVector; max_batch): the caller provides an inplace function f!(y, x, p) of the form y .= f.(x, Ref(p)) and pre-allocated buffers that parallelize the calculation of the integrand over multiple quadrature nodes. For multi-dimensional integrands, the elements of x can be StaticArrays
  3. InplaceBatchIntegrand(f!, result::AbstractArray, y::AbstractArray, x::AbstractVector; max_batch): the caller provides an inplace function f!(y, x, p) that evaluates an array-valued inplace integrand at multiple points simultaneously. The solution is written to result. The buffer y is of size (size(result)..., size(x)...) and should be resize-able in its outer dimension (i.e. an ElasticArray)

I would appreciate any feedback on this pr. In particular, for multidimensional batched integrands it is nice to allow x to be a vector of SVectors instead of a Matrix, which is the current style. For the C libraries that only use Matrix we can use reinterpret(reshape, ...) to bring it into SVector form.

@ChrisRackauckas
Copy link
Member

There's no need for it to be a completely separate system from what already exists. Some uniformity would be good here. Make it IntegralFunction and BatchIntegralFunction, both as AbstractSciMLFunctions following a similar form to what is done in the others. In-place should just be the first argument and can be determined automatically using the machinery that's used in the other SciMLFunctions. That would make this feel much more in place (pun intended).

@lxvm
Copy link
Contributor Author

lxvm commented Sep 19, 2023

I hadn't used the AbstractSciMLFunctions before and I'm glad this is a solved problem. I'll try implementing the interface although most of the interface may be unused.

On the note of uniformity across the ecosystem, is there a SciML integral operator? These would act on differential forms instead of arrays, and perhaps there are applications?

@lxvm
Copy link
Contributor Author

lxvm commented Sep 19, 2023

@ChrisRackauckas I added IntegralFunction and BatchIntegralFunction. I wasn't very comfortable with relying on isinplace because for in-place integrands only the caller knows the array element type to allocate and also the BatchIntegralFunction requires a 3-argument function for both in-place and out-of-place forms. So the interface assumes a function is out-of-place unless a result array is provided. This is all in the documentation. Does it look good to you?

Comment on lines 2279 to 2300
jac = __has_jac(f) ? f.jac : nothing,
paramjac = __has_paramjac(f) ? f.paramjac : nothing,
analytic = __has_analytic(f) ? f.analytic : nothing,
syms = __has_syms(f) ? f.syms : nothing,
jac_prototype = __has_jac_prototype(f) ? f.jac_prototype : nothing,
sparsity = __has_sparsity(f) ? f.sparsity : nothing,
colorvec = __has_colorvec(f) ? f.colorvec : nothing,
observed = __has_observed(f) ? f.observed : nothing)
```

Note that only `f` is required, and in the case of inplace integrands a mutable container
`I` to store the result of the integral.

The remaining functions are optional and mainly used for accelerating the usage of `f`:
- `jac`: unused
- `paramjac`: unused
- `analytic`: unused
- `syms`: unused
- `jac_prototype`: unused
- `sparsity`: unused
- `colorvec`: unused
- `observed`: unused
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those arguments can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks

Comment on lines 2308 to 2309
If `I` is present, `f` is interpreted as in-place, and otherwise `f` is assumed to be
out-of-place.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? This should just be auto-determined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, when passed it's iip

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I or integral_prototype is presumably an array that could have any element type and dimension specific to an inplace f, so the caller is basically declaring the output of f by passing the container. The C libraries assume everything is Vector{Float64}, with length nout passed to IntegralProblem, so that could be a fallback, but I would prefer it as a type assertion in the C wrappers since the user is already taking their time to write something efficient and in-place.

@ChrisRackauckas
Copy link
Member

I took a pass through it:

  1. We will need some things like observed in the future, but for now leave it off. We generally add things as the interface is built out, and the MTK side of Integrals.jl is just missing for now so we can ignore it until it's there.
  2. I'm a little weary that we cannot do proper dispatch checking on the batch version, but we at least we can test too many/too few arguments for appropriate errors.
  3. The extra values were renamed prototype values to match other parts of the interfaces. What's expected then is that init will make a copy so that the solves are thread-safe. Something we need to keep in mind with the implementation (I don't think we've done much thread safety in the testing since there's also C programs involved).

Let me know if there's anything here you'd disagree with.

@codecov
Copy link

codecov bot commented Sep 19, 2023

Codecov Report

Merging #497 (a6fd63a) into master (434e752) will increase coverage by 39.85%.
Report is 29 commits behind head on master.
The diff coverage is 84.74%.

@@             Coverage Diff             @@
##           master     #497       +/-   ##
===========================================
+ Coverage    0.00%   39.85%   +39.85%     
===========================================
  Files          50       50               
  Lines        3635     3756      +121     
===========================================
+ Hits            0     1497     +1497     
+ Misses       3635     2259     -1376     
Files Changed Coverage Δ
src/SciMLBase.jl 57.14% <ø> (+57.14%) ⬆️
src/scimlfunctions.jl 55.10% <79.48%> (+55.10%) ⬆️
src/problems/basic_problems.jl 88.73% <95.00%> (+88.73%) ⬆️

... and 39 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@lxvm
Copy link
Contributor Author

lxvm commented Sep 19, 2023

Hi Chris, this looks greatly improved! Thank you for looking into it

I made a few tweaks and here these are my thoughts:

  1. Sounds good, and I think MTK would be useful, especially for distinguishing integrand parameters from integration variables.
  2. For the batch version, the only type check we can do is on the output_prototype. I think we should decide right away how to handle this. I don't know of any pure Julia libraries that offer inplace batched integrands, but the way the C libraries handle this is that the output is a vector when batching out-of-place and the output is a matrix when batching in-place. Tentatively, we could check output_prototype isa AbstractVector when batching out-of-place and output_prototype isa ElasticArray && integral_prototype isa AbstractArray && ndims(output_prototype) == ndims(integral_prototype) + 1. Packages like ArraysOfArrays.jl could wrap an ElasticArray into a vector of views, and it is an elegant solution we could handle as well, but I think we need to decide on the number of dimensions of output_prototype and I vote for the former, i.e. ndim > 1 (in line with C libraries), instead of the latter, i.e ndim == 1 (which is a possibility using Julia). I might be backwards-thinking on this.
  3. I renamed integrand_prototype to integral_prototype, since I believe the program should write the result of the integral to the prototype because it is in-place evaluation. The only other difference is that the integrand and the integral could differ in units. About thread safety, I don't think any integration library does multi-threaded integrand evaluation because of thread-safety issues, but with the BatchIntegralFunction the user does the parallelization themselves.

Let me know if this looks good to you, especially the proposal in 2.

@ChrisRackauckas
Copy link
Member

About thread safety, I don't think any integration library does multi-threaded integrand evaluation because of thread-safety issues, but with the BatchIntegralFunction the user does the parallelization themselves.

It's more if the user calls solve while multithreading. If this uses the same caches between the different solves (because of the same problem), then there will be clashing and incorrect results.

For the batch version, the only type check we can do is on the output_prototype. I think we should decide right away how to handle this. I don't know of any pure Julia libraries that offer inplace batched integrands, but the way the C libraries handle this is that the output is a vector when batching out-of-place and the output is a matrix when batching in-place. Tentatively, we could check output_prototype isa AbstractVector when batching out-of-place and output_prototype isa ElasticArray && integral_prototype isa AbstractArray && ndims(output_prototype) == ndims(integral_prototype) + 1. Packages like ArraysOfArrays.jl could wrap an ElasticArray into a vector of views, and it is an elegant solution we could handle as well, but I think we need to decide on the number of dimensions of output_prototype and I vote for the former, i.e. ndim > 1 (in line with C libraries), instead of the latter, i.e ndim == 1 (which is a possibility using Julia). I might be backwards-thinking on this

That might be overthinking it. I think we can just match the C libraries unless we gain some extra feature or performance.

@lxvm
Copy link
Contributor Author

lxvm commented Sep 21, 2023

It's more if the user calls solve while multithreading. If this uses the same caches between the different solves (because of the same problem), then there will be clashing and incorrect results.

I see, yes we should avoid race conditions when convenient and that goes with the "prototype" naming convention. I reverted the name back to integrand_prototype since that is what makes sense now.

That might be overthinking it. I think we can just match the C libraries unless we gain some extra feature or performance.

Sounds good. For now the documentation recommends the output_prototype have an extra batching dimension compared to integrand_prototype, but we don't check any types since this may need to be done downstream by individual algorithms with unique APIs.

I'm happy with how this looks and would be interested in following up with some PRs on the Integrals.jl side to use these containers to add features. Do most SciML packages wrap the user's input into a SciMLFunction, or expect the user to provide it themselves?

@ChrisRackauckas
Copy link
Member

Do most SciML packages wrap the user's input into a SciMLFunction, or expect the user to provide it themselves?

I missed that part. The problem type always has a default dispatch to wrap in the simplest function type. The IntegralProblem should default to wrapping the f in a IntegralFunction

@lxvm
Copy link
Contributor Author

lxvm commented Sep 21, 2023

Ok, I added the wrapper, but this would be breaking behavior unless we make IntegralFunction callable. I notice some function types have these backwards compatibility methods, so should I define those?

If we are OK with breaking changes, I would also introduce the following:

  • remove the nout and batch keywords to IntegralProblem, since these can be obtained from output_prototypes and BatchIntegralFunctions
  • Replace the limits [lb, ub] with a single domain argument, similar to how Symbolics.jl handles integrals. Also add a method for converting [lb, ub] arguments into a domain

@ChrisRackauckas
Copy link
Member

I notice some function types have these backwards compatibility methods, so should I define those?

No need for those.

but this would be breaking behavior unless we make IntegralFunction callable.

Yes make it callable and just forward the args.

remove the nout and batch keywords to IntegralProblem, since these can be obtained from output_prototypes and BatchIntegralFunctions

Yes let's go for it.

Replace the limits [lb, ub] with a single domain argument, similar to how Symbolics.jl handles integrals. Also add a method for converting [lb, ub] arguments into a domain

That sounds good.

@lxvm
Copy link
Contributor Author

lxvm commented Sep 21, 2023

Alright, I committed the breaking changes. Now I should open a pr in Integrals.jl to migrate to this interface

Comment on lines -381 to -382
lb, ub, nout, p,
batch, kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have a deprecation path where if nout and batch are supplied we throw a warning and just define the prototypes as Arrays appropriately sized

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for bringing this up. While implementing it I realized that the BatchIntegralFunction can let the user pass a two-argument out-of-place form and a 3-argument in-place form, which was allowed before and would match IntegralFunction. Then I'll remove the output_prototype field, since we can still query the output type of an out-of-place BatchIntegralFunction by calling the function on an empty vector of input points. The details of allocating an output_prototype may differ across libraries, but we have a mechanism to get the output type for both iip and oop forms, so this buffer can be correctly allocated by the solver, and solves our previous issue.

@lxvm
Copy link
Contributor Author

lxvm commented Sep 21, 2023

@ChrisRackauckas I added the deprecation methods and added the 2-argument out-of-place form for BatchIntegralFunction. In the deprecation method, I'm not sure if I'm handling the oop nout>1 case correctly, so I might need some time to review and test that.

@ChrisRackauckas
Copy link
Member

The simple IntegralProblem(f, (lb,ub), p) dispatch for non-AbstractIntegralFunction is something that should still be considered standard and make work without error or warning. I think that's handled correctly, but it's a bit hard to see.

However, I think it's safe to merge this into the v2.0. The reason is because Integrals.jl is the only consumer (that I know of) of the SciMLBase.jl stuff, and so labeling it as a major means that we can do the breaking release and it won't actually be put into action until a PR bumps the lower bound on Integrals.jl. That will give us a little leeway for testing it out in implementation without being breaking to anyone's workflows until it's clear that it works.

Because of that, I'm going to merge. Let's do it, and we can fix what comes up during the bump of Integrals.jl

@lxvm
Copy link
Contributor Author

lxvm commented Sep 21, 2023

I think there will be a problem with how the batch keyword is handled. Like nout, it should be given a default value, zero, if it is unset. I can't fix it for another hour, but with that fix I think we should merge

@lxvm
Copy link
Contributor Author

lxvm commented Sep 21, 2023

@ChrisRackauckas The obvious bugs are now fixed in the IntegralProblem constructor. I'm happy with how it is

@ChrisRackauckas ChrisRackauckas merged commit f14a1ff into SciML:master Sep 21, 2023
60 of 71 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants