Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-49554: Do not attempt forced photometry in NO_DATA regions #300

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mrawls
Copy link
Contributor

@mrawls mrawls commented Mar 26, 2025

This PR changes forcedMeasurementTask so it skips measuring sources with pixelFlags_nodataCenter.

This approach is inspired by ip_diffim's dipoleFitTask. We have to run the centroid plugin first, and then we run PixelFlags next so we can drop sources from the measCat before other plugins are run. By default, only sources with the nodataCenter flag are dropped. In practice, most NO_DATA regions are from template holes in diffims, so the result will be an identical length mergedForcedSource table but with NaNs for the fluxes of the obviously garbage diaSources.

…order.

This approach is inspired by ip_diffim's dipoleFitTask.
We have to run the centroid plugin first, and then we run PixelFlags next
so we can drop sources from the measCat before other plugins are run.
By default, only sources with the nodataCenter flag are dropped.
In practice, most NO_DATA regions are from template holes in diffims,
so the result will be an identical length mergedForcedSource table but
with NaNs for the fluxes of the obviously garbage diaSources.
Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer for us to implement this behavior differently, and part of that would be making it so base_PixelFlags is not required in all configurations of forced photometry (and hence some of the other PRs for this ticket could just disappear).

Measurement plugins already have a concept of an "execution order", which is how we sort them when we run them on each source. Happily centroids already go first. Pixel flags do not currently go terribly early, but we can fix that with a one-line change here:

wrapSimpleAlgorithm(PixelFlagsAlgorithm, Control=PixelFlagsControl,
executionOrder=BasePlugin.FLUX_ORDER)

by using a number between 0 and 1 instead of the FLUX_ORDER constant.

And then I think we'd want to put the logic that decides whether to actually run the rest of the measurements in the per-source loop over plugins here:

for plugin in self.plugins.iter():
if beginOrder is not None and plugin.getExecutionOrder() < beginOrder:
continue
if endOrder is not None and plugin.getExecutionOrder() >= endOrder:
break
self.doMeasurement(plugin, measRecord, *args, **kwds)

And I think that means we really want to move this logic and configuration out of ForcedMeasurementTask into BaseMeasurementTask, which has the nice bonus of letting us turn it on in any other kind of measurement as well.

The simplest approach would be to move your badPixelFlags option to BaseMeasurementConfig, also add another config option that sets the execution order at which we look at those flags, and break out of the loop when we first encounter a plugin higher than that execution order and one of the bad flags is set.

A nice generalization would be to instead add a SourceSelector configuration to BaseMeasurementConfig instead of just a list of bad flags, and let that selector decide whether to keep running the plugins or break (we'd still also have a config option for the execution order at which we run that check). But I'm not certain a SourceSelector can be called one afw.table.SourceRecord at a time; my impression is that it's been moving in the direction of operating on full catalogs and numpy arrays so it can do vectorization (which is good most of the time, just not here). So this may not work, but I think it's worth a look.

@mrawls
Copy link
Contributor Author

mrawls commented Mar 27, 2025

Thanks so much Jim — I think I generally understand and agree with your suggested approach here.

The one bit I don't think we can do is use a SourceSelector, because the substantial changes are all in meas_base, and SourceSelectors live in meas_algorithms (which have meas_base as a dependency). That's why it is just a list.

@TallJimbo
Copy link
Member

👍 that's a good reason not to attempt the SourceSelector approach.

@mrawls
Copy link
Contributor Author

mrawls commented Mar 27, 2025

@parejkoj and I chatted about this today, and while I still like your generalization idea in principle, he pointed out it might be a lot harder to implement than you outline because forcedMeasurement has its own special snowflake run method and doesn't really use baseMeasurement's. I invite the two of you to discuss 😄

@TallJimbo
Copy link
Member

BaseMeasurementTask doesn't actually have a run method itself, and ForcedMeasurementTask does call BaseMeasurementTask.callMeasure, so I think it still holds up. But it's possible I missed something.

@parejkoj
Copy link
Contributor

Ah, sorry, when I made that statement, I hadn't seen the details of your proposal to use callMeasure.

I agree that pixelFlags should be the lowest possible run order, since it doesn't depend on anything else, only the existence of a footprint. I'd be surprised if there was ever a time that we didn't want to run it in a measurement task: it contains the most basic possible information about whether a source is valid.

Having "this cannot be a source" logic in the base class is a good idea, but I'm a bit skeptical of putting if: break logic for this in base callMeasure. That makes for somewhat complicated logic to test (there are no unittests of the Measurement framework itself; the closest is test_forcedPhot.py), given that the plugin order also matters there. The reason I proposed doing it the way on this ticket is that we can directly reject those records at the start, so they no longer exist at all (since they cannot possibly be real), and it makes the code pretty clear. We don't need the noise replacer run before pixelFlags, so it doesn't need to be in the usual plugin loop.

@TallJimbo
Copy link
Member

TallJimbo commented Mar 31, 2025

I'm a bit skeptical of putting if: break logic for this in base callMeasure. That makes for somewhat complicated logic to test (there are no unittests of the Measurement framework itself; the closest is test_forcedPhot.py), given that the plugin order also matters there. The reason I proposed doing it the way on this ticket is that we can directly reject those records at the start, so they no longer exist at all (since they cannot possibly be real), and it makes the code pretty clear.

Note that my proposal also avoids the new _pre_measure method entirely, and that (at present) includes a mostly-duplicate callMeasure loop, a temporary copy of the plugins map, and a copy of the catalog. I don't think there's any way that's simpler than an if ...: break in the existing loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants