weighting of scenarios? #569

Open
mathause opened this issue Nov 27, 2024 · 2 comments

mathause commented Nov 27, 2024

I always thought that the scenario weight applied to each sample in the linear regression was 1 / (n_ens * n_ts). However, it's 1 / n_ens. I probably misinterpreted this. The original code (v0.8.0) is here:

# assumption: nr_runs per scen and nr_ts for these runs can vary
# derive weights such that each scenario receives same weight (divide by nr samples)
nr_samples = 0
wgt_scen_eq = []
for scen in scens:
    nr_runs, nr_ts, nr_gps = targ[scen].shape
    nr_samples_scen = nr_runs * nr_ts
    wgt_scen_eq = np.append(wgt_scen_eq, np.repeat(1 / nr_runs, nr_samples_scen))
    nr_samples += nr_samples_scen

I refactored this in #143 and adapted the comment to

derive scenario weights such that each has equal weight, i.e., 1 / number of samples
(= nr_runs * nr_ts)

but importantly the code stayed the same:

weights.append(np.full(nr_samples_scen, 1 / nr_runs))
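
As a quick check of what this line produces, here is a minimal sketch with made-up shapes. The scenario names and the `shapes` dictionary are hypothetical and not taken from MESMER; only the `np.full(...)` expression comes from the code above. Each sample gets the weight 1 / nr_runs, so the weights of one scenario sum to nr_ts rather than to 1.

```python
import numpy as np

# hypothetical (nr_runs, nr_ts) per scenario, for illustration only
shapes = {"historical": (3, 165), "ssp126": (2, 86)}

weights = []
for scen, (nr_runs, nr_ts) in shapes.items():
    nr_samples_scen = nr_runs * nr_ts
    # same expression as in the refactored code
    weights.append(np.full(nr_samples_scen, 1 / nr_runs))

# total weight per scenario: nr_runs * nr_ts * (1 / nr_runs) = nr_ts
totals = {scen: w.sum() for scen, w in zip(shapes, weights)}
print(totals)
```

With these assumed shapes, the scenario with more time steps ends up with the larger total weight.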

From Beusch et al. (2022):

"To obtain robust MESMER parameter estimates for each ESM, MESMER is trained on all available ensemble members of each available scenario and equal weight is given to each scenario."


I think it's not 100% clear - you could argue that the historical scenario does get a bit more weight as it has more time steps. But saying the weight is 1 / n_ens is a just-as-valid interpretation of "equal weight for each scenario". So in conclusion there is nothing to do here (except maybe to adapt my comment).
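
To make the two readings concrete, here is a small sketch that compares the share of the total weight each scenario receives under 1 / n_ens and under 1 / (n_ens * n_ts). The helper `scenario_share` and the ensemble sizes and time step counts are assumptions chosen for illustration, not MESMER code or real data.

```python
def scenario_share(n_ens, n_ts, per_sample_weight):
    """Fraction of the total weight that each scenario receives."""
    totals = {
        scen: n_ens[scen] * n_ts[scen] * per_sample_weight(n_ens[scen], n_ts[scen])
        for scen in n_ens
    }
    norm = sum(totals.values())
    return {scen: total / norm for scen, total in totals.items()}


# hypothetical numbers: same ensemble size, different number of time steps
n_ens = {"historical": 3, "ssp585": 3}
n_ts = {"historical": 165, "ssp585": 86}

# 1 / n_ens, as in the code: the share is proportional to n_ts
share_n_ens = scenario_share(n_ens, n_ts, lambda ens, ts: 1 / ens)

# 1 / (n_ens * n_ts): every scenario gets the same share
share_n_samples = scenario_share(n_ens, n_ts, lambda ens, ts: 1 / (ens * ts))

print(share_n_ens)
print(share_n_samples)
```

Under these assumptions the historical scenario receives 165 / 251 of the weight with 1 / n_ens, and exactly half with 1 / (n_ens * n_ts).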

Originally commented in #567 (review)

edit: corrected n_scen -> n_ens

@veni-vidi-vici-dormivi

Shouldn't n_scen be n_ens or n_runs? Or do you actually mean n_scen? If you were to weight each sample by 1 / n_scen, scenarios with more members would be overrepresented.
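
A rough sketch of that overrepresentation, assuming two hypothetical scenarios with the same number of time steps but different ensemble sizes (all numbers are invented for illustration):

```python
# hypothetical ensemble sizes; equal number of time steps to isolate the effect
n_ens = {"ssp126": 1, "ssp585": 9}
n_ts = 86
n_scen = len(n_ens)

# constant per-sample weight 1 / n_scen: total weight grows with ensemble size
total_n_scen = {scen: n * n_ts * (1 / n_scen) for scen, n in n_ens.items()}

# per-sample weight 1 / n_ens: the ensemble size cancels out
total_n_ens = {scen: n * n_ts * (1 / n) for scen, n in n_ens.items()}

print(total_n_scen)
print(total_n_ens)
```

With 1 / n_scen the nine-member scenario carries nine times the total weight of the single-member one, whereas with 1 / n_ens both totals equal n_ts.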

@mathause

Shouldn't n_scen be n_ens or n_runs? Or do you actually mean n_scen? If you were to weight each sample by 1 / n_scen, scenarios with more members would be overrepresented.

Yes you are right - I mean n_ens. I'll correct it above.
