We currently assume that the number of samples used to estimate mutation probabilities is equal to the number of samples upon which predictions are made, e.g.,
- training: `constraint-tools/train-model/estimate_mutation_probabilities` (line 185 in 73ce304)
- prediction: `constraint-tools/predict-constraint/compute_mutation_counts.py` (line 40 in a4b6022)
This assumption is unnecessary: the number of samples used to estimate mutation probabilities (training) may differ from the number used when computing expectations (prediction). The code should therefore be generalized to use two independent variables: the number of training samples and the number of prediction samples. Their values may be inferred from data files or supplied by the user.
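
One possible shape for this, as a minimal sketch only: the function names (`resolve_sample_count`, `parse_sample_counts`) and the flags `--number-train-samples` and `--number-prediction-samples` are hypothetical and do not exist in the current code. The idea is that each count is resolved independently, preferring a user-supplied value and falling back to one inferred from the data files.

```python
import argparse
from typing import Optional, Sequence


def resolve_sample_count(
    user_value: Optional[int], inferred_value: Optional[int], label: str
) -> int:
    """Prefer a user-supplied count; otherwise use the count inferred from data."""
    value = user_value if user_value is not None else inferred_value
    if value is None:
        raise ValueError(f"number of {label} samples was neither supplied nor inferred")
    if value <= 0:
        raise ValueError(f"number of {label} samples must be positive, got {value}")
    return value


def parse_sample_counts(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the two independent sample counts (hypothetical flag names)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--number-train-samples", type=int, default=None)
    parser.add_argument("--number-prediction-samples", type=int, default=None)
    return parser.parse_args(argv)
```

For example, if only the training count is given on the command line, the prediction count would fall back to whatever was inferred from the prediction data file:

```python
args = parse_sample_counts(["--number-train-samples", "100"])
number_train = resolve_sample_count(args.number_train_samples, None, "training")
number_predict = resolve_sample_count(args.number_prediction_samples, 50, "prediction")
```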
cc: @jkunisak