generalize to two independent variables representing: number of training samples, and number of prediction samples #16

petermchale · 2021-10-19T16:34:45Z

We currently assume that the number of samples used to estimate mutation probabilities is equal to the number of samples upon which predictions are made, e.g.,

training:

constraint-tools/train-model/estimate_mutation_probabilities

Line 185 in 73ce304

'number_tumors': args.number_tumors,

prediction:

constraint-tools/predict-constraint/compute_mutation_counts.py

Line 40 in a4b6022

model['number_tumors'],

This restriction is unnecessary: the number of samples used to estimate (train) the mutation probabilities may differ from the number of samples for which expectations are computed. The code should therefore be generalized to use two independent variables, one for the number of training samples and one for the number of prediction samples. The values of these variables may be inferred from data files or furnished by the user.
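A minimal sketch of one way the two counts could be decoupled. Every name here (`build_model`, `resolve_number_prediction_samples`, `parse_prediction_args`, the `number_training_samples` key, and the `--number-prediction-samples` flag) is hypothetical and not taken from the repository. Falling back to the training count when no prediction count is given is also an assumption, chosen only to preserve the current behavior by default.

```python
import argparse


def build_model(mutation_probabilities, number_training_samples):
    # Hypothetical model record: the training sample count gets its own key,
    # replacing the single shared 'number_tumors' entry.
    return {
        'mutation_probabilities': mutation_probabilities,
        'number_training_samples': number_training_samples,
    }


def resolve_number_prediction_samples(model, number_prediction_samples=None):
    """Return the sample count to use when computing expectations.

    Assumption: if the user supplies no value, fall back to the training
    count, which reproduces the current behavior.
    """
    if number_prediction_samples is None:
        return model['number_training_samples']
    if number_prediction_samples <= 0:
        raise ValueError('number_prediction_samples must be positive')
    return number_prediction_samples


def parse_prediction_args(argv=None):
    # Hypothetical command-line flag for the prediction step.
    parser = argparse.ArgumentParser()
    parser.add_argument('--number-prediction-samples', type=int, default=None)
    return parser.parse_args(argv)
```

With this arrangement, the prediction step would read the training count from the model only as a default, and a user-supplied (or file-inferred) prediction count would take precedence.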

cc: @jkunisak

petermchale · 2021-10-26T22:39:55Z

A new comment, 👍
