Poisson Distribution #459

Karzinisierung · 2025-01-14T06:51:39Z

Any DNA sequence has six possible reading frames (three per strand). Running the whole algorithm on every reading frame individually is computationally wasteful.
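For reference, a minimal sketch of how the six frames could be enumerated. The function names `reverse_complement` and `six_frames` are hypothetical and not taken from the existing code; the sketch assumes an uppercase or lowercase string over A, C, G, T only.

```python
# Hypothetical helper names; assumes the sequence contains only A, C, G, T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def six_frames(seq):
    """Map a frame label ('+1'..'+3', '-1'..'-3') to its list of codons."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Step through the strand in triplets, dropping any trailing
            # partial codon.
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            frames[f"{strand}{offset + 1}"] = codons
    return frames
```

Each frame can then be scored cheaply before the full algorithm is run on the promising ones.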

Apply a Poisson model to each of the six reading frames, with a mean expected stop codon frequency of 3/64 per codon (3 of the 64 codons are stop codons). Genes are usually longer than 100 base pairs. A long stretch with unusually few stop codons is therefore likely to encode a gene.
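A rough sketch of the Poisson test, assuming stop codons occur independently at a rate of 3/64 per codon. The names `poisson_cdf` and `stop_depletion_pvalue` are made up for illustration. The p-value is the probability of seeing this few stop codons or fewer in a window of `n_codons` by chance.

```python
import math

STOP_RATE = 3 / 64  # expected stop codons per codon, from the issue text


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


def stop_depletion_pvalue(n_codons, n_stops, rate=STOP_RATE):
    """Chance of observing n_stops or fewer stop codons in n_codons codons.

    Assumption: stops are independent events with a constant per-codon rate,
    so the expected count is n_codons * rate.
    """
    return poisson_cdf(n_stops, n_codons * rate)
```

For a stop-free window the p-value reduces to `exp(-n_codons * rate)`. The Poisson model is an approximation here; the exact chance of zero stops under the same assumptions is `(1 - rate) ** n_codons`.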

Stop codon frequency also correlates with GC content. This could be used in conjunction with other code by multiplying the probabilities, provided the two events are independent.
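One way the GC dependence could be folded in is to replace the fixed 3/64 with a rate computed from GC content. The sketch below assumes each base is drawn independently, with G and C equally likely and A and T equally likely; `stop_codon_rate` is a hypothetical name. Real genomes deviate from this model, so treat it as a first approximation.

```python
def stop_codon_rate(gc):
    """Expected per-codon stop frequency for a given GC fraction (0..1).

    Assumption: independent bases with P(G) = P(C) = gc / 2 and
    P(A) = P(T) = (1 - gc) / 2. Stop codons are TAA, TAG and TGA.
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be between 0 and 1")
    at = (1.0 - gc) / 2.0  # probability of A, and of T
    g = gc / 2.0           # probability of G
    # TAA contributes at**3; TAG and TGA each contribute at**2 * g.
    return at**3 + 2.0 * at**2 * g
```

At 50% GC this gives back 3/64, and the rate falls as GC content rises, so GC-rich sequence produces long stop-free runs by chance more often.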

Requires an appropriate p-value threshold. 0.05 will not work; something between 5×10^-6 and 5×10^-8 may be more appropriate, but it also risks eliminating good data.
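To get a feel for what a threshold means in practice, it can be converted into the shortest stop-free run that would pass it. This sketch assumes the same Poisson model with a rate of 3/64 per codon; `min_stop_free_codons` is a hypothetical name.

```python
import math


def min_stop_free_codons(alpha, rate=3 / 64):
    """Smallest stop-free run length (in codons) with p-value <= alpha.

    Under a Poisson model, P(no stops in n codons) = exp(-n * rate),
    so the run passes once n >= -ln(alpha) / rate.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return math.ceil(-math.log(alpha) / rate)


for alpha in (0.05, 5e-6, 5e-8):
    n = min_stop_free_codons(alpha)
    print(f"alpha={alpha:g}: {n} codons ({3 * n} bp)")
```

Under these assumptions 0.05 passes any stop-free run of 64 codons, while 5×10^-6 needs 261 codons and 5×10^-8 needs 359. The looser threshold would flag many chance runs across six frames, and the stricter ones would miss genes shorter than roughly 800 to 1100 bp.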

Alternatively, one may take the candidate DNA sequence and submit it directly to BLAST to see what comes up. Overall this will improve efficiency and resource use.

VerisimilitudeX self-assigned this Jan 24, 2025
