Pydna documentation #244

Closed
manulera opened this issue Jul 19, 2024 · 4 comments

Comments

@manulera
Collaborator

cc @BjornFJohansson @hiyama341 @dgruano.

@JeffXiePL is going to work on the pydna documentation over the next few weeks, and I have made a list of what I think should be covered. The idea is to write it in the style of a cookbook (how to achieve a task) rather than library reference documentation (what every class and method does, etc.). I know there is some of that in the cookbook folder, but we would like to cover more.

Below is the link to the guidelines for the documentation. Feel free to edit or add things, within reason, for @JeffXiePL to cover.

https://docs.google.com/document/d/19sRRAMIHqn0rg-oHSdqIR6DxTIHYo2uj15nRdjq8D5Q/edit?usp=drive_link

@JeffXiePL
Collaborator

Hi all,

I have a quick question about the pydna Dseqrecord page: is there no built-in method to remove a feature from a record loaded from, say, a .gb file? Is a list comprehension the best way to go about it?

Thanks!

@manulera
Collaborator Author

Hi there @JeffXiePL, a list comprehension with an if clause is probably the best way to filter a list. There are similar ways, but they are not better.

from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.SeqFeature import SeqFeature, SimpleLocation

# Create a SeqRecord with two features
f1 = SeqFeature(SimpleLocation(1, 5), type='CDS', id='f1')
f2 = SeqFeature(SimpleLocation(8, 15), type='misc_feature', id='f2')
# Wrap the sequence in a Seq object, which is what SeqRecord is designed to hold
seqr = SeqRecord(Seq('AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'), features=[f1, f2])

# Keep every feature except those of type 'misc_feature'
seqr.features = [f for f in seqr.features if f.type != 'misc_feature']

print(seqr.features)
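
If you need to remove features by something other than their type (for example by label), the same list comprehension can be wrapped in a small helper. This is only a sketch: `remove_features` is a hypothetical name, not part of pydna or Biopython, and the stand-in objects below only mimic the `features`, `type` and `qualifiers` attributes that `SeqRecord` and `SeqFeature` provide, so that the example runs without any dependencies.

```python
from types import SimpleNamespace


def remove_features(record, should_remove):
    """Drop every feature for which should_remove(feature) is true.

    Works on any object with a `features` list. Returns the number of
    features that were dropped. (Hypothetical helper, not a library function.)
    """
    kept = [f for f in record.features if not should_remove(f)]
    removed = len(record.features) - len(kept)
    record.features = kept
    return removed


# Stand-ins for a record and its features. Qualifier values are lists,
# which is how they typically look after parsing a GenBank file.
record = SimpleNamespace(
    features=[
        SimpleNamespace(type="CDS", qualifiers={"label": ["geneA"]}),
        SimpleNamespace(type="misc_feature", qualifiers={"label": ["scar"]}),
        SimpleNamespace(type="misc_feature", qualifiers={"label": ["linker"]}),
    ]
)

# Remove only the feature labelled "scar"; features without a label are kept
n_removed = remove_features(record, lambda f: "scar" in f.qualifiers.get("label", []))
remaining_labels = [f.qualifiers["label"][0] for f in record.features]
print(n_removed, remaining_labels)
```

Passing the condition as a function keeps the filtering logic in one place, while the caller decides what counts as a match (type, label, location, and so on).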

@JeffXiePL
Collaborator

Hello all,

I wanted to ask why there are two Contig objects after calling assembly_circ on an Assembly object. I couldn't find much detail in the documentation.

Peilun Xie

@manulera
Collaborator Author

Hi @JeffXiePL, I think you probably mean assemble_circular.

A Contig is a subclass of Dseqrecord with some extra methods that let you see how it was assembled. When you call assemble_circular, you should in principle get, as Contig objects, all the circular assemblies that are possible under the algorithm you passed. A set of fragments may be assembled in different ways. If you share an example you don't understand, I can explain it in more detail.

Note, however, that the current implementation sometimes gives unexpected results because of how the possible assemblies are computed. This will be fixed once I merge the new implementation.

In the example below, where the homology region of a Gibson assembly (ACGTAATG) appears in several fragments, assemble_circular returns 4 contigs, each representing a fragment circularised, in forward and reverse orientation. All this to say that if you are getting results that you think don't make sense, it may be because of that. In any case, feel free to share an example.

from pydna.assembly import Assembly
from pydna.dseqrecord import Dseqrecord

a = Dseqrecord("ACGTAATGaccACGTAATG")
b = Dseqrecord("ACGTAATGcgcACGTAATG")

assembly = Assembly((a, b), limit=8)

for out in assembly.assemble_circular():
    print(out.seq)

More information on what causes this behaviour (no need to go into it, but I am putting it here for documentation purposes):

#166
#200
#192
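
To check by hand whether several sequences returned by assemble_circular describe the same circular molecule (only rotated or in reverse orientation), you can compare them in a canonical form. The sketch below uses plain strings and only the standard library. The function names are hypothetical, and the example sequences are hand-written stand-ins for contig sequences, not actual pydna output.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only, case preserved)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def canonical_circular(seq: str) -> str:
    """Pick one representative for a circular sequence.

    All rotations of the sequence and of its reverse complement describe
    the same circular molecule, so we return the alphabetically smallest
    one (ignoring case).
    """
    seq = seq.upper()
    if not seq:
        return ""
    candidates = []
    for strand in (seq, reverse_complement(seq)):
        doubled = strand + strand
        candidates.extend(doubled[i:i + len(strand)] for i in range(len(strand)))
    return min(candidates)


def unique_circular(seqs):
    """Keep the first sequence seen for each distinct circular molecule."""
    seen = {}
    for s in seqs:
        seen.setdefault(canonical_circular(s), s)
    return list(seen.values())


# Hand-written stand-ins: one fragment circularised, the same circle
# written from a different start, its reverse orientation, and a second circle.
forward = "ACGTAATGacc"
rotated = "accACGTAATG"
reverse = reverse_complement(forward)
other = "ACGTAATGcgc"

unique = unique_circular([forward, rotated, reverse, other, reverse_complement(other)])
print(len(unique), unique)
```

Comparing every rotation is quadratic in the sequence length, which is fine for a quick check on plasmid-sized sequences but not meant as an efficient implementation.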
