Exploring the diversity of unmapped reads from human deep sequencing – Amin Saffari – Data and Preparation
Exploring the diversity of unmapped reads from human deep sequencing
Amin Saffari
13-08-12
Aligning
Mapped region
Unmapped region
Very high or low GC-content
Sequencing error
Repeat elements
Currently not discovered
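One of the listed causes, very high or low GC-content, can be checked directly on the unmapped reads. The following is a minimal sketch, not the method used in the talk: the function names and the 0.25/0.75 thresholds are assumptions chosen for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Reads made only of ambiguous bases (e.g. N) get 0.0 by convention.
    return gc / total if total else 0.0


def flag_extreme_gc(reads, low=0.25, high=0.75):
    """Yield (read, gc) for reads whose GC fraction lies outside [low, high].

    The default thresholds are illustrative assumptions, not values from the talk.
    """
    for read in reads:
        gc = gc_content(read)
        if gc < low or gc > high:
            yield read, gc
```

A histogram of `gc_content` over mapped versus unmapped reads would show whether extreme GC-content is a plausible explanation for a given sample.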
Data and Preparation
Workflows
Pros and Cons
Strategy 1:
More reads lead to better assembly
Higher N50
Slow
Needs a lot of memory
Strategy 2:
Fast
Some contigs disappeared
Assembly Results
Higher N50 or longer assembly length (asmLg)?
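The two indicators can pull in different directions. As a minimal sketch, using the standard definition of N50 (the length of the contig at which the running sum of contig lengths, sorted from longest to shortest, first reaches half of the total), both can be computed from a list of contig lengths. The function names are assumptions for illustration.

```python
def assembly_length(lengths):
    """Total number of bases across all contigs."""
    return sum(lengths)


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


# Toy lengths (made up) showing the trade-off between the two indicators.
few_long = [100, 100]
many_short = [100, 50, 50, 50, 50]
print(n50(few_long), assembly_length(few_long))      # 100 200
print(n50(many_short), assembly_length(many_short))  # 50 300
```

In the toy data the second assembly is longer overall but has the lower N50, which is the situation the question above refers to.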
Compare Assemblies
Assembly indicators (assembly length/N50) as a function of assembly parameters (k-mer value/coverage cutoff)
Every k-mer/coverage-cutoff (cvCut) combination is useful
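A comparison of this kind can be sketched as a small table keyed by parameter pair. The code below assumes a hypothetical `assemblies` mapping from `(kmer, cov_cutoff)` to a list of contig lengths; the names and the data layout are illustrative assumptions, not the talk's actual pipeline.

```python
def n50(lengths):
    """Standard N50 over a list of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarize(assemblies):
    """Turn {(kmer, cov_cutoff): [contig lengths]} into one row per parameter pair."""
    rows = []
    for (kmer, cov_cutoff), lengths in assemblies.items():
        rows.append({
            "kmer": kmer,
            "cov_cutoff": cov_cutoff,
            "asm_length": sum(lengths),
            "n50": n50(lengths),
        })
    return rows


def best_by(rows, indicator):
    """Return the row that maximizes the given indicator ('asm_length' or 'n50')."""
    return max(rows, key=lambda row: row[indicator])
```

Different indicators can select different parameter pairs, which is one reason to keep the results of every run and combine them rather than pick a single winner.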
Combine assembly results
Blood - Blood
Saliva - Saliva
Blood - Saliva
Extract non-redundant contigs
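The talk does not say how redundancy was removed. As a minimal sketch of the idea, assuming "redundant" means an exact duplicate, a reverse complement, or a contig fully contained in a longer one, pooled contigs from several assemblies could be filtered like this (function names are hypothetical):

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def non_redundant(contigs):
    """Keep only contigs not contained (on either strand) in a longer kept contig.

    Assumption: redundancy means exact substring containment. Real pipelines
    usually cluster at a chosen identity threshold instead.
    """
    kept = []
    # The set removes exact duplicates; longest-first order means every
    # possible container is examined before the contigs it could contain.
    unique = {c.upper() for c in contigs}
    for contig in sorted(unique, key=lambda s: (-len(s), s)):
        rc = reverse_complement(contig)
        if any(contig in longer or rc in longer for longer in kept):
            continue
        kept.append(contig)
    return kept
```

This pairwise scan is quadratic in the number of contigs, so it only illustrates the logic. At the scale of millions of contigs a dedicated clustering tool would be needed.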
Combining blood results
~113 million contigs
~4 million non-redundant contigs
¼ annotated against the nr_DB
Top 3
Pongo abelii (Sumatran orangutan)
Female
Chromosome X
Blood
Pongo abelii chromosome X unlocalized genomic
Top 3
Toxoplasma gondii (parasite)
Carried by many warm-blooded animals
30%-65% of humans have antibodies against it
Malassezia globosa (fungus)
Naturally found on the skin surfaces of many animals, including humans
Combining saliva results
~510 million contigs
~3 million non-redundant contigs
¾ annotated against the nr_DB
Unannotated part
Comparing the length of unannotated contigs
Blood & Saliva
~113 million contigs (Blood)
~510 million contigs (Saliva)
60171 non-redundant contigs
1/6 annotated
DNA should be more or less the same in different tissues
Blood & Saliva
All annotated are bacteria
Streptococcus mitis SK579
Prevotella
Taxonomic distribution and frequency of blood results
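A distribution of this kind can be tallied from the best database hit per contig. The sketch below assumes a hypothetical list of `(contig_id, taxon)` pairs, already reduced to one best hit per contig; that input format is an assumption for illustration, not the talk's actual data layout.

```python
from collections import Counter


def taxon_frequencies(best_hits):
    """Return (taxon, count, fraction) tuples, most frequent taxon first.

    best_hits: iterable of (contig_id, taxon) pairs, one per annotated contig.
    """
    counts = Counter(taxon for _contig_id, taxon in best_hits)
    total = sum(counts.values())
    return [(taxon, n, n / total) for taxon, n in counts.most_common()]


# Placeholder hits (made up, not results from the talk).
hits = [("c1", "Taxon A"), ("c2", "Taxon B"), ("c3", "Taxon A"), ("c4", "Taxon C")]
for taxon, n, fraction in taxon_frequencies(hits):
    print(taxon, n, fraction)
```

Fractions here are relative to annotated contigs only. Unannotated contigs would need to be counted separately.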
Chimp data
Data sampled from gut
Human & Chimp
~105 million contigs (Gut)
60171 non-redundant contigs (Blood and saliva)
52113 non-redundant contigs (Blood and saliva & Gut)
1/8 annotated
Taxonomic distribution and frequency of human & chimp results
Database
Conclusion & Future Work
Some sequences mapped to closely related species
A large fraction comes from bacteria that live with us
Look into the unannotated contigs
Look into the orphan reads
Acknowledgements
Thank You!