New telephone numbers:
Office: +36 30 427 6251 / Sequencing Lab: +36 30 427 6152 / Food Safety Lab: +36 30 392 4192 / Animal Health Lab: +36 30 392 3642
Déri S, Borbás J, Hartai T, Hategan L, Csányi B, Visnyovszki Á, Madácsy T, Maléth J, Hegedűs Z, Nagy I, Arora R, Labro AJ, Környei L, Varró A, Sepp R, Ördög B. Impaired cytoplasmic domain interactions cause co-assembly defect and loss of function in the p.Glu293Lys KNCJ2 variant isolated from an Andersen-Tawil Syndrome patient. Cardiovasc Res. 2020
Recent paper: The Role of MicroRNAs upon Epithelial-to-Mesenchymal Transition in Inflammatory Bowel Disease
Cells 2019, 8 (11), 1461

Éva Boros, István Nagy: The Role of MicroRNAs upon Epithelial-to-Mesenchymal...

Recent paper: Y-chromosome haplogroups from Hun, Avar and conquering Hungarian period nomadic people of the Carpathian Basin
Sci Rep 9, 16569 (2019) doi:10.1038/s41598-019-53105-5

Endre Neparáczki, Zoltán Maróti, Tibor Kalmár, Kitti Maár, István Nagy, Dóra Latinovics, Ágnes Kustár, György Pálfi, Erika Molnár, Antónia...

Recent paper: Clinical and genetic landscape of treatment naive cervical cancer: Alterations in PIK3CA and in epigenetic modulators associated with sub-optimal outcome
EBioMedicine. 2019 Apr 2. pii: S2352-3964(19)30214-2.

Scholl S, Popovic M, de la Rochefordiere A, Girard E, Dureau S, Mandic A, Koprivsek K, Samet N, Craina M, Margan M, Samuels S, Zijlmans H,...

Recent paper: CRISPR-interference based modulation of mobile genetic elements in bacteria
Synthetic Biology, https://doi.org/10.1093/synbio/ysz008

Recent paper: Phylogenetic barriers to horizontal transfer of antimicrobial peptide resistance genes in the human gut microbiota
Bálint Kintses, Orsolya Méhi, Eszter Ari, Mónika Számel, Ádám Györkei, Pramod K. Jangir, István Nagy, Ferenc Pál, Gergely Fekete, Roland Tengölics, Ákos Nyerges, István Likó, Anita Bálint, Tamás Molnár, Balázs Bálint, Bálint Márk Vásárhelyi, Misshelle Bustamante, Balázs Papp & Csaba Pál. Nature Microbiology volume 4, pages 447–458 (2019)

Recent paper: Draft Genome Sequence of Propionibacterium acnes subsp. elongatum Strain Asn12
Microbiol Resour Announc 7:e00801-18.


LCA-assignment algorithm

The main problem addressed by MEGAN is to compute a “species profile” by assigning the reads from a metagenomics sequencing experiment to appropriate taxa in the NCBI taxonomy. At present, this program implements the following naive approach to this problem:

1. Compare a given set of DNA reads to a database of known sequences, such as NCBI-NR or NCBI-NT, using a sequence comparison tool such as BLAST.
2. Process the comparison results to determine which taxa each read hits.
3. For each read r, let H be the set of all taxa that r hits.
4. Find the lowest node v in the NCBI taxonomy that encompasses the set of hit taxa H and assign the read r to the taxon represented by v.

We call this the naive LCA-assignment algorithm (LCA = “lowest common ancestor”). In this approach, every read is assigned to some taxon. If the read aligns very specifically only to a single taxon, then it is assigned to that taxon. The less specifically a read hits taxa, the higher up in the taxonomy it is placed. Reads that hit ubiquitously may even be assigned to the root node of the NCBI taxonomy.

If a read has significant matches to two different taxa a and b, where a is an ancestor of b in the NCBI taxonomy, then the match to the ancestor a is discarded and only the more specific match to b is used. 
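The steps above can be sketched in a few lines of Python. This is a minimal illustration, not MEGAN's implementation: the toy taxonomy, the taxon names, and the helper names (`PARENT`, `path_to_root`, `drop_ancestor_hits`, `naive_lca`) are all assumptions made for the example, and returning `None` for a read without hits is a simplification chosen here.

```python
# Toy taxonomy stored as child -> parent. Names are illustrative only;
# a real run would use NCBI taxonomy identifiers.
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Escherichia coli": "Escherichia",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def path_to_root(taxon, parent=PARENT):
    """Return the list [taxon, ..., root]."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = parent[taxon]
    return path


def drop_ancestor_hits(hits, parent=PARENT):
    """Discard any hit taxon that is a proper ancestor of another hit taxon."""
    hits = set(hits)
    ancestors = set()
    for taxon in hits:
        ancestors.update(path_to_root(taxon, parent)[1:])
    return hits - ancestors


def naive_lca(hits, parent=PARENT):
    """Return the lowest node that encompasses every taxon in `hits`."""
    hits = drop_ancestor_hits(hits, parent)
    if not hits:
        return None  # assumption: a read without hits stays unassigned
    # Walk all root-to-taxon paths in parallel until they diverge.
    paths = [path_to_root(t, parent)[::-1] for t in hits]
    lca = None
    for nodes in zip(*paths):
        if len(set(nodes)) != 1:
            break
        lca = nodes[0]
    return lca


specific = naive_lca({"Escherichia coli"})
shared = naive_lca({"Escherichia coli", "Salmonella"})
```

In this sketch a read that hits only one taxon stays on that taxon, while a read that hits taxa in different branches moves up to their first shared ancestor.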

The program provides a threshold for the bit score of hits. Any hit whose bit score falls below the threshold is discarded. Secondly, a threshold can be set to discard any hit whose score falls below a given percentage of the best hit's score. Finally, a third threshold is used to report only taxa that are hit by a minimum number of reads or a minimum percentage of all assigned reads. By default, the program requires at least 0.1% of all assigned reads to hit a taxon before that taxon is deemed present. All reads that are initially assigned to a taxon that is not deemed present are pushed up the taxonomy until a node is reached that has enough reads. This threshold is set using the Min Support Percent or Min Support item.
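The three thresholds can be illustrated with a short Python sketch. The function names, parameter names, and default values below (`filter_hits`, `apply_min_support`, `min_score=50.0`, `percent_of_best=90.0`) are hypothetical and chosen only for the example; only the 0.1% default for minimum support comes from the text. The bottom-up push-up order is one plausible reading of the description, not a statement of how the program does it.

```python
from collections import Counter

# Toy taxonomy as child -> parent (illustrative names, not NCBI ids).
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Escherichia coli": "Escherichia",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def filter_hits(hits, min_score=50.0, percent_of_best=90.0):
    """Keep (taxon, bit_score) hits that pass both score thresholds."""
    kept = [(t, s) for t, s in hits if s >= min_score]
    if not kept:
        return []
    best = max(s for _, s in kept)
    cutoff = best * percent_of_best / 100.0
    return [(t, s) for t, s in kept if s >= cutoff]


def depth(taxon, parent):
    """Number of edges between `taxon` and the root."""
    d = 0
    while parent[taxon] is not None:
        taxon = parent[taxon]
        d += 1
    return d


def apply_min_support(counts, parent, min_support_percent=0.1):
    """Push reads on under-supported taxa up to the parent, deepest first."""
    counts = Counter(counts)
    needed = sum(counts.values()) * min_support_percent / 100.0
    for node in sorted(parent, key=lambda t: depth(t, parent), reverse=True):
        up = parent[node]
        if up is not None and 0 < counts[node] < needed:
            counts[up] += counts[node]
            del counts[node]
    return dict(counts)


hits = [("A", 100.0), ("B", 95.0), ("C", 60.0), ("D", 30.0)]
kept = filter_hits(hits)
profile = apply_min_support(
    {"Escherichia coli": 1, "Salmonella": 1, "Bacillus": 5},
    PARENT,
    min_support_percent=20.0,
)
```

With the artificially high 20% support threshold used here, the two singleton assignments are merged on their shared ancestor, while the well-supported taxon is left in place.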

Taxa in the NCBI taxonomy can be excluded from the analysis. For example, taxa listed under root - unclassified sequences - metagenomes may give rise to matches that force the algorithm to place reads on the root node of the taxonomy. This feature is controlled by the Preferences→Taxon Disabling menu. At present, the set of disabled taxa is saved as a program property and not as part of a MEGAN document.

Note that the LCA-assignment algorithm is already used on a smaller scale when parsing individual BLAST matches. This is because an entry in a reference database may have more than one taxon associated with it. For example, in the NCBI-NR database, an entry may be associated with up to 1000 different taxa. This implies, in particular, that a read may be assigned to a high-level node (even the root node) even though it has only one significant hit, if the corresponding reference sequence is associated with a number of very different species.

Note that the list of disabled taxa is also taken into consideration when parsing a BLAST file. Any taxa that are disabled are ignored when attempting to determine the taxon associated with a match, unless all recognized names are disabled, in which case the disabled names are used.

Weighted LCA Algorithm

The weighted LCA algorithm is identical to the weighted LCA algorithm used in Metascope. It operates in two rounds. In the first round of analysis, each reference sequence is given a weight. This is the number of reads that align to the given reference and whose significant alignments are all to the same species as the reference sequence (alignments to a strain or sub-species below the species node also count). In the second round of analysis, each read is placed on the lowest node that lies above 75% of the total weight of all references to which the read has a significant alignment.
The weighted LCA algorithm assigns reads more specifically than the naive LCA algorithm. Because it performs two rounds of read and match analysis, it takes about twice as long as the naive algorithm.
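The two rounds can be sketched as follows. This is a simplified illustration under stated assumptions, not the Metascope or MEGAN code: every reference is mapped to exactly one species-level taxon (strains are ignored), "above 75%" is read as "at least 75%", and a read whose references all have zero weight falls back to equal weights. All names (`reference_weights`, `weighted_lca`, `assign_reads`, the toy data) are hypothetical.

```python
from collections import defaultdict

# Toy taxonomy and reference annotation (illustrative names only).
PARENT = {
    "root": None,
    "Enterobacteriaceae": "root",
    "E. coli": "Enterobacteriaceae",
    "S. enterica": "Enterobacteriaceae",
}
REF_TAXON = {"refA": "E. coli", "refB": "S. enterica"}


def depth(taxon, parent):
    """Number of edges between `taxon` and the root."""
    d = 0
    while parent[taxon] is not None:
        taxon = parent[taxon]
        d += 1
    return d


def reference_weights(alignments, ref_taxon):
    """Round 1: count, per reference, the reads that hit a single species."""
    weights = defaultdict(int)
    for refs in alignments.values():
        if len({ref_taxon[r] for r in refs}) == 1:
            for r in set(refs):
                weights[r] += 1
    return weights


def weighted_lca(refs, weights, ref_taxon, parent, percent=75.0):
    """Round 2: deepest node covering `percent` of the read's reference weight."""
    refs = set(refs)
    w = {r: weights.get(r, 0) for r in refs}
    if sum(w.values()) == 0:
        w = {r: 1 for r in refs}  # assumption: fall back to equal weights
    total = sum(w.values())
    node_weight = defaultdict(float)
    for r in refs:
        node = ref_taxon[r]
        while node is not None:  # add the weight to the taxon and its ancestors
            node_weight[node] += w[r]
            node = parent[node]
    covered = [n for n, x in node_weight.items() if x >= total * percent / 100.0]
    return max(covered, key=lambda n: depth(n, parent))


def assign_reads(alignments, ref_taxon, parent):
    """Run both rounds and return read -> taxon."""
    weights = reference_weights(alignments, ref_taxon)
    return {
        read: weighted_lca(refs, weights, ref_taxon, parent)
        for read, refs in alignments.items()
    }


ALIGNMENTS = {
    "r1": ["refA"], "r2": ["refA"], "r3": ["refA"], "r4": ["refA"],
    "r5": ["refB"],
    "r6": ["refA", "refB"],  # ambiguous read
}
assignment = assign_reads(ALIGNMENTS, REF_TAXON, PARENT)
```

In this toy data set the ambiguous read `r6` hits both species, so a naive LCA would place it on their shared parent; because `refA` carries four fifths of the weight, the weighted sketch places it on the species of `refA` instead.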
