JOURNAL PAPERS
2024
-
Xubo Tang, Jiayu Shang, Guowei Chen, Kei Hang Katie Chan, Mang Shi, Yanni Sun, SegVir: Reconstruction of complete segmented RNA viral genomes from metatranscriptomes, Molecular Biology and Evolution, 2024, msae171.
-
Yu, Runzhou, Ziyi Huang, Theo YC Lam, and Yanni Sun. “Utilizing profile hidden Markov model databases for discovering viruses from metagenomic data: a comprehensive review.” Briefings in Bioinformatics 25, no. 4 (2024).
-
Huang, Ziyi, Dehan Cai, and Yanni Sun. “Towards more accurate microbial source tracking via non-negative matrix factorization (NMF).” Bioinformatics 40, no. Supplement_1 (2024): i68-i78.
2023
-
Herui Liao, Jiayu Shang, and Yanni Sun. “GDmicro: classifying host disease status with GCN and Deep adaptation network based on the human gut microbiome data.” Bioinformatics, btad747, 12 Dec. 2023.
-
Jiaojiao Guan, Cheng Peng, Jiayu Shang, Xubo Tang, and Yanni Sun. “PhaGenus: genus-level classification of bacteriophages using a Transformer model.” Briefings in Bioinformatics 24, no. 6 (2023): bbad408.
-
Donglin Wang, Jiayu Shang, Hui Lin, Jinsong Liang, Chenchen Wang, Yanni Sun, Yaohui Bai, and Jiuhui Qu. “Identifying ARG-carrying bacteriophages in a lake replenished by reclaimed water using deep learning techniques.” Water Research (2023): 120859.
-
Jiayu Shang, Cheng Peng, Xubo Tang, and Yanni Sun. “PhaVIP: Phage VIrion Protein classification based on chaos game representation and Vision Transformer.” Bioinformatics, Volume 39, June 2023, Pages i30–i39.
-
Jiayu Shang, Cheng Peng, Herui Liao, Xubo Tang, and Yanni Sun. “PhaBOX: A web server for identifying and characterizing phage contigs in metagenomic data.” Bioinformatics Advances, Volume 3, Issue 1, 2023, vbad101.
-
Liao, H., Ji, Y. & Sun, Y. “High-resolution strain-level microbiome composition analysis from short reads.” Microbiome 11, 183 (2023).
-
Jiayu Shang, Xubo Tang, and Yanni Sun. “PhaTYP: predicting the lifestyle for bacteriophages using BERT.” Briefings in Bioinformatics 24, no. 1 (2023): bbac487.
-
Runzhou Yu, Syed Muhammad Umer Abdullah, and Yanni Sun. “HMMPolish: a coding region polishing tool for TGS-sequenced RNA viruses.” Briefings in Bioinformatics, bbad264, 21 July 2023.
-
Xubo Tang, Jiayu Shang, Yongxin Ji, and Yanni Sun. “PLASMe: a tool to identify PLASMid contigs from short-read assemblies using transformer.” Nucleic Acids Research (2023): gkad578.
-
Yongxin Ji, Jiayu Shang, Xubo Tang, and Yanni Sun. “HOTSPOT: hierarchical host prediction for assembled plasmid contigs with transformer.” Bioinformatics 39, no. 5 (2023): btad283.
-
Guowei Chen, Xubo Tang, Mang Shi, and Yanni Sun. “VirBot: an RNA viral contig detector for metagenomic data.” Bioinformatics 39, no. 3 (2023): btad093.
-
Runzhou Yu, Dehan Cai, and Yanni Sun. “AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data.” Bioinformatics 39, no. 1 (2023): btac827.
-
Jonathan Daniel Ip, Allen Wing-Ho Chu, Wan-Mui Chan, Rhoda Cheuk-Ying Leung, Syed Muhammad Umer Abdullah, Yanni Sun, Kelvin Kai-Wang To, “Global prevalence of SARS-CoV-2 3CL protease mutations associated with nirmatrelvir or ensitrelvir resistance”, eBioMedicine, Volume 91, 2023, 104559, ISSN 2352-3964.
2022
-
Cai, Dehan, Jiayu Shang, and Yanni Sun. “HaploDMF: viral haplotype reconstruction from long reads via deep matrix factorization.” Bioinformatics 38, no. 24 (2022): 5360-5367.
-
Shang, Jiayu et al. “Accurate identification of bacteriophages from metagenomic data using Transformer.” Briefings in Bioinformatics, bbac258. 30 Jun. 2022.
-
Mu Ku Chen, Xiaoyuan Liu, Yanni Sun, and Din Ping Tsai. “Artificial Intelligence in Meta-optics.” Chemical Reviews (2022).
-
Shang, Jiayu, and Yanni Sun. “CHERRY: a Computational metHod for accuratE pRediction of virus-pRokarYotic interactions using a graph encoder-decoder model.” Briefings in Bioinformatics, bbac182. 21 May 2022.
-
Dehan Cai and Yanni Sun, “Reconstructing viral haplotypes using long reads”, Bioinformatics, btac089, 14 Feb. 2022.
-
Xubo Tang, Jiayu Shang, and Yanni Sun, “RdRp-based sensitive taxonomic classification of RNA viruses for metagenomic data”, Briefings in Bioinformatics, accepted, 2022.
-
Herui Liao, Dehan Cai, and Yanni Sun, “VirStrain: a strain identification tool for RNA viruses”, Genome Biology, accepted, 2022.
2021
-
Yang Li, Ning Jiang, and Yanni Sun, “AnnoSINE: a short interspersed nuclear elements annotation tool for plant genomes”, Plant Physiology, 2021.
-
Jiayu Shang, and Yanni Sun, “Predicting the hosts of prokaryotic viruses using GCN-based semi-supervised learning”, BMC Biology, 2021.
-
Jiayu Shang, Jingzhe Jiang, and Yanni Sun, “Bacteriophage classification for assembled contigs using graph convolutional network”, Bioinformatics (ISMB/ECCB 2021 Proceedings), 2021.
-
Haiying Ma, Herui Liao, Walter Dellisanti, Yanni Sun, Leo Lai Chan, Liang Zhang, “Characterizing the Host Coral Proteome of Platygyra carnosa Using Suspension Trapping (S-Trap)”, Journal of Proteome Research, 20(3), pp 1783–1791, 25 February 2021.
-
Nan Du, Jiayu Shang, and Yanni Sun*, “Improving protein domain classification for third-generation sequencing reads using deep learning”, BMC Genomics, 22(1), 1-13, April 9, 2021.
-
Jiayu Shang and Yanni Sun*, “CHEER: HierarCHical taxonomic classification for viral mEtagEnomic data via deep leaRning”, Methods, 189: 95-103, May, 2021.
Before 2021
-
Xubo Tang and Yanni Sun*, “Fast and accurate microRNA search using CNN”, BMC Bioinformatics, 20(Suppl 33):646, 14 pages, 27 December 2019.
-
Jiao Chen, Yingchao Zhao, and Yanni Sun, “De novo haplotype reconstruction in viral quasispecies using paired-end read guided path finding”, Bioinformatics, 2018. IF: 7.307.
-
Daewoo Pak, Nan Du, Yunsoon Kim, Yanni Sun, and Zachary F. Burton, “Rooted tRNAomes and evolution of the genetic code”, Transcription, 9 (3): 137-151, 2017.
-
Jiao Chen, DongXiao Zhu, and Yanni Sun, “Cap-seq reveals complicated miRNA transcriptional mechanisms in C. elegans and mouse”, Quantitative Biology, 5 (4): 352-367, 2017.
-
Prapaporn Techa-Angkoon, Yanni Sun, and Jikai Lei, “A sensitive short read homology search tool for paired-end read sequencing data”, BMC Bioinformatics, 18 (Suppl 12): 414. 2017.
-
Nan Du and Yanni Sun, “Improve homology search sensitivity of PacBio data by correcting frameshifts”, Bioinformatics, 2016, 32 (17): i529-i537
-
Jikai Lei and Yanni Sun, “Assemble CRISPRs from metagenomic sequencing data”, Bioinformatics, 2016, 32 (17): i520-i528
-
Laura Kirby, Yanni Sun, David Judah, Scooter Nowak, Donna Koslowsky, Analysis of the Trypanosoma brucei EATRO 164 Bloodstream Guide RNA Transcriptome, PLOS Neglected Tropical Diseases, 2016
-
Qiong Wang, Jordan A Fish, Mariah Gilman, Yanni Sun, Titus Brown, James M Tiedje, James R Cole, “Xander: Employing a novel method for efficient gene-targeted metagenomic assembly”, Microbiome, 2015
-
Rujira Achawanantakun, Jiao Chen, Yanni Sun, and Yuan Zhang, “LncRNA-ID: Long non-coding RNA IDentification using balanced random forests”, Bioinformatics, 2015
-
Mingyu Shao, Yanni Sun, and Shuigeng Zhou, “Identifying TF-MiRNA Regulatory Relationships Using Multiple Features”, PLOS ONE, 2015
-
Cheng Yuan, Jikai Lei, James R. Cole, and Yanni Sun. “Reconstructing 16S rRNA genes in metagenomic data”, Bioinformatics (special issue of ISMB 2015)
-
Yuan Zhang, Yanni Sun, and James Cole, “A Scalable and Accurate Targeted gene Assembly tool (SAT-Assembler) for next-generation sequencing data.”, PLOS Computational Biology, 2014
-
Jikai Lei and Yanni Sun, “miR-PREFeR: an accurate, fast, and easy-to-use plant miRNA prediction tool using small RNA-Seq data”, Bioinformatics, 2014
-
Campbell M, Law M, Holt C, Stein J, Moghe G, Hufnagel D, Lei J, Achawanantakun R, Jiao D, Lawrence C, Ware D, Shiu SH, Childs K, Sun Y, Jiang N, Yandell M., “MAKER-P: a tool-kit for the rapid creation, management, and quality control of plant genome annotations.” Plant Physiol. 2013 Dec 6.
-
Roy M, Kim N, Kim K, Chung WH, Achawanantakun R, Sun Y, Wayne R. “Analysis of the canine brain transcriptome with an emphasis on the hypothalamus and cerebral cortex.” Mamm Genome, 24(11-12):484-99. 2013, doi: 10.1007/s00335-013-9480-0
-
Cole JR, Wang Q, Fish JA, Chai B, McGarrell DM, Sun Y, Brown CT, Porras-Alfaro A, Kuske CR, Tiedje JM. “Ribosomal Database Project: data and tools for high throughput rRNA analysis.” Nucleic Acids Res. Jan 1;42(1):D633-42, 2014, doi: 10.1093/nar/gkt1244
-
Cheng Yuan and Yanni Sun, “RNA-CODE: a noncoding RNA Classification tOol for short reaDs in NGS data lacking rEference genomes”, PLOS ONE, 8(10):e77596, 2013
-
Jordan A. Fish, Benli Chai, Qiong Wang, Yanni Sun, C. Titus Brown, James M. Tiedje, and James R. Cole, “FunGene: the functional gene pipeline and repository”, Front. Microbiol., 01 October 2013, doi: 10.3389/fmicb.2013.00291
-
Donna J. Koslowsky, Yanni Sun, Jordan Hindenach, Terence Theisen, Jasmine Lucas, “The Insect-phase gRNA Transcriptome in Trypanosoma brucei”, Nucleic Acids Research, 2013
-
Qiong Wang, John F. Quensen III, Jordan A. Fish, Tae Kwon Lee, Yanni Sun, James M. Tiedje, James R. Cole, “Ecological patterns of nifH genes in four terrestrial climatic zones explored with targeted metagenomics using FrameBot, a new informatics tool”, mBio, 2013
-
Yuan Zhang, Yanni Sun, and James R. Cole, “A Sensitive and Accurate protein domain cLassification Tool (SALT) for short reads.” Bioinformatics, June 2013, 9 pages, doi:10.1093/bioinformatics/btt357.
-
Rujira Achawanantakun and Yanni Sun. “Shape and secondary structure prediction for ncRNAs including pseudoknots based on linear SVM.” BMC Bioinformatics, 2013.
-
Cheng Yuan and Yanni Sun. “Efficient known ncRNA search including pseudoknots.” BMC Bioinformatics, 2013.
-
A. Vieler, G. Wu, et al. “Genome, functional gene annotation, and nuclear transformation of the heterokont oleaginous alga Nannochloropsis oceanica CCMP1779”, PLOS Genetics, 8(11), 25 pages, 2012.
-
Jikai Lei, Prapaporn Techa-Angkoon, Yanni Sun, “Chain-RNA: a comparative ncRNA search tool based on the two-dimensional chain algorithm”, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2012.
-
Yanni Sun, Osama Aljawad, Jikai Lei, and Alex Liu, “Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy”, BMC Bioinformatics, 13(Suppl 3):S12, 13 pages, 2012
-
Yanni Sun, Jeremy Buhler, Cheng Yuan, “Designing Filters for Fast Known NcRNA Identification.” IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2011.
-
Rujira Achawanantakun, Yanni Sun, Seyedeh Shohreh Takyar, “Using a novel secondary structure representation for consensus ncRNA secondary structure derivation.” Journal of Bioinformatics and Computational biology, 9(2): 317-337, 2011.
-
Yuan Zhang and Yanni Sun, “HMM-FRAME: accurate protein domain classification for metagenomic sequences in the presence of frameshift errors.” BMC Bioinformatics, 12:198-213, 2011.
-
Yanni Sun and Jeremy Buhler, “Designing patterns and profiles for profile HMM search.” IEEE/ACM Transactions on Computational Biology and Bioinformatics, 6(2):232-43, 2009 Apr-Jun.
-
Yanni Sun and Jeremy Buhler. “Designing patterns for profile HMM search”, Bioinformatics 23:e36-43, 2007, special issue of ECCB 06.
-
Yanni Sun and Jeremy Buhler, “Choosing the best heuristic for seeded alignment of DNA sequences.” BMC Bioinformatics 7:133, 2006.
-
Jeremy Buhler, Uri Keich, and Yanni Sun, “Designing seeds for similarity search in genomic DNA.” Journal of Computer and System Sciences 70:342-363, 2005.
-
Yanni Sun and Jeremy Buhler, “Designing multiple simultaneous seeds for DNA similarity search.” Journal of Computational Biology 12:847-861, 2005.
CONFERENCE PAPERS
-
Nan Du and Yanni Sun, “Improve homology search sensitivity of PacBio data by correcting frameshifts”, Proceedings of ECCB 2016, The Hague, Netherlands, September 4, 2016
-
Jikai Lei and Yanni Sun, “Assemble CRISPRs from metagenomic sequencing data”, Proceedings of ECCB 2016, The Hague, Netherlands, September 4, 2016
-
Prapaporn Techa-Angkoon, Yanni Sun, and Jikai Lei, “Improve Short Read Homology Search using Paired-End Read Information”, Proceedings of ISBRA, Minsk, Belarus, June 2016
-
Cheng Yuan, Jikai Lei, James R. Cole, and Yanni Sun. “Reconstructing 16S rRNA genes in metagenomic data”, Proceedings of ISMB 2015, Dublin, Ireland, July 10th, 2015
-
Prapaporn Techa-Angkoon and Yanni Sun. “glu-RNA: aliGn highLy strUctured ncRNAs using only sequence similarity.” Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine (ACM BCB 2013), Washington D.C., USA, 2013
-
Jikai Lei, Prapaporn Techa-Angkoon, and Yanni Sun. “ChainKnot: a comparative H-type pseudoknot prediction tool using multiple ab initio folding tools.” Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine (ACM BCB 2013), Washington D.C., USA, 2013
-
Rujira Achawanantakun and Yanni Sun. “Shape and secondary structure prediction for ncRNAs including pseudoknots based on linear SVM.” Proceedings of the Eleventh Asia Pacific Bioinformatics Conference (APBC 2013), Vancouver, Canada, 2013.
-
Cheng Yuan and Yanni Sun. “Efficient known ncRNA search including pseudoknots.” Proceedings of the Eleventh Asia Pacific Bioinformatics Conference (APBC 2013), Vancouver, Canada, 2013.
-
Yuan Zhang and Yanni Sun. “PseudoDomain: identification of processed pseudogenes based on protein domain classification.”, Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine (ACM BCB 2012), Orlando, FL, USA, 2012, regular paper, acceptance ratio: 19%.
-
Jikai Lei, Prapaporn Techa-Angkoon, and Yanni Sun. “NCRNA homology search based on an extended two-dimensional chain algorithm.” Proceedings of the Tenth Asia Pacific Bioinformatics Conference (APBC 2012), Melbourne, Australia, 2012.
-
Yuan Zhang and Yanni Sun. “MetaDomain: a profile HMM-based protein domain classification tool for short sequences.” Proceedings of the Pacific Symposium on Biocomputing (PSB) 2012, Big Island, HI, USA, 2012.
-
Osama Aljawad, Yanni Sun, Alex Liu, and Jikai Lei. “NcRNA Homology Search Using Hamming Distance Seeds.” Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine (ACM BCB 2011), Chicago, USA, 2011, regular paper, acceptance ratio: 19%.
-
Rujira Achawanantakun, Seyedeh Shohreh Takyar, Yanni Sun. “Grammar String: A Novel ncRNA Secondary Structure Representation.” Proceedings of the Ninth Annual International Conference on Computational Systems Bioinformatics (CSB 10), CA, USA, 2010, acceptance ratio: 22%.
-
Stuart King, Yanni Sun, James Cole, and Sakti Pramanik. “BLAST Tree: Fast Filtering for Genomic Sequence Classification.” 10th IEEE International Conference on Bioinformatics and Bioengineering (BIBE-2010), Philadelphia, PA, USA, 2010.
-
Yanni Sun and Jeremy Buhler, “Designing Secondary Structure Profiles for Fast ncRNA Identification.” Proceedings of the Seventh Annual International Conference on Computational Systems Bioinformatics (CSB 08), CA, USA, 2008, acceptance ratio: 22%.
-
Yanni Sun and Jeremy Buhler. “Designing patterns for profile HMM search.” Proceedings of the 5th European Conference on Computational Biology (ECCB06), Eilat, Israel, acceptance ratio: 18%.
-
Yanni Sun and Jeremy Buhler, “Designing multiple simultaneous seeds for DNA similarity search.” Proceedings of the Eighth Annual International Conference on Computational Molecular Biology (RECOMB04), 76-84, San Diego, CA USA, 2004, acceptance ratio: 18%.
-
Jeremy Buhler, Uri Keich, and Yanni Sun, “Designing seeds for similarity search in genomic DNA.” Proceedings of the Seventh Annual International Conference on Computational Molecular Biology (RECOMB03), 67-75, Berlin, Germany, April 2003, acceptance ratio: 20%.