Steven Salzberg
Johns Hopkins University
H-index: 161
North America-United States
Description
Steven Salzberg, a researcher at Johns Hopkins University with an h-index of 161 overall and 105 since 2020, specializes in Computational Biology, Genomics, Bioinformatics, Metagenomics, and Biomedical Data Science.
His recent articles include:
Upstream open reading frames may contain hundreds of novel human exons
Implementing governmental oversight of enhanced potential pandemic pathogen research
Novel metagenomics analysis of stony coral tissue loss disease
A genome sequence for the threatened whitebark pine
Detecting differential transcript usage in complex diseases with SPIT
CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure
The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual
The status of the human gene catalogue
Professor Information
Field | Value |
---|---|
University | Johns Hopkins University |
Position | Bloomberg Distinguished Professor |
Citations (all) | 333,890 |
Citations (since 2020) | 154,044 |
Cited by | 242,757 |
h-index (all) | 161 |
h-index (since 2020) | 105 |
i10-index (all) | 339 |
i10-index (since 2020) | 259 |
University profile page | Johns Hopkins University |
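
The h-index and i10-index values reported above follow the standard definitions: the h-index is the largest number h such that h papers each have at least h citations, and the i10-index is the number of papers with at least 10 citations. A minimal sketch of both calculations is shown below. The function names and the citation counts are hypothetical and chosen only for illustration; they are not Salzberg's actual per-paper data.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


def i10_index(citations):
    """Return the number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical per-paper citation counts, for illustration only.
example_citations = [48, 33, 30, 12, 10, 9, 4, 1, 0]

print("h-index:", h_index(example_citations))
print("i10-index:", i10_index(example_citations))
```

Restricting the input list to citations received since a given year would give the "since 2020" variants of each metric.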
Research & Interests List
Computational Biology
Genomics
Bioinformatics
Metagenomics
Biomedical Data Science
Top articles of Steven Salzberg
Upstream open reading frames may contain hundreds of novel human exons
Several recent studies have presented evidence that the human gene catalogue should be expanded to include thousands of short open reading frames (ORFs) appearing upstream or downstream of existing protein-coding genes, each of which would comprise an additional bicistronic transcript in humans. Here we explore an alternative hypothesis that would explain the translational and evolutionary evidence for these upstream ORFs without the need to create novel genes or bicistronic transcripts. We examined 2,199 upstream ORFs that have been proposed as high-quality candidates for novel genes, to determine if they could instead represent protein-coding exons that can be added to existing genes. We checked for the conservation of these ORFs in four recently sequenced, high-quality human genomes, and found a large majority (87.8%) to be conserved in all four as expected. We then looked for splicing evidence that would connect each upstream ORF to the downstream protein-coding gene at the same locus, thus creating a novel splicing variant using the upstream ORF as its first exon. These protein-coding exon candidates were further evaluated using protein structure predictions of the protein sequences that included the proposed new exons. We determined that 582 out of 2,199 upstream ORFs have strong evidence that they can form protein-coding exons that are part of an existing gene, and that the resulting protein is predicted to have similar or better structural quality than the currently annotated isoform. Author Summary: We analyzed over 2000 human sequences that have been proposed to represent novel protein-coding …
Authors
Hyun Joo Ji,Steven L Salzberg
Journal
bioRxiv
Published Date
2024/3/23
Implementing governmental oversight of enhanced potential pandemic pathogen research
We write in response to the commentary “Virology—the path forward” (1). The authors argue against the recommendations of the US National Science Advisory Board for Biosecurity (NSABB) to strengthen the oversight of enhanced potential pandemic pathogen (ePPP) research (2). The authors assert that adopting the NSABB recommendations would have a sweeping negative impact on US research and harm US competitiveness, and the authors cite the development of vaccines against measles and cytomegalovirus as examples of research that would be harmed. The claim of sweeping negative impact is false. ePPP research as defined by the NSABB represents <0.01% of biomedical research and <1% of virology research. At most, a dozen current US-funded virology research projects, of more than 2,000, would be affected.
Authors
Richard H Ebright,Raina MacIntyre,Joseph P Dudley,Colin D Butler,Andre Goffinet,Edward Hammond,Elisa D Harris,Hideki Kakeya,Yanna Lambrinidou,Milton Leitenberg,Stuart A Newman,Bryce E Nickels,Monali C Rahalkar,Matt W Ridley,Steven L Salzberg,Harish Seshadri,Günter Theißen,Antonius M VanDongen,Alex Washburne
Journal
Journal of Virology
Published Date
2024/3/13
Novel metagenomics analysis of stony coral tissue loss disease
Stony coral tissue loss disease (SCTLD) has devastated coral reefs off the coast of Florida and continues to spread throughout the Caribbean. Although a number of bacterial taxa have consistently been associated with SCTLD, no pathogen has been definitively implicated in the etiology of SCTLD. Previous studies have predominantly focused on the prokaryotic community through 16S rRNA sequencing of healthy and affected tissues. Here, we provide a different analytical approach by applying a bioinformatics pipeline to publicly available metagenomic sequencing samples of SCTLD lesions and healthy tissues from four stony coral species. To compensate for the lack of coral reference genomes, we used data from apparently healthy coral samples to approximate a host genome and healthy microbiome reference. These reads were then used as a reference to which we matched and removed reads from …
Authors
Jakob M Heinz,Jennifer Lu,Lindsay K Huebner,Steven L Salzberg,Markus Sommer,Stephanie M Rosales
Journal
bioRxiv
Published Date
2024/1/3
A genome sequence for the threatened whitebark pine
Whitebark pine (WBP, Pinus albicaulis) is a white pine of subalpine regions in the Western contiguous United States and Canada. WBP has become critically threatened throughout a significant part of its natural range due to mortality from the introduced fungal pathogen white pine blister rust (WPBR, Cronartium ribicola) and additional threats from mountain pine beetle (Dendroctonus ponderosae), wildfire, and maladaptation due to changing climate. Vast acreages of WBP have suffered nearly complete mortality. Genomic technologies can contribute to a faster, more cost-effective approach to the traditional practices of identifying disease-resistant, climate-adapted seed sources for restoration. With deep-coverage Illumina short reads of haploid megagametophyte tissue and Oxford Nanopore long reads of diploid needle tissue, followed by a hybrid, multistep assembly approach, we produced a final assembly …
Authors
David B Neale,Aleksey V Zimin,Amy Meltzer,Akriti Bhattarai,Maurice Amee,Laura Figueroa Corona,Brian J Allen,Daniela Puiu,Jessica Wright,Amanda R De La Torre,Patrick E McGuire,Winston Timp,Steven L Salzberg,Jill L Wegrzyn
Journal
G3: Genes, Genomes, Genetics
Published Date
2024/3/25
Detecting differential transcript usage in complex diseases with SPIT
Differential transcript usage (DTU) plays a crucial role in determining how gene expression differs among cells, tissues, and developmental stages, contributing to the complexity and diversity of biological systems. In abnormal cells, it can also lead to deficiencies in protein function and underpin disease pathogenesis. Analyzing DTU via RNA sequencing (RNA-seq) data is vital, but the genetic heterogeneity in populations with complex diseases presents an intricate challenge due to diverse causal events and undetermined subtypes. Although the majority of common diseases in humans are categorized as complex, state-of-the-art DTU analysis methods often overlook this heterogeneity in their models. We therefore developed SPIT, a statistical tool that identifies predominant subgroups in transcript usage within a population along with their distinctive sets of DTU events. This study provides comprehensive …
Authors
Beril Erdogdu,Ales Varabyou,Stephanie C Hicks,Steven L Salzberg,Mihaela Pertea
Journal
Cell Reports Methods
Published Date
2024/3/25
CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure
CHESS 3 represents an improved human gene catalog based on nearly 10,000 RNA-seq experiments across 54 body sites. It significantly improves current genome annotation by integrating the latest reference data and algorithms, machine learning techniques for noise filtering, and new protein structure prediction methods. CHESS 3 contains 41,356 genes, including 19,839 protein-coding genes and 158,377 transcripts, with 14,863 protein-coding transcripts not in other catalogs. It includes all MANE transcripts and at least one transcript for most RefSeq and GENCODE genes. On the CHM13 human genome, the CHESS 3 catalog contains an additional 129 protein-coding genes. CHESS 3 is available at http://ccb.jhu.edu/chess.
Authors
Ales Varabyou,Markus J Sommer,Beril Erdogdu,Ida Shinder,Ilia Minkin,Kuan-Hao Chao,Sukhwan Park,Jakob Heinz,Christopher Pockrandt,Alaina Shumate,Natalia Rincon,Daniela Puiu,Martin Steinegger,Steven L Salzberg,Mihaela Pertea
Journal
Genome Biology
Published Date
2023/10/30
The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual
We used long-read DNA sequencing to assemble the genome of a Southern Han Chinese male. We organized the sequence into chromosomes and filled in gaps using the recently completed T2T-CHM13 genome as a guide, yielding a gap-free genome, Han1, containing 3,099,707,698 bases. Using the T2T-CHM13 annotation as a reference, we mapped all genes onto the Han1 genome and identified additional gene copies, generating a total of 60,708 putative genes, of which 20,003 are protein-coding. A comprehensive comparison between the genes revealed that 235 protein-coding genes were substantially different between the individuals, with frameshifts or truncations affecting the protein-coding sequence. Most of these were heterozygous variants in which one gene copy was unaffected. This represents the first gene-level comparison between two finished, annotated individual human genomes.
Authors
Kuan-Hao Chao,Aleksey V Zimin,Mihaela Pertea,Steven L Salzberg
Journal
G3: Genes, Genomes, Genetics
Published Date
2023/3/1
The status of the human gene catalogue
Scientists have been trying to identify every gene in the human genome since the initial draft was published in 2001. In the years since, much progress has been made in identifying protein-coding genes, currently estimated to number fewer than 20,000, with an ever-expanding number of distinct protein-coding isoforms. Here we review the status of the human gene catalogue and the efforts to complete it in recent years. Beside the ongoing annotation of protein-coding genes, their isoforms and pseudogenes, the invention of high-throughput RNA sequencing and other technological breakthroughs have led to a rapid growth in the number of reported non-coding RNA genes. For most of these non-coding RNAs, the functional relevance is currently unclear; we look at recent advances that offer paths forward to identifying their functions and towards eventually completing the human gene catalogue. Finally, we …
Authors
Paulo Amaral,Silvia Carbonell-Sala,Francisco M De La Vega,Tiago Faial,Adam Frankish,Thomas Gingeras,Roderic Guigo,Jennifer L Harrow,Artemis G Hatzigeorgiou,Rory Johnson,Terence D Murphy,Mihaela Pertea,Kim D Pruitt,Shashikant Pujar,Hazuki Takahashi,Igor Ulitsky,Ales Varabyou,Christine A Wells,Mark Yandell,Piero Carninci,Steven L Salzberg
Published Date
2023/10/5
Professor FAQs
What is Steven Salzberg's h-index at Johns Hopkins University?
Steven Salzberg's h-index is 161 in total and 105 since 2020.
What are Steven Salzberg's top articles?
The top articles of Steven Salzberg at Johns Hopkins University include:
Upstream open reading frames may contain hundreds of novel human exons
Implementing governmental oversight of enhanced potential pandemic pathogen research
Novel metagenomics analysis of stony coral tissue loss disease
A genome sequence for the threatened whitebark pine
Detecting differential transcript usage in complex diseases with SPIT
CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure
The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual
The status of the human gene catalogue
...
What are Steven Salzberg's research interests?
The research interests of Steven Salzberg are Computational Biology, Genomics, Bioinformatics, Metagenomics, and Biomedical Data Science.
What is Steven Salzberg's total number of citations?
Steven Salzberg has 333,890 citations in total.
Who are the co-authors of Steven Salzberg?
The co-authors of Steven Salzberg are Jonathan A. Eisen, Owen White, Michael C. Schatz, Brian Haas, and Charles H. Langley.