On real gut metagenomes from IBD patients and healthy controls (portal.microbiome-bioactives.org), we see that the spectral energy offers better separation between condition and control samples than the standard taxonomy-only entropy.
Our simulated HGT and genome rearrangement data indicate that our measure is sensitive to these events by design. Thus, in the presence of SVs and HGTs, we can see differences between samples even if the relative abundance stays nearly identical.
Within the new framework based on the spectral energy of the graph sheaf Laplacian we account for both taxonomic composition (albeit not relative abundance yet) and genome architecture of the sample.
Metagenome graphs are sensitive to HGTs and SVs. Advait Balaji and I worked on some of these ideas before + Marko Tanevski helped with scaling things up (see: pubmed.ncbi.nlm.nih.gov/35832621/, pmc.ncbi.nlm.nih.gov/articles/PMC...). However, these prior analyses were taxonomy oblivious by design.
The idea was inspired by a recent preprint about sheaves on graphs and their potential use for classical ecological diversity indices (see: arxiv.org/abs/2601.17466). The literature on sheaves on graphs is very exciting in itself, but the key idea we use is quite simple.
Our new preprint on quantifying microbial sample diversity/complexity in a way that accounts for both metagenome architecture and taxonomic composition is now live on bioRxiv: www.biorxiv.org/content/10.6...
#metagenomics #bioinformatics #dataanalysis #graphdata
Our new preprint on the inconsistency of the combined gene tree parsimony costs is now live on bioRxiv: www.biorxiv.org/content/10.6...
[4/4] Our approach is by design a meta-method: you can use it with your favorite single-cell RNA-based GRN inference tool, and squeeze more insights out of your data! Check us out on GitHub: github.com/aliaaz99/GRN....
#singlecell #GRN #LLM
[3/4] We use these embeddings to construct a prior graph and then further refine it with some known TF-target interactions as pre-training targets. Finally, we use this augmented prior graph jointly with a GRN inferred by *any* other method, in order to produce a final prediction.
[2/4] We use gene descriptions from the NCBI Gene database and embed them into a high-dimensional space with an LLM (Qwen3-8B). This idea was inspired by GenePT (pubmed.ncbi.nlm.nih.gov/37905130/) and a great study on gene embeddings from @vyao.bsky.social's group (www.biorxiv.org/content/10.1...).
Check out our new preprint on improving gene regulatory network inference by incorporating a prior from plain-text gene descriptions. It's a simple idea, but we show that it proves to be quite powerful and adaptable. [1/4]
www.biorxiv.org/content/10.1...
Nicolae Sapoval @nsapoval.bsky.social presented "Theoretical and empirical performance of pseudo-likelihood-based Bayesian inference of species trees under the multispecies coalescent"
A fantastic theory talk, offering intuitive insights!
Paper: doi.org/10.1101/2025.01.28.635282
As the next step, we aim to develop rigorous corrections to the pseudo-likelihood-based credibility intervals in order to further improve the scalability and applicability of Bayesian phylogenomic inference.
In our work we explore the suitability of pseudo-likelihood for Bayesian phylogenomic inference. We show that using pseudo-likelihood greatly reduces the computational burden of Bayesian inference. However, the inferred credibility intervals are overconfident.
Likelihood-based phylogenomic inference is common, but it faces scalability issues. Hence, pseudo-likelihood has been previously proposed as a statistically consistent (for topology estimation) and scalable alternative: doi.org/10.1186/1471...
Our preprint on using pseudo-likelihood for Bayesian inference of species trees from gene tree data under the multispecies coalescent is now online: doi.org/10.1101/2025...
[1/n]
In case you lose the URL (it's not pretty), I have linked this on my website (nsapoval.github.io) as well.
This is aimed primarily at people who are just starting a thesis-based master's or a PhD. However, it's also an evolving document, so suggestions and ideas are welcome!
It's that time of the year when I get some writing done. Here are some notes on how I work with academic literature: plume-lifeboat-00b.notion.site/How-I-find-o...
I just completed "Ceres Search" - Day 4 - Advent of Code 2024 #AdventOfCode adventofcode.com/2024/day/4 (A little bit behind the schedule this year, but gotta keep it going)
I was waiting for a great topic for my first Bluesky post, and I cannot think of a better one: I'm thrilled to be hosting @aphillippy.bsky.social at Rice University today and looking forward to his talk at 4pm! events.rice.edu/event/345896...