We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.
www.biorxiv.org/content/10.1...
(1/n)
22.09.2025 05:29
Big congrats, Yunha!
30.04.2025 11:38
Interesting work on plasmid engineering.
11.02.2025 06:34
All NIH study sections canceled indefinitely. This will halt science and devastate research budgets in universities.
22.01.2025 20:46
This gives me such hope for biodiversity conservation, mammals and future mammalogists! Go young people!! 🧪
19.01.2025 06:07
Recruiting PhD students: our research covers language model + genomics + systems biology: scholar.google.com/citations?us...
1. Four-year PhD program in Beijing
2. Master's degree required
3. Start date: Sep 2025
Please DM if you are interested.
09.01.2025 03:09
Two decades of bacterial ecology and evolution in a freshwater lake
Nature Microbiology - A 471-metagenome time series from Lake Mendota in Wisconsin, USA, reveals seasonal and decadal shifts in bacterial functional and ecological dynamics, especially in response...
After 24 years of work, I'm thrilled to announce the TYMEFLIES dataset, which comprises metagenomes from Lake Mendota (Madison, WI), collected roughly every 10 days (471 samples) for 20 years! @quendi.bsky.social @robinrohwer.bsky.social
rdcu.be/d5put
A thread…
03.01.2025 11:44
We deeply appreciate the experimental studies that have made this work possible! Please check our GitHub for more details: github.com/lingxusb/TXp...
05.01.2025 06:46
Google Colab
We hope this work will be a useful tool. Feedback is welcome! Please feel free to try our Colab notebook to predict transcriptomes at (almost) zero cost! It takes about 20 minutes for a genome with 4k genes: colab.research.google.com/drive/1Kd-QI...
04.01.2025 23:46
TXpredict captures variations in gene expression both across different protein functional groups and within the same functional group.
04.01.2025 23:46
We further used TXpredict to predict the expression of 3.1M genes across a collection of 900 microbial genomes. Small clusters of ribosomal genes were located at the periphery of the t-SNE plot of all genes and showed high predicted expression.
04.01.2025 23:46
Our model leverages information learned from the ESM2 model and basic protein statistics to predict genome-wide gene expression. It achieves an average Spearman correlation of 0.53 in predicting gene expression for bacterial genomes that are not in the training dataset.
04.01.2025 23:46
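For readers unfamiliar with the metric, here is a minimal sketch of how an average Spearman correlation over held-out genomes could be computed. It is an illustration only, not the TXpredict evaluation code: the function names and the dictionary layout (genome ID mapped to a list of per-gene values) are assumptions made for this example.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mean_spearman_per_genome(predicted, measured):
    """Average the per-genome correlations (hypothetical data layout)."""
    return mean(spearman(predicted[g], measured[g]) for g in predicted)
```

Because the statistic depends only on ranks, it rewards getting the ordering of genes right rather than matching absolute expression values.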
Predicting microbial transcriptome using genome sequence https://www.biorxiv.org/content/10.1101/2024.12.30.630741v1
31.12.2024 18:47
GitHub - lingxusb/EcoVAE
9/n We envision that EcoVAE will advance biodiversity investigations, especially in under-sampled regions, and ultimately support global biodiversity monitoring efforts.
💻 Code is publicly available: github.com/lingxusb/Eco...
18.12.2024 01:08
8/n 🧩 EcoVAE can also interpolate missing occurrences. For example, in North America, EcoVAE predictions for Sassafras largely overlapped with iNaturalist records. In South Asia, EcoVAE highlighted a wider distribution of Desmodium, consistent with field surveys.
18.12.2024 01:08
7/n Where is biodiversity under-sampled? We found that regions with high prediction error overlap with known "darkspots" of biodiversity collection. For example, the highest prediction errors for plants were observed in South Asia, Southeast Asia, the Middle East, and Central Africa.
18.12.2024 01:08
6/n 🦋 EcoVAE isn't limited to plants. The model generalizes well to other taxa, including butterflies and mammals, showcasing its versatility across ecosystems.
18.12.2024 01:08
5/n 🖥️ Remarkably, EcoVAE can predict species distributions even with sparse inputs. With just 20% of input data, it achieved an AUROC of 0.78, effectively identifying the locations of missing genera.
18.12.2024 01:08
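As a reminder of what an AUROC of 0.78 means, here is a small sketch of the metric itself: the probability that a randomly chosen true occurrence receives a higher score than a randomly chosen absence. This is a generic pairwise definition, not EcoVAE's evaluation code; `auroc`, `labels`, and `scores` are names chosen for this example.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a held-out true occurrence, 0 for an absence.
    scores: model scores for the same items, higher meaning "more likely present".
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # A positive outranking a negative counts as 1, a tie counts as 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of occurrences from absences. The pairwise loop is quadratic, so a rank-based implementation would be preferable for large grids.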
4/n We withheld data from three independent regions to test the model's generalization. The model reconstructed species distributions effectively, even for withheld test regions, and predicted the locations of missing records at the genus and species levels.
18.12.2024 01:08
3/n We leverage a VAE structure that enables fast and scalable modeling of species distribution patterns. In training, we masked 50% of species records and tasked the model to reconstruct the full species distribution, mimicking real-world biodiversity sampling.
18.12.2024 01:08
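The masking step described above can be sketched roughly as follows. This is an assumption-laden illustration, not the EcoVAE code: it supposes each location is a binary presence vector over taxa, and `mask_records` and its parameters are hypothetical names.

```python
import random


def mask_records(presence, mask_fraction=0.5, rng=None):
    """Hide a fraction of the observed records in one presence vector.

    presence: list of 0/1 flags, one per taxon, for a single location
              (assumed representation).
    Returns the masked input and the indices that were hidden; the original
    vector serves as the reconstruction target.
    """
    rng = rng or random.Random()
    observed = [i for i, flag in enumerate(presence) if flag == 1]
    n_hidden = int(len(observed) * mask_fraction)
    hidden = set(rng.sample(observed, n_hidden))
    masked = [0 if i in hidden else flag for i, flag in enumerate(presence)]
    return masked, sorted(hidden)


# Example: the model would see `masked_input` and be trained to recover `full`.
full = [1, 0, 1, 1, 0, 1]
masked_input, hidden_idx = mask_records(full, 0.5, random.Random(0))
```

Only observed records are masked, so the model input looks like an under-sampled survey while the target remains the complete record.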
2/n 🌿 Biodiversity is under immense pressure. Predicting global species distributions at scale is critical, but traditional species distribution models struggle with massive datasets and interspecies interactions (e.g., >33M records and >127K species of plants).
18.12.2024 01:08
A generative deep learning approach for global species distribution prediction
Anthropogenic pressures on biodiversity necessitate efficient and highly scalable methods to predict global species distributions. Current species distribution models (SDMs) face limitations with larg...
What happens when generative AI meets ecology? How can we use AI to advance biodiversity exploration and monitoring?
Excited to introduce EcoVAE, a generative approach trained on over 100 million high-quality vouchered records to model global biodiversity.
www.biorxiv.org/content/10.1...
1/n 🧵
18.12.2024 01:08
Preprint alert! A thread is coming soon.
17.12.2024 02:00
book cover and first page of the preface
The third edition of my textbook, Nonlinear Dynamics and Chaos, was published today. You can preview the first 68 pages on Google Books, or take a look at the preface below to see what's new. The main new thing is a chapter on the Kuramoto model! Hope you enjoy it.
16.01.2024 15:55
Two BioML starter packs now:
Pack 1: go.bsky.app/2VWBcCd
Pack 2: go.bsky.app/Bw84Hmc
DM if you want to be included (or nominate people who should be!)
18.11.2024 17:09