PNAS
Proceedings of the National Academy of Sciences (PNAS), a peer reviewed journal of the National Academy of Sciences (NAS) - an authoritative source of high-impact, original research that broadly spans...
Excited for our publication on how the geographic scale of a sample affects the discovery of rare, deleterious variants to be out this week. With a mix of theory, simulation, and data analysis, we show that the number of variants discovered and their frequencies change depending on whether samples are narrow or broad.
05.06.2025 17:55
Thank you!
03.06.2025 22:14
Out today in @pnas.org! www.pnas.org/doi/10.1073/...
03.06.2025 18:25
Hi! Could I please be added? Thanks for setting this up!
05.12.2024 02:05
I just figured out how to use feeds! So, sharing this with #popgen 🧪
05.12.2024 02:04
Thanks Erik!
05.12.2024 01:59
Thanks to co-lead Dan Rice & co-authors @aabiddanda.bsky.social, Marida Ianni-Ravn, and Chris Porras!
04.12.2024 17:17
Overall - while our theoretical model is no doubt a simplification of the complex dispersal/evolutionary processes seen in natural populations, especially humans - we hope that this work will help improve our interpretation of existing genetic studies and provide guidance for the design of new ones.
04.12.2024 17:17
Our results have implications for several applications of genetic data. Power to detect trait/disease associations (e.g., GWAS) is tied to allele frequency. The SFS is also used for inference of the distribution of fitness effects, which our results suggest may be biased by effects of study design.
04.12.2024 17:17
However, when it comes to avg. allele frequency across all sites (incl. monomorphic ones) these effects can cancel - in our theoretical model we see unchanging avg. allele frequency with sampling design. In human data we see this for fine scale samples (within the UK) but not for broader samples.
04.12.2024 17:17
We find evidence of these effects in re-sampling experiments using the UK Biobank. For example, our broadest re-sample with n=10,000 discovers ~98% more variant LoF sites than our most narrow sample, but allele frequency at those variant sites is on average ~41% lower.
04.12.2024 17:17
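A quick back-of-the-envelope check on how discovery and dilution combine: the mean allele frequency over all sites equals the fraction of sites that are variant times the mean frequency at those variant sites. The sketch below plugs in the two approximate UK Biobank figures quoted in the post above. It assumes both figures refer to the same pair of re-samples and the same set of candidate sites; the function name is hypothetical and this is an illustration, not the paper's analysis.

```python
def relative_mean_frequency(site_ratio, freq_ratio):
    """Broad/narrow ratio of mean allele frequency across ALL sites.

    Mean frequency over all sites = (fraction of sites that are variant)
    * (mean frequency at variant sites), so the broad/narrow ratio is the
    product of the two ratios. Assumes the same total number of sites.
    """
    return site_ratio * freq_ratio


# Approximate figures from the post (assumed to describe the same comparison):
site_ratio = 1.98  # ~98% more variant LoF sites in the broadest sample
freq_ratio = 0.59  # ~41% lower mean frequency at those variant sites

ratio = relative_mean_frequency(site_ratio, freq_ratio)
print(f"broad / narrow mean frequency over all sites: {ratio:.2f}")
```

Under these assumptions the product is not 1, so discovery and dilution do not exactly cancel for this broadest-versus-narrowest comparison.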
Broad samples will sample a greater number of rare, deleterious variants than narrow samples (we call this “discovery”), but each will be sampled at lower average frequency (we call this “dilution”). These effects lead to substantial changes in some summary statistics, especially for large samples.
04.12.2024 17:17
We develop a model for the evolution of carriers of rare deleterious variants, and use it to approximate the site frequency spectrum (SFS, the distribution of allele frequencies) in samples at various scales of geographic breadth. We find several key patterns as samples go from “narrow” to “broad”.
04.12.2024 17:17
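For readers new to the SFS, here is a minimal sketch of how a sample SFS can be tabulated from a 0/1 haplotype matrix. The function name and the toy matrix are hypothetical, made up for illustration; this is the standard empirical definition, not the model-based approximation developed in the paper.

```python
import numpy as np


def site_frequency_spectrum(haplotypes):
    """Count how many sites carry 0, 1, ..., n derived alleles.

    `haplotypes` is an (n haplotypes x L sites) array of 0/1 values,
    where 1 marks the derived allele. Entry k of the result is the
    number of sites with exactly k derived copies in the sample.
    """
    haplotypes = np.asarray(haplotypes)
    n = haplotypes.shape[0]
    allele_counts = haplotypes.sum(axis=0)  # derived copies per site
    return np.bincount(allele_counts, minlength=n + 1)


# Toy data (hypothetical): 4 haplotypes (rows) by 4 sites (columns).
toy = np.array([
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
])
sfs = site_frequency_spectrum(toy)
print(sfs)  # index 0 counts monomorphic sites, index 1 counts singletons
```

Index 0 holds the monomorphic sites, which matter whenever averages are taken across all sites rather than only variant ones.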
We focus on rare, deleterious variants, which are expected to cluster in geographic space. Rare variants are also generally of interest since they tend to have large effects on traits (including disease traits), and can help improve understanding of biological mechanisms.
04.12.2024 17:17
In particular, we are interested in geographic breadth: how broad a region individuals are sampled across. This is important to current discourse in human genetics surrounding the Euro-centric bias of genetic datasets, and the launch of new biobanks to improve representation globally.
04.12.2024 17:17
Excited to share a new preprint with @jnovembre.bsky.social ! We use a combination of population genetic theory, simulation, and data analysis to ask: how does study design in genetic studies (including biobanks) impact the discovery of rare, deleterious variants?
04.12.2024 17:17