We hope that you enjoy digging into the tool. Let us know what you think!
MolGen-Transformer is a generative AI model designed to address challenges that limit the effectiveness of molecular language models (LMs). It is trained on a large and diverse dataset of approximately 198 million organic molecules.
Introducing MolGen-Transformer, the result of a longstanding collaboration with Baskar Ganapathysubramanian that has been led by Bella Yang and Rebekah Duke-Crockett with additional contributions from Moses Ogbaje and Parker Sornberger.
lnkd.in/eSTnFXXb
Notably, the analysis reveals that one of the most popular similarity measures (mfpReg, Tanimoto) can rank among the worst-performing measures depending on the property being evaluated.
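For readers unfamiliar with the Tanimoto measure mentioned here, a minimal sketch in plain Python follows. It treats a fingerprint as a set of on-bit indices, which is an assumption made for illustration and not the pipeline used in the paper. The function name `tanimoto` and the example bit sets are hypothetical.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    shared = len(a & b)
    # Shared bits divided by the size of the union of both bit sets.
    return shared / (len(a) + len(b) - shared)


# Hypothetical on-bit indices for two fingerprints (not real molecules).
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8}
similarity = tanimoto(fp_a, fp_b)  # 2 shared bits out of 5 distinct bits
```

A score of 1 means identical bit sets and 0 means no bits in common. Whether that score tracks a given property is the question the framework evaluates.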
chemrxiv.org/engage/chemr...
Using over 350 million molecule pairs with electronic structure, redox, and optical properties, we evaluate correlations between molecular fingerprints, distance functions, and properties.
The approach builds on the concept of neighborhood behavior and incorporates KDE analysis to quantify how well similarity measures capture property relationships.
How do you choose your approach to identify molecular similarity? In this latest collaborative work led by Rebekah Duke, we introduce a framework to evaluate the correlation between molecular similarity and properties.
chemrxiv.org/engage/chemr...
Thanks! We'll have to take a look.
Welcome to the club!