Ning

@ningcao

Thinking about Data, a lot

28 Followers
22 Following
7 Posts
Joined 22.11.2024
Latest posts by Ning @ningcao

Come find the @datologyai.com crew and some data cookies at NeurIPS!

11.12.2024 02:06 👍 1 🔁 0 💬 0 📌 0
DatologyAI Jobs

If you’re as excited as we are about pushing the boundaries of data curation, stop by booth 303 at NeurIPS to chat with us! We’re also hiring across Research and Engineering: jobs.ashbyhq.com/DatologyAI

26.11.2024 01:35 👍 1 🔁 0 💬 0 📌 0

I’m incredibly grateful to have contributed to this mission of building the best LLM data pipeline. Collaborating with the team, I’ve learned so much about Data-Centric ML, designing thoughtful and rigorous experiments, and the engineering principles behind creating a resilient data pipeline.

26.11.2024 01:35 👍 1 🔁 0 💬 1 📌 0
Technical Deep-Dive: Curating Our Way to a State-of-the-Art Text Dataset
Our data curation pipeline to obtain substantial improvements in LLM quality, training speed, and inference efficiency.

Over the past few months, we’ve run hundreds of ablations, rigorously tested hypotheses, and experimented relentlessly to ensure our results are both scalable and robust. Read more here: www.datologyai.com/post/technic...

26.11.2024 01:35 👍 1 🔁 0 💬 1 📌 0
DatologyAI Jobs

Thrilled to share that we’ve surpassed DCLM and built a state-of-the-art data curation pipeline to enable better, faster, and more cost-efficient LLMs!

26.11.2024 01:35 👍 1 🔁 0 💬 1 📌 0
DatologyAI-curated data beating DCLM and other SOTA datasets.

We beat DCLM and created a SOTA data curation pipeline! 🧵

26.11.2024 01:35 👍 3 🔁 0 💬 1 📌 0

Consider it early Black Friday for Data Curation.

25.11.2024 17:59 👍 7 🔁 0 💬 0 📌 0