#pretraining

@tmlr-pub.bsky.social

1 month ago

New #J2C Certification:

BiSSL: Enhancing the Alignment Between Self-Supervised Pretraining and Downstream Fine-Tuning via...

Gustav Wagner Zakarias, Lars Kai Hansen, Zheng-Hua Tan

https://openreview.net/forum?id=GQAGlqOpyA

#supervised #pretraining #pretrained

1 0 0 0

ProPics TV

@propicstv.bsky.social

1 month ago

Model Collapse - What Happens When AI Feeds Itself #ai #science #viral

#stockimages #Stockvideos #dataset #Training #Datasales #Datalicensing #MachineLearning #imagelicensing #transformermodels #pretraining #transferlearning #objectdetection #LoRA #Largevisionmodels #GANS

0 0 0 0

ProPics TV

@propicstv.bsky.social

1 month ago

Model Collapse - What Happens When AI Feeds Itself #ai #science #viral

#stockimages #Stockvideos #dataset #Training #Datasales #Datalicensing #MachineLearning #imagelicensing #diffusionmodel #transformermodels #pretraining #transferlearning #objectdetection #LoRA #Largevisionmodels #GANS

0 0 0 0

chenran818

@chenran818.bsky.social

2 months ago

Pre-training is back! 🚀 Forget the 'scaling laws are dead' talk. While everyone thought RL was king, top labs like OpenAI were wrong. Pre-training is set for a renaissance by 2026, driving major AI progress! #AI #Pretraining #MachineLearning

0 0 0 0

@arxivlens.bsky.social

2 months ago

Towards Scalable Pre-training of Visual Tokenizers for Generation
Jingfeng Yao, Xinggang Wang et al.
Paper
Details
#VisualTokenizers #Pretraining #GenerativeAI

0 0 0 0

trending stonks

@trendingstocks.bsky.social

3 months ago

Big Computers, New Questions - Ilya Sutskever and Dwarkesh Patel

#research #pretraining

0 0 0 0

TMLR Published Papers

@tmlr-pub.bsky.social

3 months ago

Understanding Emergent In-Context Learning from a Kernel Regression Perspective

Chi Han, Ziqi Wang, Han Zhao, Heng Ji

Action editor: Yingbin Liang

https://openreview.net/forum?id=6rD50Q6yYz

#context #attention #pretraining

0 0 0 0

Ars Technica News

@arstechni.ca

5 months ago

AI models can acquire backdoors from surprisingly few malicious documents https://arstechni.ca #UKAISecurityInstitute #alanturinginstitute #AIvulnerabilities #backdoorattacks #machinelearning #datapoisoning #trainingdata #LLMsecurity #modelsafety #pretraining #AIresearch #AIsecurity…

1 0 0 0

GetNews.me

@getnews-me.bsky.social

5 months ago

Probabilistic Language-Image Pre-Training Boosts Vision-Language Models

A new probabilistic language-image pre-training approach is reported to boost performance of vision-language models. Read more: getnews.me/probabilistic-language-i... #visionlanguage #pretraining #ai

0 0 0 0

TMLR Published Papers

@tmlr-pub.bsky.social

5 months ago

TapWeight: Reweighting Pretraining Objectives for Task-Adaptive Pretraining

Ruiyi Zhang, Sai Ashish Somayajula, Pengtao Xie

Action editor: Simon Kornblith

https://openreview.net/forum?id=DCCw2CEVFS

#pretraining #tapweight #tap

0 0 0 0

GetNews.me

@getnews-me.bsky.social

5 months ago

How Pretraining Data Shapes In-Context Learning

A new study finds that heavier-tailed pretraining data improves accuracy on rare numerical tasks, while broader coverage cuts the demos needed for target performance. Read more: getnews.me/how-pretraining-data-sha... #incontextlearning #pretraining

0 0 0 0

GetNews.me

@getnews-me.bsky.social

5 months ago

Academic Pre-Training Feasible: $100K or 100 Days Trade-Offs

Academic teams can pre‑train a billion‑parameter model for about $100,000, using four GPUs over 18 days—a trade‑off from the original 64‑GPU three‑day run. Read more: getnews.me/academic-pre-training-fe... #academiacompute #pretraining #llm

0 0 0 0

GetNews.me

@getnews-me.bsky.social

5 months ago

CoMTIP: Contrastive Masked Pre‑training for Spatial Transcriptomics

CoMTIP, a pre‑training framework linking histology images, gene names and expression values, was submitted on 21 September 2025. It offers zero‑shot gene prediction and beats prior methods. getnews.me/comtip-contrastive-pre-t... #spatial #pretraining

0 0 0 0

human conversations

@trndgtr.bsky.social

7 months ago

Leaps, Not Just Steps - Demis Hassabis on Lex Fridman

#scurve #pretraining #aiscaling

0 0 0 0

ELLIOT Project

@elliot-eu.bsky.social

7 months ago

🎯 How is ELLIOT strengthening Europe’s AI ecosystem?

Jenia Jitsev explains how his team leads #pretraining of #MultimodalAI open foundation models in the #HorizonEU project — using scaling laws to improve core building blocks for trustworthy, reusable #GeneralistAI.

🎥 Watch the video to learn more

3 0 0 0

Sharp Coder Blog

@sharpcoderblog.com

9 months ago

Intro to Procedural Animation in Unity Procedural animation is a technique in computer graphics used to generate motion algorithmically rather than using pre-defined keyframes. This method allows for more dynamic ...

Intro to Procedural Animation in Unity #Ai #Chatbot #Gpt #Openai #Transformer #Nlp #Deeplearning #Gpt3 #Gpt2 #Conversational #Languagemodel #Neuralnetwork #Pretraining #Finetuning

1 0 0 0

Juan Carlos Niebles

@jcniebles.bsky.social

9 months ago

AdaVid: Adaptive Video-Language Pretraining Contrastive video-language pretraining has demonstrated great success in learning rich and robust video representations. However, deploying such video encoders on compute-constrained edge devices rema...

I'll also be presenting multiple papers at #CVPR2025! First up: "AdaVid: Adaptive Video-Language Pretraining".

🗓️ Thu Jun 12, 12:00-13:00PM
📍 ExHall D Poster #202
🔗 Paper: arxiv.org/abs/2504.12513
🌐 Website: chaitanya100100.github.io/AdaVid/
#VideoLanguage #Pretraining

1 1 1 1

TMLR Published Papers

@tmlr-pub.bsky.social

9 months ago

Random Policy Enables In-Context Reinforcement Learning within Trust Horizons

Weiqin Chen, Santiago Paternain

Action editor: Pin-Yu Chen

https://openreview.net/forum?id=mAiMKnr9r5

#pretraining #trained #pretrained

0 0 0 0

MLCommons

@mlcommons.org

10 months ago

MLCommons MLPerf Training Expands with Llama 3.1 405B - MLCommons MLCommons MLPerf Training Expands with Llama 3.1 405B

MLCommons' MLPerf Training suite has a new #pretraining #benchmark based on #Meta’s Llama 3.1 405B model. We use the same dataset with a bigger model and longer context, offering a more relevant and challenging measure for today’s #AI systems. mlcommons.org/2025/05/trai...

0 0 0 0

Hacker News Companion

@hncompanion.com

10 months ago

4/15 Is pretraining + RLHF optimization surpassing scale, or are benchmarks just improving for specific tasks? 🤔 Good question raised by badmonster in the HN thread. #Pretraining #RLHF #Optimization https://news.ycombinator.com/item?id=43842683#43852749

1 0 1 0

TMLR Published Papers

@tmlr-pub.bsky.social

10 months ago

New #Featured Certification:

Random Policy Enables In-Context Reinforcement Learning within Trust Horizons

Weiqin Chen, Santiago Paternain

https://openreview.net/forum?id=mAiMKnr9r5

#pretraining #trained #pretrained

1 0 0 0

TMLR Published Papers

@tmlr-pub.bsky.social

11 months ago

Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for Transformer Pretraining

Pihe Hu, Shaolong Li, Xun Wang, Longbo Huang

Action editor: Vincent Tan

https://openreview.net/forum?id=XosdLS7KVE

#sparse #pretraining #gpu

1 0 0 0

TMLR Published Papers

@tmlr-pub.bsky.social

1 year ago

How Does Code Pretraining Affect Language Model Task Performance?

Jackson Petty, Sjoerd van Steenkiste, Tal Linzen

Action editor: John Timothy Halloran

https://openreview.net/forum?id=pxxmUKKgel

#linguistic #pretraining #pretrain

2 0 0 0

@liang-weixin.bsky.social

1 year ago

Joint work with Junhong Shen, Genghan Zhang @zhang677.bsky.social, Ning Dong, Luke Zettlemoyer, Lili Yu

#LLM #MultiModal #pretraining

0 0 0 0

TMLR Published Papers

@tmlr-pub.bsky.social

1 year ago

Pretraining a Neural Operator in Lower Dimensions

AmirPouya Hemmasian, Amir Barati Farimani

Action editor: Xingyou Song

https://openreview.net/forum?id=ZewaRoZehI

#pdes #pretraining #pde

0 0 0 0

TMLR Published Papers

@tmlr-pub.bsky.social

1 year ago

Why Fine-grained Labels in Pretraining Benefit Generalization?

Guan Zhe Hong, Yin Cui, Ariel Fuxman, Stanley H. Chan, Enming Luo

Action editor: Dmitry Kangin

https://openreview.net/forum?id=FojAV72owK

#pretraining #labeled #deep

0 0 0 0

TMLR Published Papers

@tmlr-pub.bsky.social

1 year ago

Adaptive Training Distributions with Scalable Online Bilevel Optimization

David Grangier, Pierre Ablin, Awni Hannun

Action editor: Changjian Shui

https://openreview.net/forum?id=JP1GVyF5i5

#pretraining #pretrained #adaptive

1 0 0 0

TMLR Published Papers

@tmlr-pub.bsky.social

1 year ago

Strategies for Pretraining Neural Operators

Anthony Zhou, Cooper Lorsung, AmirPouya Hemmasian, Amir Barati Farimani

Action editor: Antonio Vergari

https://openreview.net/forum?id=9vEVeX9oIv

#pretraining #models #modeling

1 0 0 0

Posts tagged #pretraining