CONCUR: Benchmarking LLMs for Concurrent Code Generation
#CodeGeneration #LLM #Package
hgpu.org?p=30644
RepoLaunch: Automating Build & Test Pipeline of Code Repositories on ANY Language and ANY Platform
#LLM #Package
hgpu.org?p=30643
Ray Tracing using HIP
#HIP #AMD #Raytracing #Rendering #Package
hgpu.org?p=30642
Catalyst-Agent: Autonomous heterogeneous catalyst screening and optimization with an LLM Agent
#Chemistry #LLM #Catalyst
hgpu.org?p=30641
Practical FP4 Training for Large-Scale MoE Models on Hopper GPUs
#CUDA #LLM #Hopper #FP4 #Precision #Package
hgpu.org?p=30640
CUDABench: Benchmarking LLMs for Text-to-CUDA Generation
#CUDA #LLM #Benchmarking #Package
hgpu.org?p=30630
StitchCUDA: An Automated Multi-Agents End-to-End GPU Programing Framework with Rubric-based Agentic Reinforcement Learning
#CUDA #CodeGeneration #LLM
hgpu.org?p=30629
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
#CUDA #CodeGeneration #LLM #Package
hgpu.org?p=30628
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models
#CodeGeneration #LLM #Package
hgpu.org?p=30620
CL4SE: A Context Learning Benchmark For Software Engineering Tasks
#CodeGeneration #LLM #Package
hgpu.org?p=30619
From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation
#OpenMP #LLM #CodeGeneration
hgpu.org?p=30618
A Survey of Recent Developments in SYCL Compiler Implementations
#SYCL #Compilers
hgpu.org?p=30617
Joint Training on AMD and NVIDIA GPUs
#CUDA #ROCm #LLM #NVIDIA #AMD
hgpu.org?p=30616
Fine-Tuning GPT-5 for GPU Kernel Generation
#Triton #CUDA #LLM
hgpu.org?p=30592
KernelBlaster: Continual Cross-Task CUDA Optimization via Memory-Augmented In-Context Reinforcement Learning
#CUDA #LLM #Performance
hgpu.org?p=30591
HPC++: An LLVM-Based Automatic Parallelization Framework with Heterogeneous CPU-GPU Execution
#OpenCL #HPC #LLVM
hgpu.org?p=30590
OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization
#CUDA #LLM #Performance
hgpu.org?p=30588
A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5
#LLM #Security #Package
hgpu.org?p=30589
Execution-Centric Characterization of FP8 Matrix Cores, Asynchronous Execution, and Structured Sparsity on AMD MI300A
#AMD #HIP #ROCm
hgpu.org?p=30572
Improving Code Generation via Small Language Model-as-a-judge
#CodeGeneration #LLM
hgpu.org?p=30571
Deep Kernel Fusion for Transformers
#CUDA #LLM #Performance
hgpu.org?p=30570
DICE: Diffusion Large Language Models Excel at Generating CUDA Kernels
#CUDA #LLM #CodeGeneration #Package
hgpu.org?p=30569
Improving HPC Code Generation Capability of LLMs via Online Reinforcement Learning with Real-Machine Benchmark Rewards
#CUDA #OpenMP #HPC #CodeGeneration #LLM
hgpu.org?p=30568
Inside VOLT: Designing an Open-Source GPU Compiler (Tool)
#OpenCL #CUDA #FPGA #Compilers #Package
hgpu.org?p=30544
HetCCL: Accelerating LLM Training with Heterogeneous GPUs
#LLM #DeepLearning #DL
hgpu.org?p=30545
Just-in-Time Catching Test Generation at Meta
#LLM #Testing
hgpu.org?p=30543
Scaling GPU-to-CPU Migration for Efficient Distributed Execution on CPU Clusters
#CUDA #Triton #Compilers
hgpu.org?p=30542
Accelerating Scientific Research with Gemini: Case Studies and Common Techniques
#LLM #Gemini #Review
hgpu.org?p=30541
Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations
#CUDA #Triton #Package
hgpu.org?p=30540
SciDef: Automating Definition Extraction from Academic Literature with Large Language Models
#LLM #NLP #Package
hgpu.org?p=30539