Development & Optimization

Jun 18, 2025
Improved Performance and Monitoring Capabilities with NVIDIA Collective Communications Library 2.26
The NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multinode communication primitives optimized for NVIDIA GPUs and networking. NCCL...
11 MIN READ

Jun 18, 2025
Compiler Explorer: An Essential Kernel Playground for CUDA Developers
Have you ever wondered exactly what the CUDA compiler generates when you write GPU kernels? Ever wanted to share a minimal CUDA example with a colleague...
7 MIN READ

Jun 13, 2025
Run High-Performance LLM Inference Kernels from NVIDIA Using FlashInfer
Best-in-class LLM inference requires two key elements: speed and developer velocity. Speed refers to maximizing the efficiency of the underlying hardware by...
6 MIN READ

Jun 12, 2025
Accelerated Sequence Alignment for Protein Science with MMseqs2-GPU and NVIDIA NIM
Protein sequence alignment—comparing protein sequences for similarities—is fundamental to modern biology and medicine. It illuminates gene functions by...
9 MIN READ

Jun 11, 2025
Accelerate Decision Optimization Using Open Source NVIDIA cuOpt
Businesses make thousands of decisions every day—what to produce, where to ship, how to allocate resources. At scale, optimizing these decisions becomes a...
5 MIN READ

Jun 11, 2025
Introducing NVIDIA DGX Cloud Lepton: A Unified AI Platform Built for Developers
The age of AI-native applications has arrived. Developers are building advanced agentic and physical AI systems—but scaling across geographies and GPU...
6 MIN READ

Jun 06, 2025
How NVIDIA GB200 NVL72 and NVIDIA Dynamo Boost Inference Performance for MoE Models
The latest wave of open source large language models (LLMs), like DeepSeek R1, Llama 4, and Qwen3, has embraced Mixture of Experts (MoE) architectures. Unlike...
12 MIN READ

Jun 04, 2025
Maximizing OpenMM Molecular Dynamics Throughput with NVIDIA Multi-Process Service
Molecular dynamics (MD) simulations model atomic interactions over time and require significant computational power. However, many simulations have small...
7 MIN READ

Jun 03, 2025
NVIDIA Base Command Manager Offers Free Kickstart for AI Cluster Management
As AI and high-performance computing (HPC) workloads continue to become more common and complex, system administrators and cluster managers are at the heart of...
3 MIN READ

May 27, 2025
Upcoming Webinar: Supercharge Agentic AI with Scalable Data Flywheels
Join our live webinar on June 18 to see how NVIDIA NeMo microservices speed AI agent development.
1 MIN READ

May 27, 2025
Advanced Optimization Strategies for LLM Training on NVIDIA Grace Hopper
In the previous post, Profiling LLM Training Workflows on NVIDIA Grace Hopper, we explored the importance of profiling large language model (LLM) training...
10 MIN READ

May 27, 2025
Profiling LLM Training Workflows on NVIDIA Grace Hopper
The rapid advancements in AI have resulted in an era of exponential growth in model sizes, particularly in the domain of large language models (LLMs). These...
12 MIN READ

May 22, 2025
Spotlight: Infleqtion Optimizes Portfolios Using Q-CHOP and NVIDIA CUDA-Q Dynamics
Computing is an essential tool for the modern financial services industry. Profits are won and lost based on the speed and accuracy of algorithms guiding...
9 MIN READ

May 15, 2025
Predicting Performance on Apache Spark with GPUs
The world of big data analytics is constantly seeking ways to accelerate processing and reduce infrastructure costs. Apache Spark has become a leading platform...
9 MIN READ

May 14, 2025
Get Trained and Certified at GTC Paris at VivaTech 2025
Join us at GTC Paris on June 10th and choose from six full-day, instructor-led workshops.
1 MIN READ

May 09, 2025
CUDA C++ Compiler Updates Impacting ELF Visibility and Linkage
In the next CUDA major release, CUDA 13.0, NVIDIA is introducing two significant changes to the NVIDIA CUDA Compiler Driver (NVCC) that will impact ELF...
11 MIN READ