News

NVIDIA has announced TensorRT-LLM for Windows. This open-source library will allow PC developers with NVIDIA GeForce RTX graphics cards to boost the performance of LLMs by up to four times.
NVIDIA and Stability AI optimized Stable Diffusion 3.5 with FP8 quantization and TensorRT, reducing VRAM requirements by 40% and boosting performance, so the model can run on a broader range of GeForce RTX 50 Series GPUs.
The Bing Search team shared how it made Bing Search and Bing’s Deep Search faster, more accurate, and more cost-effective by transitioning to small language models (SLMs) and integrating TensorRT-LLM.