NVIDIA Tesla V100 welcomes you to the era of AI, where finding the insights hidden in oceans of data can transform entire industries: from personalized cancer therapy, to helping virtual personal assistants converse naturally, to predicting the next big hurricane.
The NVIDIA® Tesla® V100 Tensor Core GPU is the most advanced data center GPU ever built to accelerate AI, High Performance Computing (HPC), and graphics. It is powered by the NVIDIA Volta architecture, comes in 16 GB and 32 GB configurations, and offers the performance of up to 100 CPUs in a single GPU. Data scientists, researchers, and engineers can now spend less time optimizing memory usage and more time designing the next AI breakthrough.
Why is it so special?
With 640 Tensor Cores, Tesla V100 is the world’s first GPU to break the 100 teraFLOPS (TFLOPS) barrier of deep learning performance. The next generation of NVIDIA NVLink™ connects multiple V100 GPUs at up to 300 GB/s to create the world’s most powerful computing servers. AI models that would consume weeks of computing resources on previous systems can now be trained in a few days. With this dramatic reduction in training time, a whole new world of problems will now be solvable with AI.
It is engineered to provide maximum performance in existing hyperscale server racks. With AI at its core, the Tesla V100 GPU delivers 47X higher inference performance than a CPU server. This giant leap in throughput and efficiency makes the scale-out of AI services practical.
It is engineered for the convergence of AI and HPC. It offers a platform for HPC systems to excel at both computational science for scientific simulation and data science for finding insights in data. By pairing NVIDIA CUDA® cores and Tensor Cores within a unified architecture, a single server with Tesla V100 GPUs can replace hundreds of commodity CPU-only servers for both traditional HPC and AI workloads. Every researcher and engineer can now afford an AI supercomputer to tackle their most challenging work.
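The pairing of CUDA cores and Tensor Cores mentioned above comes down to a numerical trick: a Tensor Core performs a fused matrix multiply-accumulate on small tiles, taking FP16 inputs but accumulating the products in FP32. A minimal NumPy sketch of that mixed-precision behavior (an illustration of the arithmetic only, not the hardware API; the 4x4 tile size matches the Volta Tensor Core operation):

```python
import numpy as np

# A Tensor Core computes D = A @ B + C on 4x4 tiles:
# A and B are FP16, while the accumulation happens in FP32.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)).astype(np.float16)
B = rng.standard_normal((4, 4)).astype(np.float16)
C = rng.standard_normal((4, 4)).astype(np.float32)

# Emulate the mixed-precision contract: FP16 inputs, FP32 accumulation.
D = A.astype(np.float32) @ B.astype(np.float32) + C

print(D.shape, D.dtype)
```

Keeping the inputs in FP16 halves memory traffic, while accumulating in FP32 preserves enough precision for deep learning training, which is why frameworks expose this as "mixed precision" rather than pure half precision.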
Tech Specs – NVIDIA Tesla V100
Tesla V100 Specifications
- 5120 CUDA cores
- 640 New Tensor Cores
- 7.5 TeraFLOPS double-precision performance with NVIDIA GPU Boost
- 15 TeraFLOPS single-precision performance with NVIDIA GPU Boost
- 120 TeraFLOPS mixed-precision deep learning performance with NVIDIA GPU Boost
- 300 GB/s bi-directional interconnect bandwidth with NVIDIA NVLink
- 900 GB/s memory bandwidth with CoWoS HBM2 Stacked Memory
- 16 or 32 GB of CoWoS HBM2 Stacked Memory
- 300 Watt max power consumption
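The headline TFLOPS figures above follow from the core counts and the clock rate. A back-of-the-envelope check, assuming a ~1.53 GHz GPU Boost clock (the clock is not stated in this spec list, so treat it as an assumption):

```python
# Rough peak-throughput arithmetic for Tesla V100.
# Assumption: ~1.53 GHz GPU Boost clock (not given in the spec list above).
boost_clock_hz = 1.53e9

cuda_cores = 5120
# Each CUDA core performs 1 FMA = 2 floating-point ops per clock in FP32.
fp32_tflops = cuda_cores * 2 * boost_clock_hz / 1e12
print(f"FP32:   {fp32_tflops:.1f} TFLOPS")    # ~15.7, matching the 15 TFLOPS figure

# FP64 throughput on Volta is half the FP32 rate.
fp64_tflops = fp32_tflops / 2
print(f"FP64:   {fp64_tflops:.1f} TFLOPS")    # ~7.8, in line with the 7.5 TFLOPS figure

tensor_cores = 640
# Each Tensor Core does a 4x4x4 matrix FMA per clock: 64 FMAs = 128 ops.
tensor_tflops = tensor_cores * 128 * boost_clock_hz / 1e12
print(f"Tensor: {tensor_tflops:.0f} TFLOPS")  # ~125, near the quoted 120 TFLOPS
```

The small gaps between these estimates and the quoted numbers come from which boost clock the datasheet assumes; the structure of the arithmetic is the point.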
Tesla V100 Form Factors
- Tesla V100 for NVLink: Ultimate performance for deep learning
- Tesla V100 for PCIe: Highest versatility for all workloads