V-Ray GPU Benchmarks on Top-of-the-Line NVIDIA GPUs

Introduction:

V-Ray’s GPU rendering and NVIDIA’s hardware are constantly improving. Recently, there have been major advances in both, so we thought now would be the perfect time to run new benchmarks and find out how much faster everything might be.

The hardware:

With 40 logical CPU cores and 128GB RAM, the Lenovo P900 is powerful. It’s great for GPU tests, since there’s space for three double-slot GPUs and one single-slot GPU. Plus, the toolless chassis makes it quick to pop cards in and out. The tests felt like an F1 pitstop for GPUs.

The GPUs we decided to test are as follows:

 

| GPU | Architecture | Cores | RAM type | RAM | Power | Slots | Street Price* |
|---|---|---|---|---|---|---|---|
| GP100 | Pascal | 3584 | HBM2 | 16GB | 235W | 2 | N/A |
| P6000 | Pascal | 3840 | GDDR5X | 24GB | 250W | 2 | $4,699 |
| P5000 | Pascal | 2560 | GDDR5X | 16GB | 180W | 2 | $2,499 |
| P4000 | Pascal | 1792 | GDDR5X | 8GB | 105W | 1 | N/A |
| M6000 | Maxwell | 3072 | GDDR5 | 24GB | 250W | 2 | $4,539 |
| Titan X (Pascal) | Pascal | 3584 | GDDR5X | 12GB | 250W | 2 | $1,599 |

*Street prices are approximate, based on a quick search on Newegg and Amazon. The GP100 and P4000 are not publicly available yet, so no pricing is listed.

The benchmark test:

Even before the benchmarks started, we were very interested to see NVIDIA’s new NVLink tech in action. Because NVLink allows cards to share memory, we were curious to see what sort of performance we could get using two new GP100s. More on this later.
 
Our lead GPU developer, Blago Taskov, and I set up the benchmarks. To get better data, we decided to test multiple scenes instead of just one. We batch rendered nine different scenes and recorded the time to complete each one. Then we added up the total time for all nine.
 
Here are the results:
| GPU | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 | Test 6 | Test 7 | Test 8 | Test 9 | Total time |
|---|---|---|---|---|---|---|---|---|---|---|
| GP100 x 2 | 46.49 | 130.36 | 156.69 | 29.43 | 112.99 | 39.88 | 40.21 | 107.75 | 19.94 | 683.74 |
| GP100 | 90.72 | 251.81 | 295.52 | 50.84 | 220.51 | 77.72 | 76.94 | 202.28 | 38.02 | 1304.36 |
| P6000 | 127.21 | 363.18 | 410.72 | 72.17 | 348.99 | 131.39 | 109.64 | 264.82 | 61.83 | 1889.95 |
| P5000 | 188.18 | 536.69 | 594.92 | 101.77 | 531.6 | 198.02 | 158.83 | 374.65 | 90.98 | 2775.64 |
| P4000 | 212.54 | 636.84 | 724.22 | 131.86 | 565.83 | 207.79 | 178.6 | 455.61 | 104.62 | 3217.91 |
| M6000 | 140.13 | 483.71 | 538.86 | 97.59 | 423.11 | 159.04 | 134.79 | 351.91 | 73.15 | 2402.29 |
| Titan X (Pascal) | 139.25 | 350.49 | 371.32 | 75.45 | 351.15 | 139.82 | 107.4 | 254.62 | 59.53 | 1849.03 |
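
As a quick sanity check on the totals, here is a minimal Python sketch that re-adds the nine per-scene times for each card. The numbers are copied from the results table; the dictionary layout and the `total_times` helper are our own illustrative names, not part of V-Ray or the benchmark setup.

```python
# Per-scene render times copied from the results table (units as reported there).
TIMES = {
    "GP100 x 2": [46.49, 130.36, 156.69, 29.43, 112.99, 39.88, 40.21, 107.75, 19.94],
    "GP100": [90.72, 251.81, 295.52, 50.84, 220.51, 77.72, 76.94, 202.28, 38.02],
    "P6000": [127.21, 363.18, 410.72, 72.17, 348.99, 131.39, 109.64, 264.82, 61.83],
    "P5000": [188.18, 536.69, 594.92, 101.77, 531.6, 198.02, 158.83, 374.65, 90.98],
    "P4000": [212.54, 636.84, 724.22, 131.86, 565.83, 207.79, 178.6, 455.61, 104.62],
    "M6000": [140.13, 483.71, 538.86, 97.59, 423.11, 159.04, 134.79, 351.91, 73.15],
    "Titan X (Pascal)": [139.25, 350.49, 371.32, 75.45, 351.15, 139.82, 107.4, 254.62, 59.53],
}


def total_times(per_scene):
    """Return the summed render time per card, rounded to two decimals."""
    return {card: round(sum(values), 2) for card, values in per_scene.items()}


totals = total_times(TIMES)

# Print the cards from fastest to slowest total time.
for card, total in sorted(totals.items(), key=lambda item: item[1]):
    print(f"{card:<18}{total:>10.2f}")
```

Sorting by total time puts the dual GP100 setup first and the P4000 last, matching the "Total time" column.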


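The table below compares the cards by total render time. Each cell is the column card’s total time divided by the row card’s total time, so a value above 1 means the row card is that many times faster than the column card: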

| GPU | GP100 x 2 | GP100 | P6000 | P5000 | P4000 | M6000 | Titan X (Pascal) |
|---|---|---|---|---|---|---|---|
| GP100 x 2 | 1 | 1.907684 | 2.764135 | 4.059496 | 4.706336 | 3.513455 | 2.7042882 |
| GP100 | 0.524196 | 1 | 1.448948 | 2.127971 | 2.467041 | 1.841738 | 1.4175764 |
| P6000 | 0.361777 | 0.690156 | 1 | 1.468631 | 1.702643 | 1.271087 | 0.9783486 |
| P5000 | 0.246336 | 0.469931 | 0.680906 | 1 | 1.15934 | 0.86549 | 0.6661635 |
| P4000 | 0.21248 | 0.405344 | 0.587322 | 0.86256 | 1 | 0.746537 | 0.5746059 |
| M6000 | 0.28462 | 0.542965 | 0.786728 | 1.155414 | 1.339518 | 1 | 0.7696947 |
| Titan X (Pascal) | 0.369783 | 0.705429 | 1.022131 | 1.501133 | 1.740323 | 1.299216 | 1 |
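
For readers who want to reproduce the comparison, the ratios appear to be each column card’s total time divided by the row card’s total time. The sketch below rebuilds the matrix from the published totals under that assumption; `speedup_matrix` is a hypothetical helper name of ours.

```python
# Total render times copied from the "Total time" column of the results table.
TOTALS = {
    "GP100 x 2": 683.74,
    "GP100": 1304.36,
    "P6000": 1889.95,
    "P5000": 2775.64,
    "P4000": 3217.91,
    "M6000": 2402.29,
    "Titan X (Pascal)": 1849.03,
}


def speedup_matrix(totals):
    """For each (row, column) pair, return column total time / row total time.

    Values above 1 mean the row card finished faster than the column card.
    """
    return {
        row: {col: totals[col] / totals[row] for col in totals}
        for row in totals
    }


matrix = speedup_matrix(TOTALS)

# How much faster are two GP100s than one, and one GP100 than an M6000?
print(f"GP100 x 2 vs GP100: {matrix['GP100 x 2']['GP100']:.4f}")
print(f"GP100 vs M6000:     {matrix['GP100']['M6000']:.4f}")
```

Read this way, the second GP100 yields roughly a 1.91x speedup over a single card, and a single GP100 is roughly 1.84x faster than the M6000.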

A note about RAM:

RAM plays a big part in the value of these cards. For example, the Titan X (Pascal) and P6000 showed similar times across all the tests. On some the Titan X was faster, and on others the P6000 beat it outright. In overall time, the Titan X narrowly edged out the P6000. But that’s not the whole story. While both cards were neck and neck in speed, the choice (and cost) comes down to RAM. The Titan X is significantly less expensive with 12GB of RAM, but the P6000 can fit much more data with its 24GB of RAM. You might be able to give yourself a little more breathing room on that 12GB card with V-Ray 3.5’s On-demand Mip-mapping, which would dramatically reduce the RAM required to load textures. Ultimately, it comes down to your budget and how much memory you really need.
 
Let’s say you want to render a huge scene with lots of geometry and textures. If you need more than 24GB, that’s where NVLink comes in.

What is NVLink?

Currently, GP100s are the only cards to support NVLink. They use special HBM2 memory that is so fast that it can be shared across cards. It may look similar to SLI, but it’s not the same. In our setup we connected two GP100s. In theory, with specialized hardware, it’s possible to link more. For example, NVIDIA’s DGX-1 does this with eight P100 GPUs. But at $129,000 it’s a little out of our price range. We’re looking forward to testing that one. When we do, we’ll be sure to share the results.

V-Ray and NVLink:

We’ve enabled NVLink in the latest V-Ray nightly builds. To test it, we enlisted the help of our friends at Dabarti Studio, and they created this torture test.

Model and assets courtesy of Dabarti with 169 million polygons and 150+ 6k textures

 
This scene contains 169 million polygons and over 150 6K textures. The geometry alone won’t fit on a single card, not to mention all those high-res textures.
 
Time to render. First, we set all objects to Dynamic Geometry in the V-Ray Properties. This made it possible for the geometry to be shared across the cards. Then, we disabled On-demand Mip-mapping to force the full resolution textures to load. Once the cards were fully loaded, each one used 13GB of its 16GB RAM. That’s a total of 26GB RAM on both cards – more than the 24GB a P6000 can hold.
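
The memory arithmetic above can be written out as a tiny Python sketch. It assumes a deliberately simple additive model (per-card usage times number of cards) and is not a description of how V-Ray or NVLink actually allocate memory; both function names are hypothetical.

```python
def combined_usage_gb(used_per_card_gb, num_cards):
    """Total memory in use across all linked cards (simple additive model)."""
    return used_per_card_gb * num_cards


def fits_on_single_card(required_gb, card_capacity_gb):
    """True if the whole data set would fit in one card's memory."""
    return required_gb <= card_capacity_gb


# Figures from the test: two GP100s, each using 13GB of its 16GB.
scene_gb = combined_usage_gb(used_per_card_gb=13, num_cards=2)

print(f"Scene footprint: {scene_gb}GB")
print(f"Fits on one 16GB GP100? {fits_on_single_card(scene_gb, 16)}")
print(f"Fits on one 24GB P6000? {fits_on_single_card(scene_gb, 24)}")
```

Under this model the 26GB scene exceeds both a single GP100 and a single P6000, which is why it needed the two linked cards.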
 
It worked, and we noticed little or no performance loss with NVLink. It’s still early, but the initial results are positive. Maybe with a few driver updates and V-Ray tweaks, NVLink will perform even better in the future.

Conclusion:

Moore’s Law is alive and well. The M6000 arrived about two years ago, and today the GP100 is almost twice as fast – right on schedule. The combination of NVIDIA’s latest tech and V-Ray’s most recent advances in GPU rendering seems to remove some of the early memory limitations. And that paints a bright future for GPU rendering. We will keep testing and share updates as we get new hardware to benchmark.

Special thanks:

Thanks to NVIDIA for loaning us their latest and greatest hardware for stress testing. Also, thanks to Lenovo for supplying Chaos Group Labs with a workstation that can handle some serious computing. And thanks to Tomasz Wyszolmirski at Dabarti Studio for helping us continue to push GPU rendering to its limits.
