MatlabGPUDemo1
From FarmShare
Revision as of 09:43, 9 September 2013
Matlab GPU demos
GPU devices in Matlab are supported by the Parallel Computing Toolbox. No special setup is required: Matlab discovers CUDA devices automatically and selects one by default.
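As a quick check that a device is visible, a minimal sketch along these lines could be run at the Matlab prompt. It assumes the Parallel Computing Toolbox functions gpuDeviceCount and gpuDevice are available in the installed release; the property names are the ones that appear in the example output on this page.

```matlab
% Count the CUDA devices that Matlab can see.
n = gpuDeviceCount;
fprintf('Found %d GPU device(s)\n', n);

if n > 0
    % With no arguments, gpuDevice returns the currently selected device,
    % selecting the default one if none has been selected yet.
    d = gpuDevice;
    fprintf('%s, compute capability %s, %.1f GB total memory\n', ...
            d.Name, d.ComputeCapability, d.TotalMemory / 1e9);
end
```

If the count is 0, the GPU demos on this page will not run on that host.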
Resources:
Information can be found here: http://www.mathworks.com/products/parallel-computing/index.html
For a list of examples, see: http://www.mathworks.com/products/parallel-computing/examples.html?s_tid=brdcrb
These Matlab functions have GPU support: http://www.mathworks.com/help/distcomp/using-gpuarray.html#bsloua3-1
Example scripts: http://www.mathworks.com/help/distcomp/examples/index.html#gpu
In this example we will run the "Benchmarking A\b on the GPU" demo, found here: http://www.mathworks.com/help/distcomp/examples/benchmarking-a-b-on-the-gpu.html?prodcode=DM&language=en
Matlab commands used below:
paralleldemo_gpu_devices
paralleldemo_gpu_backslash(.75);
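For a rough idea of what such a benchmark measures, the sketch below times a single A\b solve on the CPU and on the GPU. It is not the code of paralleldemo_gpu_backslash: the matrix size N, the way the test matrix is built, and the use of the leading-order LU operation count (2/3)*N^3 to estimate gigaflops are all assumptions made for illustration.

```matlab
N = 2048;                                      % hypothetical problem size
A = rand(N, 'single') + 100 * eye(N, 'single'); % large diagonal keeps A well conditioned
b = rand(N, 1, 'single');

% Time the solve on the CPU.
tic;
x = A \ b;
tCPU = toc;

% Copy the data to the GPU, then time the same solve there.
Ag = gpuArray(A);
bg = gpuArray(b);
dev = gpuDevice;
tic;
xg = Ag \ bg;
wait(dev);                                     % block until the GPU has finished
tGPU = toc;

flops = (2/3) * N^3;                           % assumed leading-order operation count
fprintf('Gigaflops on CPU: %f\n', flops / tCPU / 1e9);
fprintf('Gigaflops on GPU: %f\n', flops / tGPU / 1e9);

% Compare the two solutions after bringing the GPU result back to host memory.
maxDiff = max(abs(x - gather(xg)));
fprintf('Max difference between CPU and GPU solutions: %g\n', maxDiff);
```

A single timed run like this is noisy, and the first GPU call typically includes one-time initialization, so repeating the measurement and discarding the first run gives more representative numbers.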
Example output
Here we launch Matlab and run paralleldemo_gpu_devices to print the CUDA device that Matlab discovered, then run the A\b demo with paralleldemo_gpu_backslash(.75).
$ module load matlab
$ matlab -nodesktop
Warning: No display specified. You will not be able to display graphics on the screen.
Warning: No window system found. Java option 'MWT' ignored.

< M A T L A B (R) >
Copyright 1984-2013 The MathWorks, Inc.
R2013a (8.1.0.604) 64-bit (glnxa64)
February 15, 2013

No window system found. Java option 'MWT' ignored.

To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

>> paralleldemo_gpu_devices

numDevices =
     1

origDevice =
  CUDADevice with properties:
    Name: 'Tesla C2070'
    Index: 1
    ComputeCapability: '2.0'
    SupportsDouble: 1
    DriverVersion: 5.5000
    ToolkitVersion: 5
    MaxThreadsPerBlock: 1024
    MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
    MaxGridSize: [65535 65535 65535]
    SIMDWidth: 32
    TotalMemory: 5.6366e+09
    FreeMemory: 5.5344e+09
    MultiprocessorCount: 14
    ClockRateKHz: 1147000
    ComputeMode: 'Default'
    GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 0
    CanMapHostMemory: 1
    DeviceSupported: 1
    DeviceSelected: 1

device =
  CUDADevice with properties:
    Name: 'Tesla C2070'
    Index: 1
    ComputeCapability: '2.0'
    SupportsDouble: 1
    DriverVersion: 5.5000
    ToolkitVersion: 5
    MaxThreadsPerBlock: 1024
    MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
    MaxGridSize: [65535 65535 65535]
    SIMDWidth: 32
    TotalMemory: 5.6366e+09
    FreeMemory: 5.5344e+09
    MultiprocessorCount: 14
    ClockRateKHz: 1147000
    ComputeMode: 'Default'
    GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 0
    CanMapHostMemory: 1
    DeviceSupported: 1
    DeviceSelected: 1

>> paralleldemo_gpu_backslash(.75);
Starting benchmarks with 13 different single-precision matrices of sizes ranging from 1024-by-1024 to 13312-by-13312.
Creating a matrix of size 1024-by-1024.
Gigaflops on CPU: 34.472190
Gigaflops on GPU: 56.288799
Creating a matrix of size 2048-by-2048.
Gigaflops on CPU: 49.891778
Gigaflops on GPU: 106.760173
Creating a matrix of size 3072-by-3072.
Gigaflops on CPU: 64.997307
Gigaflops on GPU: 197.257665
Creating a matrix of size 4096-by-4096.
Gigaflops on CPU: 70.944260
Gigaflops on GPU: 266.873255
Creating a matrix of size 5120-by-5120.
Gigaflops on CPU: 84.640804
Gigaflops on GPU: 319.151358
Creating a matrix of size 6144-by-6144.
Gigaflops on CPU: 92.799236
Gigaflops on GPU: 355.467871
Creating a matrix of size 7168-by-7168.
Gigaflops on CPU: 98.141367
Gigaflops on GPU: 388.194551
Creating a matrix of size 8192-by-8192.
Gigaflops on CPU: 102.462204
Gigaflops on GPU: 405.167131
Creating a matrix of size 9216-by-9216.
Gigaflops on CPU: 98.400070
Gigaflops on GPU: 419.867571
Creating a matrix of size 10240-by-10240.
Gigaflops on CPU: 96.734765
Gigaflops on GPU: 434.993371
Creating a matrix of size 11264-by-11264.
Gigaflops on CPU: 112.294056
Gigaflops on GPU: 439.164558
Creating a matrix of size 12288-by-12288.
Gigaflops on CPU: 115.434767
Gigaflops on GPU: 440.911860
Creating a matrix of size 13312-by-13312.
Gigaflops on CPU: 115.826290
Gigaflops on GPU: 460.198654
Starting benchmarks with 9 different double-precision matrices of sizes ranging from 1024-by-1024 to 9216-by-9216.
Creating a matrix of size 1024-by-1024.
Gigaflops on CPU: 14.479196
Gigaflops on GPU: 21.906035
Creating a matrix of size 2048-by-2048.
Gigaflops on CPU: 27.758668
Gigaflops on GPU: 70.264055
Creating a matrix of size 3072-by-3072.
Gigaflops on CPU: 35.325472
Gigaflops on GPU: 110.924771
Creating a matrix of size 4096-by-4096.
Gigaflops on CPU: 41.316066
Gigaflops on GPU: 151.816138
Creating a matrix of size 5120-by-5120.
Gigaflops on CPU: 47.203079
Gigaflops on GPU: 182.013352
Creating a matrix of size 6144-by-6144.
Gigaflops on CPU: 50.618165
Gigaflops on GPU: 203.495957
Creating a matrix of size 7168-by-7168.
Gigaflops on CPU: 53.713014
Gigaflops on GPU: 220.657206
Creating a matrix of size 8192-by-8192.
Gigaflops on CPU: 54.993392
Gigaflops on GPU: 225.368964
Creating a matrix of size 9216-by-9216.
Gigaflops on CPU: 56.978938
Gigaflops on GPU: 237.973215
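
To put the output above in perspective, the GPU-to-CPU ratio can be computed directly from the reported gigaflops. The sketch below does this for the smallest and largest matrix of each precision; the numbers are copied from the output above, while the variable names and the choice of which rows to include are illustrative.

```matlab
% Gigaflops copied from the benchmark output: columns are [CPU, GPU].
results = [ 34.472190   56.288799;    % single precision, 1024-by-1024
           115.826290  460.198654;    % single precision, 13312-by-13312
            14.479196   21.906035;    % double precision, 1024-by-1024
            56.978938  237.973215];   % double precision, 9216-by-9216
labels = {'single 1024', 'single 13312', 'double 1024', 'double 9216'};

% Element-wise ratio of GPU throughput to CPU throughput.
speedup = results(:, 2) ./ results(:, 1);

for k = 1:numel(labels)
    fprintf('%-14s GPU/CPU speedup: %.2fx\n', labels{k}, speedup(k));
end
```

In this run the GPU's advantage is noticeably larger at the largest matrix sizes than at 1024-by-1024, for both single and double precision.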