---------------------------------------------------------------------------------
1. How to enable CUDA

The CUDA implementation in the package is not compiled by default.

To enable it for Visual Studio 2005, use vc/SiftGPU_CUDA_Enabled.sln
To enable it on other operating systems, set siftgpu_enable_cuda to 1 in the makefile.

---------------------------------------------------------------------------------
2. Change CUDA build parameters
On Windows, change the settings in the custom build command line of
ProgramCU.cu. For example, add -use_fast_math to use fast math.

On other operating systems, change the makefile. The top part of the makefile is
the configuration section, which includes:
	siftgpu_enable_cuda = 0             (Set to 1 to enable CUDA-based SiftGPU)
	CUDA_INSTALL_PATH = /usr/local/cuda (Where to find CUDA)
	siftgpu_cuda_options = -arch sm_10  (Additional CUDA compiler options)

------------------------------------------------------------------------------------
3. CUDA runtime parameters for SiftGPU::ParseParam
First, you need to specify "-cuda" to use CUDA-based SiftGPU. More parameters can
be changed at runtime in the CUDA-based SiftGPU than in the OpenGL-based version.
Check the manual for details.

NEW. You can choose the GPU for CUDA computation by using "-cuda [device_index=0]"

One parameter for CUDA is "-di", which controls whether dynamic indexing is used
in descriptor generation. It is turned off by default. My experiments on an 8800
GTX show that an unrolled loop of 8 if-assigns is faster than dynamic indexing, but
it might be different on other GPUs.
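
To make the two styles concrete, here is a minimal host-side C++ sketch. It is not
the SiftGPU kernel code: the function names are hypothetical, and the 8-entry array
is an assumption about what the "8 if-assigns" address. On a GPU, a dynamically
indexed per-thread array generally cannot be kept in registers, which is a commonly
cited reason why the unrolled form can win there.

```cpp
#include <cassert>

// Dynamic indexing: the bin index is computed at run time and used
// directly as an array subscript.
void accumulate_dynamic(float bins[8], int idx, float weight) {
    assert(idx >= 0 && idx < 8);
    bins[idx] += weight;
}

// Unrolled if-assigns: every bin is addressed with a compile-time
// constant index, and exactly one of the eight conditions matches.
void accumulate_unrolled(float bins[8], int idx, float weight) {
    assert(idx >= 0 && idx < 8);
    if (idx == 0) bins[0] += weight;
    if (idx == 1) bins[1] += weight;
    if (idx == 2) bins[2] += weight;
    if (idx == 3) bins[3] += weight;
    if (idx == 4) bins[4] += weight;
    if (idx == 5) bins[5] += weight;
    if (idx == 6) bins[6] += weight;
    if (idx == 7) bins[7] += weight;
}
```

Both functions produce the same result; only the way the array is addressed differs.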
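
The runtime options of this section can be collected into an argument list before
they are handed to ParseParam. The sketch below is only an illustration: the helper
names make_cuda_sift_args and to_argv are hypothetical, and the ParseParam call in
the trailing comment assumes an (argc, argv) style signature, so check SiftGPU.h
and the manual for the exact form.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of SiftGPU): builds the option list
// "-cuda <device_index>" and, if requested, "-di".
std::vector<std::string> make_cuda_sift_args(int device_index, bool dynamic_indexing) {
    assert(device_index >= 0);
    std::vector<std::string> args;
    args.push_back("-cuda");                       // use the CUDA-based SiftGPU
    args.push_back(std::to_string(device_index));  // optional device index, default 0
    if (dynamic_indexing) {
        args.push_back("-di");                     // dynamic indexing, off by default
    }
    return args;
}

// Converts the strings to argv form. The pointers stay valid only while
// `args` is alive and unmodified.
std::vector<char*> to_argv(std::vector<std::string>& args) {
    std::vector<char*> argv;
    for (std::string& a : args) {
        argv.push_back(&a[0]);
    }
    return argv;
}

// Possible use with SiftGPU (assumed signature, see the lead-in):
//   SiftGPU sift;
//   std::vector<std::string> args = make_cuda_sift_args(0, false);
//   std::vector<char*> argv = to_argv(args);
//   sift.ParseParam(static_cast<int>(argv.size()), argv.data());
```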
--------------------------------------------------------------------------------------
4. Speed of CUDA-based SiftGPU
If the size of the first octave (multiply the original size by 2 if upsampling is used)
is less than or around 1024x768, the CUDA version will be faster than the OpenGL versions;
otherwise the OpenGL versions are still faster.

**************************************************************************************
This was observed on an NVIDIA 8800 GTX and might be different on other GPUs. Recent
experiments on a GTX 280 show that the CUDA version is not as fast as the OpenGL version.

Note: the thread block settings are currently tuned for the NVIDIA 8800 GTX
      and may not be optimal for other GPUs.
**************************************************************************************
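
The rule of thumb above can be written down as a small check. This is a rough
sketch only: the function name is hypothetical, "around 1024x768" is interpreted
as a pixel count of at most 1024*768 (an assumption), and the result reflects the
8800 GTX observation, which did not hold on the GTX 280.

```cpp
#include <cassert>

// Hypothetical helper: returns true when the first octave is no larger than
// 1024x768 pixels, the range in which the CUDA version was observed to be
// faster on an 8800 GTX.
bool cuda_likely_faster(int width, int height, bool upsample) {
    assert(width > 0 && height > 0);
    const long long scale = upsample ? 2 : 1;  // upsampling doubles each dimension
    const long long first_octave_pixels = (width * scale) * (height * scale);
    const long long threshold = 1024LL * 768LL;
    return first_octave_pixels <= threshold;
}
```

For example, a 640x480 image falls under the threshold without upsampling, but its
upsampled first octave of 1280x960 does not. Measuring both versions on the target
GPU remains the more reliable way to choose.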