------------------------------------------------------------------------
SPEED
------------------------------------------------------------------------
1. The first several rounds of SiftGPU might be slow; this is expected. If it
is always slow, the likely cause is either an old graphics card or too little
graphics memory.

2. Texture reallocation happens when a new image does not fit in the already
allocated storage. To reduce reallocations, you can pre-allocate storage
to fit the largest image size, or use images that all have the same size.

3. Loading compressed images (e.g. jpg) may spend a lot of time on
decompression. Using binary pgm files or directly specifying data in memory
can achieve better performance. Writing ASCII SIFT files is also slow.

4. The packed version not only saves GPU memory but also runs faster than the
unpacked one, which is default now.

5. SiftGPU might be faster with older graphics card drivers than with newer ones.

6. The descriptor normalization in the OpenGL-based implementation runs on
the CPU. New versions now use SSE, which speeds up this part a lot.

7. The orientation computation in the unpacked implementation is occasionally
slow under single orientation mode (-m 1) or packed orientation mode (-m2p).
By default, SiftGPU uses 2 orientations (-m 2), which should be fine. This
issue is still unresolved.

8. The thread block settings in the CUDA-based SiftGPU are currently tuned
for my GPU, an nVidia GTX 8800, and may not be optimal for other GPUs.

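As a minimal sketch of the pre-allocation advice in item 2 above, the Python
snippet below picks one storage size that covers every image in a batch, so
that no later image forces a reallocation. The function name and the
(width, height) tuples are illustrative assumptions and are not part of
SiftGPU; the resulting size would be handed to whatever pre-allocation option
your SiftGPU build provides.

```python
def largest_image_size(sizes):
    """Return (max_width, max_height) over an iterable of (width, height).

    Hypothetical helper, not a SiftGPU function. Width and height are
    maximised independently so that both landscape and portrait images fit.
    """
    sizes = list(sizes)
    if not sizes:
        raise ValueError("need at least one image size")
    max_w = max(w for w, _ in sizes)
    max_h = max(h for _, h in sizes)
    return max_w, max_h


# Example batch: sizes are made up for illustration.
batch = [(640, 480), (1024, 768), (800, 1200)]
print(largest_image_size(batch))  # (1024, 1200)
```

Note that the result need not match any single image in the batch: here the
width comes from the second image and the height from the third.
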
----------------------------------------------------------------------------
ACCURACY
----------------------------------------------------------------------------
1. The latest version of SiftGPU now has accuracy comparable to CPU
implementations. Evaluation on box.pgm from Lowe's package now gives around
600 matches, which is close to SIFT++.

2. In orientation computation, SiftGPU uses a factor 2.0 * sigma as the sample
window size, which is smaller than the typical value 3.0. Changing it from 2.0
to 3.0 reduces the speed of this step by 40%, but gives only a very small
improvement in matching. You can change it by specifying the parameter "-w 3".

3. In keypoint localization, SiftGPU refines the location only once by default.
You can change this by specifying "-s n" to get n iterations (only available
in the cg unpacked implementation). SiftGPU does not move keypoints to a
different level during the refinement.

4. By default, the feature locations have a (0.5, 0.5) offset compared with
CPU implementations: (0, 0) in the texture is at the top-left corner (instead
of the center) of the top-left pixel. You can use the center as (0, 0) by
specifying "-loweo".

5. By default, SiftGPU does not do upsampling (-fo -1). To match Lowe's
implementation you need to use "-fo -1 -loweo".

6. SiftGPU may get slightly different results on different GPUs due to
differences in floating point precision. SiftGPU has been tested on only a
limited set of graphics cards and operating systems, so it is not guaranteed
to work on your graphics card.

If it returns a different number of features on different runs on the same
card, then something is going wrong, and probably some special tricks need to
be used. Please email me if that happens.

7. When getting wrong matches, please look at the saved SIFT file to make sure
there are no weird descriptors (for example, all of the numbers of a descriptor
are 45, or any number in a descriptor is larger than 255).

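The check described in item 7 above can be automated. The following Python
sketch is only an illustration under stated assumptions: it assumes the saved
file uses a Lowe-style ASCII layout (a "<count> <dim>" header, then for each
feature four keypoint numbers followed by <dim> descriptor values), and both
function names are hypothetical. It flags any descriptor with a value above
255, and any descriptor whose values are all identical, which generalizes the
"all 45" example.

```python
def parse_ascii_sift(text):
    """Parse ASCII SIFT text into a list of descriptors.

    Assumed layout (not confirmed by these notes): "<count> <dim>" header,
    then per feature four keypoint fields followed by <dim> values.
    """
    tokens = text.split()
    count, dim = int(tokens[0]), int(tokens[1])
    stride = 4 + dim
    body = tokens[2:]
    if len(body) < count * stride:
        raise ValueError("file is shorter than its header claims")
    descriptors = []
    for i in range(count):
        start = i * stride + 4  # skip the four keypoint fields
        descriptors.append([float(v) for v in body[start:start + dim]])
    return descriptors


def find_weird_descriptors(descriptors, max_value=255):
    """Return (index, reason) for each descriptor that looks corrupted."""
    problems = []
    for i, d in enumerate(descriptors):
        if any(v > max_value for v in d):
            problems.append((i, "value above %d" % max_value))
        elif len(d) > 1 and len(set(d)) == 1:
            problems.append((i, "all values equal %g" % d[0]))
    return problems


# Tiny made-up sample with 4-value descriptors to keep the example short.
sample = """3 4
1.0 2.0 1.5 0.1
10 20 30 40
3.0 4.0 1.5 0.2
45 45 45 45
5.0 6.0 1.5 0.3
0 300 0 0
"""
print(find_weird_descriptors(parse_ascii_sift(sample)))
# [(1, 'all values equal 45'), (2, 'value above 255')]
```

An empty result does not prove the features are correct; it only rules out
the two symptoms named in item 7.
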
------------------------------------------------------------------------------
KNOWN ISSUES
------------------------------------------------------------------------------
1. SiftGPU may have problems with dual monitors.
2. Slow on 7950. Changing GlobalParam::_iTexFormat to GL_RGBA16F_ARB can make
   it work. The reason is unknown.
3. Experiments on 8600 show problems. It works fine for the first image, but
   gets wrong keypoints afterwards.
89