------------------------------------------------------------------------
SPEED
------------------------------------------------------------------------
1. The first several rounds of SiftGPU might be slow; this is expected. If it
is always slow, the likely cause is an old graphics card or a small amount of
graphics memory.

2. Texture reallocation happens when a new image does not fit in the already
allocated storage. To reduce reallocations, you can pre-allocate storage
to fit the largest image size, or just use images of the same size.

3. Loading compressed images (e.g. jpg) may spend a lot of time on
decompression. Using binary pgm files or directly specifying data in memory
can achieve better performance. Writing ASCII SIFT files is also slow.

4. The packed version saves GPU memory and also runs faster than the unpacked
version; it is now the default.

5. SiftGPU might be faster with older graphics card drivers than with newer ones.

6. The descriptor normalization in the OpenGL-based implementation runs on
the CPU. New versions now use SSE, which speeds up this part considerably.

7. The orientation computation in the unpacked implementation is occasionally
slow under single orientation mode (-m 1) or packed orientation mode (-m2p). By
default, SiftGPU uses 2 orientations (-m 2), which should be fine. This issue
is still unresolved.

8. The thread block settings in the CUDA-based SiftGPU are currently tuned
for my GPU, an nVidia GTX 8800, and may not be optimal for other GPUs.
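Items 2 and 3 above suggest pre-allocating for the largest image and handing
pixel data over from memory. The following is a minimal Python sketch of the
bookkeeping that needs, assuming 8-bit binary (P5) PGM input. The helper names
parse_pgm and largest_size are made up for illustration and are not part of
SiftGPU; how the resulting size and pixel buffer are passed on depends on your
SiftGPU build and is not shown.

```python
import re

# Header of a binary PGM: magic "P5", then width, height and maxval,
# separated by whitespace or '#' comment lines, then ONE whitespace byte.
_SEP = rb"(?:\s|#[^\n]*\n)+"
_HEADER = re.compile(
    rb"P5" + _SEP + rb"(\d+)" + _SEP + rb"(\d+)" + _SEP + rb"(\d+)\s"
)


def parse_pgm(data):
    """Return (width, height, pixels) for an 8-bit binary PGM held in memory."""
    m = _HEADER.match(data)
    if m is None:
        raise ValueError("not a binary (P5) PGM")
    width, height, maxval = (int(g) for g in m.groups())
    if maxval > 255:
        raise ValueError("only 8-bit PGM is handled in this sketch")
    pixels = data[m.end():m.end() + width * height]
    if len(pixels) != width * height:
        raise ValueError("truncated pixel data")
    return width, height, pixels


def largest_size(sizes):
    """Smallest (width, height) that fits every (width, height) in sizes."""
    sizes = list(sizes)
    return max(w for w, _ in sizes), max(h for _, h in sizes)
```

Collecting the sizes of all input images first and pre-allocating once for
largest_size(...) is one way to avoid reallocation part-way through a batch.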
----------------------------------------------------------------------------
ACCURACY
----------------------------------------------------------------------------
1. The latest version of SiftGPU now has accuracy comparable to CPU
implementations. Evaluation on box.pgm from Lowe's package now gives around 600
matches, which is close to SIFT++.

2. In orientation computation, SiftGPU uses a factor of 2.0 * sigma as the
sample window size, which is smaller than the typical value of 3.0. Changing it
from 2.0 to 3.0 reduces the speed of this step by 40%, but gives only a very
small improvement in matching. You can change it by specifying the parameter
"-w 3".

3. In keypoint localization, SiftGPU refines the location only once by default.
You can change this by specifying "-s n" to get n iterations (only available
in the cg unpacked implementation). SiftGPU does not move keypoints between
levels during the refinement.

4. By default, the feature locations have a (0.5, 0.5) offset compared with CPU
implementations: (0, 0) in the texture is at the top-left corner (instead of
the center) of the top-left pixel. You can use the center as (0, 0) by
specifying "-loweo".

5. By default, SiftGPU does not do upsampling (which "-fo -1" enables). To
match Lowe's implementation, you need to use "-fo -1 -loweo".

6. SiftGPU may get slightly different results on different GPUs due to
differences in floating point precision. SiftGPU has been tested on a limited
range of graphics cards and operating systems; it is not guaranteed to work on
your graphics card.

If it returns a different number of features on different runs on the same
card, then something is going wrong, and some special tricks probably need to
be used. Please email me if that happens.

7. When getting wrong matches, please look at the saved SIFT file to make sure
there are no weird descriptors (for example, all of the numbers of a descriptor
are 45, or any number in a descriptor is larger than 255).
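The check described in item 7 can be automated once the descriptors are loaded.
Below is a minimal Python sketch, assuming each descriptor has already been
read from the saved SIFT file into a sequence of numbers (the parsing step is
left out). suspicious_descriptors is a hypothetical helper, not part of
SiftGPU, and flagging every constant descriptor (not just all-45) is a
generalization of the example given above.

```python
def suspicious_descriptors(descriptors, max_value=255):
    """Return the indices of descriptors that look wrong.

    A descriptor is flagged when it is empty or all of its entries share
    one value (for example all 45), or when any entry exceeds max_value.
    """
    bad = []
    for i, d in enumerate(descriptors):
        values = list(d)
        constant = len(set(values)) <= 1
        too_large = any(v > max_value for v in values)
        if constant or too_large:
            bad.append(i)
    return bad
```

An empty result does not prove the descriptors are correct; it only rules out
the two symptoms listed in item 7.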
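If features were extracted without "-loweo", the (0.5, 0.5) offset from item 4
of the ACCURACY list can also be removed afterwards. Below is a minimal Python
sketch, assuming keypoints are available as (x, y) pairs in the default
SiftGPU convention and that the half-pixel shift is the only difference from
the CPU convention. to_center_origin is a hypothetical name used for
illustration only.

```python
def to_center_origin(points):
    """Shift (x, y) pairs so that (0, 0) is the center of the top-left pixel.

    In the default convention (0, 0) is the top-left corner of that pixel,
    so its center sits at (0.5, 0.5); subtracting 0.5 moves it to (0, 0).
    """
    return [(x - 0.5, y - 0.5) for x, y in points]
```

For example, a feature reported at (0.5, 0.5) maps to (0.0, 0.0).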
------------------------------------------------------------------------------
KNOWN ISSUES
------------------------------------------------------------------------------
1. SiftGPU may have problems with dual monitors.
2. Slow on 7950. Changing GlobalParam::_iTexFormat to GL_RGBA16F_ARB can make
it work. The reason is unknown.
3. Experiments on 8600 show problems. It works fine for the first image, but
gets wrong keypoints afterwards.