I'm using the latest fosphor from git (7b6b9961bc2d9b84daeb42a5c8f8aeba293d207c) and am seeing two weird (and I believe related) issues. Firstly, I see the following error:
[+] Selected device: TITAN X (Pascal)
[!] CL Error (-30, /home/user/gr-fosphor/lib/fosphor/cl.c:409): Unable to queue clear of spectrum buffer
This is CL_INVALID_VALUE returning from clEnqueueFillBuffer, so I added some debug fprintfs to cl.c to see what parameters were being passed into clEnqueueFillBuffer. Edits:

    fprintf(stderr, "size = %zu\n", 2 * 2 * sizeof(cl_float) * FOSPHOR_FFT_LEN);
    fprintf(stderr, "pattern_size = %zu\n", sizeof(float));
    fprintf(stderr, "pattern = %p\n", (void *)&noise_floor);
    fprintf(stderr, "offset = %d\n", 0);
Output:

    size = 16384
    pattern_size = 4
    pattern = 0x7fb66b7fdd2c
    offset = 0
These parameters look like they shouldn't cause CL_INVALID_VALUE: https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/clEnqueueFill...
But there is one condition that might be met: somehow size (16384) is larger than the underlying buffer (cl->mem_spectrum). The underlying OpenGL buffer being too small brings me to my next (I believe related) issue: the spectrum plot in fosphor is weirdly pixelated. Please see the attachment, which shows a screencap from "osmocom_fft -F -f 100e6 -g 20 -s 10e6".
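One way I can probably confirm that directly is to query the buffer's actual size right next to the existing debug prints in cl.c and compare it against the fill size. Rough sketch only (assumes the same scope as the debug prints above, i.e. cl->mem_spectrum and FOSPHOR_FFT_LEN are visible):

    /* Debug sketch: compare the real size of cl->mem_spectrum with the
     * size being passed to clEnqueueFillBuffer. */
    size_t buf_size = 0;
    cl_int qerr = clGetMemObjectInfo(cl->mem_spectrum, CL_MEM_SIZE,
                                     sizeof(buf_size), &buf_size, NULL);
    size_t fill_size = 2 * 2 * sizeof(cl_float) * FOSPHOR_FFT_LEN;
    fprintf(stderr, "mem_spectrum size = %zu, fill size = %zu (qerr = %d)\n",
            buf_size, fill_size, qerr);

If buf_size comes back smaller than 16384, that would explain the CL_INVALID_VALUE.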
Where is the cl->mem_spectrum buffer ultimately declared / initialized? My OpenCL / OpenGL sharing knowledge is nonexistent; any pointers on how I can help debug this issue?
Output of clinfo is attached as well, and I'm on Ubuntu 16.04 on x86_64.
Hi Raj,
I'm using the latest fosphor from git (7b6b9961bc2d9b84daeb42a5c8f8aeba293d207c) and am seeing two weird (and I believe related) issues. Firstly, I see the following error:
[+] Selected device: TITAN X (Pascal)
[!] CL Error (-30, /home/user/gr-fosphor/lib/fosphor/cl.c:409): Unable to queue clear of spectrum buffer
That's really weird indeed. Although I've seen it before. Can't remember exactly what it was though ...
This is CL_INVALID_VALUE returning from clEnqueueFillBuffer, so I added some debug fprintfs to cl.c to see what parameters were being passed into clEnqueueFillBuffer:
Watch out: CL 1.2 might not be supported by NVIDIA's driver, and so it might use the fallback path defined in cl_compat.c.
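If you want to see which path you're actually hitting, dumping the version strings the driver reports for the selected device is a quick check (generic sketch, assuming you have the cl_device_id that got selected; clinfo shows the same info, but this tells you what fosphor itself sees):

    /* Print what the driver reports; clEnqueueFillBuffer is a CL 1.2 entry
     * point, so a 1.1-only device/driver would go through the compat fallback. */
    char ver[256];
    clGetDeviceInfo(device, CL_DEVICE_VERSION, sizeof(ver), ver, NULL);
    fprintf(stderr, "CL_DEVICE_VERSION = %s\n", ver);
    clGetDeviceInfo(device, CL_DRIVER_VERSION, sizeof(ver), ver, NULL);
    fprintf(stderr, "CL_DRIVER_VERSION = %s\n", ver);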
But there is one condition that might be met: somehow size (16384) is larger than the underlying buffer (cl->mem_spectrum). The underlying OpenGL buffer being too small brings me to my next (I believe related) issue: the spectrum plot in fosphor is weirdly pixelated. Please see the attachment, which shows a screencap from "osmocom_fft -F -f 100e6 -g 20 -s 10e6".
Where is the cl->mem_spectrum buffer ultimately declared / initialized? My OpenCL / OpenGL sharing knowledge is nonexistent; any pointers on how I can help debug this issue?
So, there are two ways that mem_spectrum can be initialized, depending on whether CL/GL sharing is active or not. Look into cl_init_buffers_gl and cl_init_buffers_nogl, basically.
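Roughly, the difference between the two paths looks like this (just the general shape, not the actual fosphor code; names are placeholders):

    /* CL/GL sharing path (cl_init_buffers_gl): wrap the existing GL VBO,
     * so CL and GL operate on the very same buffer. */
    cl_mem mem_gl = clCreateFromGLBuffer(ctx, CL_MEM_READ_WRITE, vbo_id, &err);

    /* No-sharing path (cl_init_buffers_nogl): plain CL buffer, whose
     * contents get copied over to the GL side separately. */
    cl_mem mem_nogl = clCreateBuffer(ctx, CL_MEM_READ_WRITE,
                                     2 * 2 * sizeof(cl_float) * FOSPHOR_FFT_LEN,
                                     NULL, &err);

So in the sharing case, the size CL sees is whatever the GL code allocated for that VBO; if those don't match, a fill sized for the CL-side expectation could indeed trip CL_INVALID_VALUE.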
But I've seen drivers do things that are not spec compliant unfortunately ... and other than trying random stuff to see exactly what they don't like, I'm not sure what to do.
Cheers,
Sylvain
Hi,
I'm using the latest fosphor from git (7b6b9961bc2d9b84daeb42a5c8f8aeba293d207c) and am seeing two weird (and I believe related) issues. Firstly, I see the following error:
[+] Selected device: TITAN X (Pascal)
[!] CL Error (-30, /home/user/gr-fosphor/lib/fosphor/cl.c:409): Unable to queue clear of spectrum buffer
That's really weird indeed. Although I've seen it before. Can't remember exactly what it was though ...
Actually, the reason I'm not seeing it anymore on my setup is that I don't have any machine with CL/GL sharing working anymore ... I've changed laptops since, and I can't get CL/GL sharing working with optirun and recent NVIDIA drivers ...
One thing I would point out, though, is that if you have CL/GL sharing working, you can pretty much comment out the entire call to cl_queue_clear_buffers, because the GL side of things will clear the buffers already and they are the same. The independent clearing of the CL buffers only matters if they're different.
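i.e. conceptually it's just making the clear conditional instead of unconditional, something like this (illustrative only; 'use_clgl_sharing' is a hypothetical flag, and the fill parameters are the ones from your debug output):

    /* Only clear the CL-side spectrum buffer when it is NOT shared with GL;
     * in the shared case the GL init already cleared the very same buffer. */
    if (!use_clgl_sharing) {
        err = clEnqueueFillBuffer(cq, mem_spectrum, &noise_floor, sizeof(float),
                                  0, 2 * 2 * sizeof(cl_float) * FOSPHOR_FFT_LEN,
                                  0, NULL, NULL);
    }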
Cheers,
Sylvain
On Sat, Feb 4, 2017 at 4:44 AM, Sylvain Munaut 246tnt@gmail.com wrote:
Hi,
I'm using the latest fosphor from git (7b6b9961bc2d9b84daeb42a5c8f8aeba293d207c) and am seeing two weird (and I believe related) issues. Firstly, I see the following error:
[+] Selected device: TITAN X (Pascal)
[!] CL Error (-30, /home/user/gr-fosphor/lib/fosphor/cl.c:409): Unable to queue clear of spectrum buffer
That's really weird indeed. Although I've seen it before. Can't remember exactly what it was though ...
Actually, the reason I'm not seeing it anymore on my setup is that I don't have any machine with CL/GL sharing working anymore ... I've changed laptops since, and I can't get CL/GL sharing working with optirun and recent NVIDIA drivers ...
One thing I would point out, though, is that if you have CL/GL sharing working, you can pretty much comment out the entire call to cl_queue_clear_buffers, because the GL side of things will clear the buffers already and they are the same. The independent clearing of the CL buffers only matters if they're different.
I'll comment out that call for my purposes and let you know how that goes. My guess is that it resolves the -30 error but not the blocky/pixelated spectrum histogram.
Based on your comment about how your setup doesn't support GL/CL sharing, I tried defining FLG_FOSPHOR_USE_CLGL_SHARING as 0 in private.h. Not sharing GL objects fixes the pixelated spectrum issue! No more cl_queue_clear_buffers errors either, because the code never goes down that path anymore.
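For anyone else hitting this, the change amounts to a one-liner in private.h (keeping the flag's existing bit-field style, just with the bit cleared):

    /* disable CL/GL sharing */
    #define FLG_FOSPHOR_USE_CLGL_SHARING (0<<0)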
It seems to me there is a bug in the way the histogram buffer or texture is allocated when it goes through the GL code path. I'm not sure if this is a quirk of my setup/GPU/driver, or if this is a general bug; I do have an older NVIDIA GPU and another machine on which I can try to reproduce what I'm seeing.
Final question, what are the main downsides to NOT using CL/GL sharing? Is there some extra copy/performance overhead?
On Sun, Feb 5, 2017 at 2:36 PM, Raj Bhattacharjea raj.b@gatech.edu wrote:
Hi,
Based on your comment about how your setup doesn't support GL/CL sharing, I tried defining FLG_FOSPHOR_USE_CLGL_SHARING as 0 in private.h. Not sharing GL objects fixes the pixelated spectrum issue! No more cl_queue_clear_buffers errors either, because the code never goes down that path anymore.
Ok, good to know.
I need to come up with a good way to pass "options" to fosphor to be able to configure these things at runtime.
It seems to me there is a bug in the way the histogram buffer or texture is allocated when it goes through the GL code path. I'm not sure if this is a quirk of my setup/GPU/driver, or if this is a general bug; I do have an older NVIDIA GPU and another machine on which I can try to reproduce what I'm seeing.
I can't get CL/GL sharing to work at all on my Optimus laptop, so unfortunately I can't even test that code path anymore. I really need to build myself a desktop machine ...
Final question, what are the main downsides to NOT using CL/GL sharing? Is there some extra copy/performance overhead?
Yes. Basically the result of the CL computation is downloaded to CPU memory, then re-uploaded as a texture. Not only does that impose an extra data copy, it also adds more synchronization points between the CPU and GPU.
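Schematically, per frame, the two patterns look something like this (not the actual fosphor code, just to illustrate where the extra copy and sync come from; buffer/texture names are made up):

    /* Without CL/GL sharing: read the CL result back to host memory,
     * then re-upload it into the GL texture. */
    clEnqueueReadBuffer(cq, mem_result, CL_TRUE, 0, buf_size, host_buf,
                        0, NULL, NULL);              /* GPU -> CPU, blocking */
    glBindTexture(GL_TEXTURE_2D, tex_id);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_RED, GL_FLOAT, host_buf);     /* CPU -> GPU */

    /* With CL/GL sharing: CL writes straight into the GL object. */
    clEnqueueAcquireGLObjects(cq, 1, &mem_result_gl, 0, NULL, NULL);
    /* ... enqueue the kernels that write into it ... */
    clEnqueueReleaseGLObjects(cq, 1, &mem_result_gl, 0, NULL, NULL);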
Cheers,
Sylvain
Sylvain,
Ok, good to know.
I need to come up with a good way to pass "options" to fosphor to be able to configure these things at runtime.
That would be useful! Given how much I use your tool for real work, I'm willing to contribute; do you look at pull requests on the GitHub mirror? Here are some other things I have often considered plumbing through as options:

1. FFT length. It looks like you have a length-512 FFT kernel in fft.cl already, but I think it's unused. A few other FFT sizes might be useful too for adjusting resolution bandwidth. With what you have, several other powers of two should be implementable simply.

2. Waterfall time length. Requires some GL tricks to change the texture size, or internally render at some fixed size and always crop and/or downsample the texture to show the amount of time the user requested.
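To make that concrete, the kind of thing I'm imagining is a small options struct handed over at init time, which would also cover the CL/GL sharing switch discussed above. Entirely hypothetical API, not existing fosphor code; names are made up:

    /* Hypothetical runtime options for fosphor, for discussion only */
    struct fosphor_opts {
        int fft_len;          /* e.g. 512 / 1024 / 2048, power of two */
        int waterfall_lines;  /* time depth of the waterfall texture */
        int use_clgl_sharing; /* 0/1, instead of the compile-time flag */
    };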
Thanks for the tips that resolved the issue!
I know this thread is a bit old, but it's worth mentioning that with CL/GL sharing enabled, fosphor is broken altogether on my setup as of the latest master commit (possibly earlier). Everything builds, but the following runtime error occurs:
[!] CL Error (-5, /home/user/gr-fosphor/lib/fosphor/cl.c:480): Unable to share spectrum VBO into OpenCL context
As previously mentioned, everything works okay if I change the value of FLG_FOSPHOR_USE_CLGL_SHARING in private.h to 0.
Sylvain, do you always have the value of this flag as zero in your builds and testing? Or do you leave it as 1, and fosphor works because you're using Intel CPU OpenCL or something? Maybe check in "#define FLG_FOSPHOR_USE_CLGL_SHARING (0<<0)" to the source repo until you can verify that this code path works correctly?
Details of my setup:
- x86_64 (Xeon D)
- Ubuntu 16.04
- NVIDIA Titan Xp
- NVIDIA driver 375.66 from the standard Ubuntu repos (xenial-updates)
- nvidia-opencl-icd-375 from the standard Ubuntu repos
I'm happy to test things out and will do some of my own debugging.
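For reference, -5 is CL_OUT_OF_RESOURCES, presumably coming out of the clCreateFromGLBuffer call (or similar) at cl.c:480. One thing I plan to check is whether the CL context is really being created against the GL context that owns the VBO, since that's a precondition for sharing. Roughly this shape on GLX (sketch only, not fosphor's actual code; the GL context must be current on the calling thread, and platform/device/err are placeholders):

    /* CL context creation with GL sharing enabled (GLX case) */
    cl_context_properties props[] = {
        CL_GL_CONTEXT_KHR,   (cl_context_properties) glXGetCurrentContext(),
        CL_GLX_DISPLAY_KHR,  (cl_context_properties) glXGetCurrentDisplay(),
        CL_CONTEXT_PLATFORM, (cl_context_properties) platform,
        0
    };
    cl_context ctx = clCreateContext(props, 1, &device, NULL, NULL, &err);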