Sorry for the delay between posts, but the part of the OpenCL spec I’m currently working on is difficult to implement correctly.
This part is chapter 5.2: “Buffer objects”. Reading this chapter, one may think that it’s fairly easy to implement: there are only two functions. The catch is that buffer objects reach much further into OpenCL than this one chapter suggests. The next two chapters cover images and querying information about … buffer objects. Yes, three chapters are devoted to these buffers.
But the amount of work isn’t the worst thing. The worst, which took me three days to sort out with the help of my mentor and Stéphane Marchesin, is that buffers are created per context. That means that when we create a buffer, we only know in which context it will be used. The problem is that a buffer needs to reside on a device, and a context can span several devices.
So I needed an elegant way to create one buffer across many devices. Finally, an NVIDIA forum gave me the solution: when we create a buffer object, we don’t allocate anything. When the buffer is first used on a device, it is allocated on that device.
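To make the idea concrete, here is a minimal sketch of this deferred allocation. It is not Clover’s actual code: `Device`, `DeviceBuffer`, `LazyBuffer` and `storageFor` are hypothetical names I made up for the illustration, and a plain `std::vector` stands in for real device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device of the context (not Clover's DeviceInterface).
struct Device {
    int id;
};

// Hypothetical per-device storage; a vector plays the role of device memory.
struct DeviceBuffer {
    std::vector<unsigned char> storage;
};

// A buffer created for a whole context: nothing is allocated at creation,
// storage appears on a device the first time the buffer is used there.
class LazyBuffer {
public:
    explicit LazyBuffer(std::size_t size) : m_size(size) {
        assert(size > 0);  // a zero-sized buffer makes no sense in this sketch
    }

    // Returns the storage for `device`, allocating it on the first call only.
    DeviceBuffer &storageFor(const Device &device) {
        auto it = m_allocations.find(device.id);
        if (it == m_allocations.end()) {
            std::unique_ptr<DeviceBuffer> buf(new DeviceBuffer);
            buf->storage.resize(m_size);
            it = m_allocations.emplace(device.id, std::move(buf)).first;
        }
        return *it->second;
    }

    // True once the buffer has been used (and thus allocated) on `device`.
    bool allocatedOn(const Device &device) const {
        return m_allocations.count(device.id) != 0;
    }

    std::size_t allocationCount() const { return m_allocations.size(); }

private:
    std::size_t m_size;
    std::map<int, std::unique_ptr<DeviceBuffer>> m_allocations;
};
```

With this shape, creating the buffer costs nothing, and a call such as `buf.storageFor(cpu)` from whatever first touches the buffer on a device is what triggers the allocation.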
This solution is elegant and works fairly well, but it raises some problems, mainly of data availability. One I’m currently thinking about is that we can pass a host_ptr to clCreateBuffer. This pointer refers to data to be uploaded to the buffer, to pre-populate it. If we only allocate the buffer some time after the clCreateBuffer call, for instance in a clEnqueueNDRangeKernel, how can we be sure that the memory host_ptr pointed to is still valid?
It’s one of the big problems I must face to implement this part of the spec. Maybe the solution is to memcpy the content of host_ptr somewhere at creation time, but that would cost extra memory.
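Here is a rough sketch of that memcpy idea, only to show the trade-off. `SnapshotBuffer`, `deviceStorage` and `stagingBytes` are hypothetical names, a `std::vector` again stands in for device memory, and I assume a single device for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical buffer that copies the host data when it is created,
// so the caller's pointer no longer needs to stay valid afterwards.
class SnapshotBuffer {
public:
    SnapshotBuffer(std::size_t size, const void *host_ptr) : m_size(size) {
        if (host_ptr && size > 0) {
            m_staging.resize(size);
            std::memcpy(m_staging.data(), host_ptr, size);  // the extra copy
        }
    }

    // Simulates the deferred allocation, done at first use on the device.
    const std::vector<unsigned char> &deviceStorage() {
        if (!m_allocated) {
            if (!m_staging.empty())
                m_device.swap(m_staging);  // hand the snapshot over, staging becomes empty
            else
                m_device.resize(m_size);   // no host data: plain allocation
            m_allocated = true;
        }
        return m_device;
    }

    // Bytes still held by the temporary host-side copy.
    std::size_t stagingBytes() const { return m_staging.size(); }

private:
    std::size_t m_size;
    bool m_allocated = false;
    std::vector<unsigned char> m_staging;
    std::vector<unsigned char> m_device;
};
```

The cost is visible here: between creation and first use, the data exists twice on the host side (the caller’s copy and the snapshot). With several devices the snapshot could not simply be handed over to the first one, so it would have to live even longer.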
So, it’s time to come to the facts: I actually got things done these days. Mostly thinking about all these hairy problems, but also coding. I’ll let you look at this before continuing:
 src/CMakeLists.txt         |   1 +
 src/api/api_memory.cpp     | 111 ++++++++++++++-
 src/core/cpudevice.cpp     | 184 ++++++++++++++++++++++++
 src/core/cpudevice.h       |  20 +++
 src/core/deviceinterface.h |  17 ++-
 src/core/memobject.cpp     | 340 ++++++++++++++++++++++++++++++++++++++++++++
 src/core/memobject.h       | 129 +++++++++++++++++
 tests/CMakeLists.txt       |   2 +
 tests/test_mem.cpp         | 163 +++++++++++++++++++++
 tests/test_mem.h           |  17 +++
 tests/tests.c              |   2 +
 11 files changed, 982 insertions(+), 4 deletions(-)
It’s one commit (sorry for such a big commit, but I wasn’t able to break it into smaller pieces that remain usable).
Mainly, we can now create Buffer and SubBuffer objects, with functions like clCreateBuffer. We can give them host_ptrs, but we cannot do anything with them yet. Reading, writing, copying and mapping them will come in the following days (I have a full day off school tomorrow, so I hope Clover will advance).
Another note is that the Clover repository moved back to freedesktop.org, and will soon be accessible at http://cgit.freedesktop.org/~steckdenis/clover.
I think that’s all for this blog post. If I have something to add, I’ll put it in the next post, which will also contain instructions to build and test Clover.