LCA14 – Graphics Working Group Thursday Wrap up

Posted: March 6, 2014 in linaro, Uncategorized

Thursday featured the UMM user space allocator helper discussion and the GPGPU status talk.

Sumit Semwal and Benjamin Gaignard led the UMM user space allocators discussion. The problem is that user space needs to allocate and work with memory that is shared between devices. Consider a video pipeline, or a web camera that is rendering to the screen. This work will help achieve a zero-copy design without user space having to know hardware details such as memory ranges and other device constraints.

Gil Pitney and I gave the GPGPU talk, which covered the current efforts of the GPGPU subteam. Gil is working on Shamrock, which is the old Clover project evolved. He’s upgraded it to use current top-of-tree LLVM and MCJIT for code generation. There’s still testing to do, but these are excellent steps forward, as getting off the old JIT was important. Shamrock provides a CPU-only OpenCL implementation, which is great for those who don’t want to implement their own drivers but still want to provide at least the basic functionality. In addition, Shamrock will carry a driver for TI DSP hardware, which is also a great step forward. Via this route, everyone can collaborate on the open source portion, which takes care of the base library, and this leaves just the driver/codegen to be created by the board creator.

The other part of the talk was about accelerating SQLite with OpenCL. There was a past project that accomplished something similar, but with CUDA. I’m working on this and it’s quite an enjoyable project. I’m only just implementing the OpenCL kernels, so there is a ways to go. It will serve as a good reference for what can be accomplished on ARM SoC systems that have OpenCL drivers. We typically don’t have as many shader units as modern desktop PCIe solutions in the Intel universe. I do find it encouraging that the SQLite design is quite flexible and fits well with this kind of experiment.
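As a rough illustration of the data-parallel shape such a kernel tends to have (this is not the project's actual code), here is a minimal sketch of a WHERE-style comparison over one integer column. It is written as plain C so it runs without an OpenCL driver. The names `filter_gt_kernel` and `filter_gt` are hypothetical, and the choice of a "greater than" filter is an assumption made purely for the example. In OpenCL C the loop would go away and `gid` would come from `get_global_id(0)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-row "kernel": one work item tests one row of a column.
 * In OpenCL C, gid would come from get_global_id(0) instead of a parameter. */
static void filter_gt_kernel(size_t gid, const int32_t *column,
                             int32_t threshold, uint8_t *match)
{
    match[gid] = (uint8_t)(column[gid] > threshold);
}

/* Host-side stand-in for enqueueing the kernel over n_rows work items.
 * A real OpenCL runtime would run these in parallel; here they run in order.
 * Returns the number of matching rows. */
static size_t filter_gt(const int32_t *column, size_t n_rows,
                        int32_t threshold, uint8_t *match)
{
    assert(column != NULL && match != NULL);
    size_t hits = 0;
    for (size_t gid = 0; gid < n_rows; ++gid) {
        filter_gt_kernel(gid, column, threshold, match);
        hits += match[gid];
    }
    return hits;
}
```

Each row is tested independently of the others, which is what makes this kind of scan a natural candidate for the shader units on a GPU.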

I also attended the session on Ne10 and Chromium optimizations for Cocos2D/Cocos2D-HTML5. These are ARM projects. Ne10 is essentially a library that sits above NEON intrinsics to give easier access to that functionality. Cocos is a cross-platform engine that is particularly popular in the Android world for 2D UIs and game creation. The ARM team did some nice optimization work in and around various drawing primitives for Chromium, and that work ends up helping Cocos.
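To give a feel for what "sitting above NEON intrinsics" means, here is a minimal sketch of an array-add helper that uses NEON when the compiler targets it and falls back to scalar code otherwise. The function name `add_float_arrays` is hypothetical and is not part of the Ne10 API; only the intrinsics themselves (`vld1q_f32`, `vaddq_f32`, `vst1q_f32` from `arm_neon.h`) are real.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper, not part of the Ne10 API: dst[i] = a[i] + b[i]. */
static void add_float_arrays(float *dst, const float *a, const float *b,
                             size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    size_t i = 0;
#if defined(__ARM_NEON)
    /* NEON path: process four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar path: handles the tail, or everything when NEON is unavailable. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

The caller never touches an intrinsic; it just passes plain arrays, which is the convenience a wrapper library of this kind is after.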

Thursday included the first bit of quiet time I had all week to actually write some code. It didn’t last long but it did feel good as I’m in a very fun portion of implementing the optimized SQLite with OpenCL and it was hard to set that work aside while Connect is on.

  1. Rahul Garg says:

    Thanks for posting the notes. I was very interested in Shamrock and the direction it is heading. Is there a possibility of supporting SPIR? Recently an OpenCL C to SPIR compiler was open-sourced by Khronos. This would help you move forward quickly.

    Also, any more notes on which TI DSPs are expected to be supported?

    Very much looking forward to OpenCL on ARM platforms 🙂

    • tgallfoo says:

      Hi Rahul. You bring up an interesting question. While the Khronos work is interesting, something that fits with LLVM via clang or MCJIT would be the most useful. Shamrock has just recently been converted to use MCJIT. Will look into it.

      As for which TI DSPs are supported, it’s the Keystone architecture. Gil Pitney can name exact boards (gil pitney att linaro org). I don’t have access to any TI DSP boards, I’m afraid.

      • Rahul Garg says:

        Thanks for replying Tom!

        Well, SPIR is actually just a variation of LLVM IR, and the Khronos OpenCL C to SPIR compiler is based on clang 3.2.

        So effectively, if I understand correctly, the main thing an implementor needs to do is add a backend to LLVM. Some other steps are also required, but at least you no longer need to worry about the frontend. The Khronos work should make it a lot simpler to build LLVM-based OpenCL implementations.

        As a bonus, you end up supporting SPIR as well which will make it a lot simpler to bring in higher level languages on top. For example, Multicoreware is building a C++ AMP compiler which will generate SPIR. There is some work on compiling OpenACC to SPIR as well.

        Thanks again for your great blog and to Linaro!
