[Update August 2013: Google has removed the OpenCL library with Android 4.3. You can find an interesting discussion here. Google seems to push for its own RenderScript framework. I will not work with RenderScript, since my priorities are platform independence and sticking with widely adopted standards to avoid fragmentation of my code base.]
I recently got hold of a Nexus 4 smartphone, which features a GPU (Qualcomm Adreno 320) and conveniently ships with an already installed OpenCL library. With minimal changes I got the previously discussed many-body program code related to the fractional quantum Hall effect up and running. No rooting of the phone is required to run the code example. Please use the following recipe at your own risk; I don’t accept any liability. Here is what I did:
- Download and unpack the Android SDK from Google for cross-compilation (my host computer runs Mac OS X).
- Download and unpack the Android NDK from Google to build minimal C/C++ programs without Java (no real app).
- Install the standalone toolchain from the Android NDK. I used the following command for my installation:
/home/tkramer/android-ndk-r8d/build/tools/make-standalone-toolchain.sh \
  --install-dir=/home/tkramer/android-ndk-standalone
- Put the OpenCL programs and source code in an extra directory, as described in my previous post
- Change one line in the cl.hpp header: instead of including <GL/gl.h>, include <GLES/gl.h>. Note: I am using the “old” cl.hpp bindings 1.1. Further changes might be required for the newer bindings, see for instance this helpful blog.
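The header change can also be scripted. The following is a minimal sketch, assuming your cl.hpp contains the literal include `<GL/gl.h>`; the function name `patch_cl_hpp` is made up for illustration and is not part of any SDK.

```shell
# Sketch: swap the desktop OpenGL header for the OpenGL ES one in cl.hpp.
# patch_cl_hpp is a hypothetical helper name; pass the path to your cl.hpp.
patch_cl_hpp() {
    header="$1"
    # -i.bak edits the file in place and keeps a backup copy (cl.hpp.bak).
    # This form is accepted by both GNU sed and the BSD sed on Mac OS X.
    sed -i.bak 's|<GL/gl\.h>|<GLES/gl.h>|' "$header"
}
```

Call it as `patch_cl_hpp ./cl.hpp` and compare against `cl.hpp.bak` to confirm that only the one include line changed.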
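- Transfer the OpenCL library from the phone to a subdirectory lib/ inside your source code directory. To do so, add the SDK platform-tools directory to your PATH and run the adb command from inside lib/ (without a destination argument, adb pull writes to the current directory):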
export PATH=/home/tkramer/adt-bundle-mac-x86_64-20130219/sdk/platform-tools:$PATH
adb pull /system/lib/libOpenCL.so
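If you prefer to stay in the source directory, the pull can be wrapped so that the library lands in lib/ directly. This is only a sketch: `pull_opencl_lib` is a hypothetical helper name, and it assumes adb is already on your PATH.

```shell
# Sketch: pull libOpenCL.so from the phone straight into ./lib.
# pull_opencl_lib is a hypothetical helper; it assumes adb is on PATH.
pull_opencl_lib() {
    dest="${1:-lib}"
    mkdir -p "$dest" || return 1
    # adb pull accepts an optional local destination as its second argument.
    adb pull /system/lib/libOpenCL.so "$dest/" || return 1
    # Report failure if the library did not arrive, for example because
    # the phone runs Android 4.3, where the library was removed.
    [ -f "$dest/libOpenCL.so" ]
}
```

Running `pull_opencl_lib` without arguments uses lib/, which matches the `-Llib` flag of the compile step.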
- Cross-compile your program. I used the following script; please feel free to provide shorter versions. Adjust the include directories and library directories for your installation.
rm plasma_disk_gpu
/home/tkramer/android-ndk-standalone/bin/arm-linux-androideabi-g++ -v -g \
  -DCL_USE_DEPRECATED_OPENCL_1_1_APIS -DGPU \
  -I. \
  -I/home/tkramer/android-ndk-standalone/include/c++/4.6 \
  -I/home/tkramer/android-ndk-r8d/platforms/android-5/arch-arm/usr/include \
  -Llib \
  -march=armv7-a -mfloat-abi=softfp -mfpu=neon \
  -fpic -fsigned-char -fdata-sections -funwind-tables -fstack-protector \
  -ffunction-sections -fdiagnostics-show-option -fPIC \
  -fno-strict-aliasing -fno-omit-frame-pointer -fno-rtti \
  -lOpenCL \
  -o plasma_disk_gpu plasma_disk.cpp
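To make the script easier to adapt to another installation, the two installation paths can be factored into variables. The sketch below keeps the compiler flags from the script above unchanged; the names `NDK`, `TOOLCHAIN`, `DRY_RUN` and `build_plasma_disk` are my own choices for illustration, not NDK conventions.

```shell
# Sketch: the same compiler invocation with the installation paths factored out.
# NDK, TOOLCHAIN, DRY_RUN and build_plasma_disk are hypothetical names; the
# defaults mirror the paths used in the script above.
NDK="${NDK:-/home/tkramer/android-ndk-r8d}"
TOOLCHAIN="${TOOLCHAIN:-/home/tkramer/android-ndk-standalone}"

build_plasma_disk() {
    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    else
        run=""
        rm -f plasma_disk_gpu
    fi
    $run "$TOOLCHAIN/bin/arm-linux-androideabi-g++" -v -g \
        -DCL_USE_DEPRECATED_OPENCL_1_1_APIS -DGPU \
        -I. \
        -I"$TOOLCHAIN/include/c++/4.6" \
        -I"$NDK/platforms/android-5/arch-arm/usr/include" \
        -Llib \
        -march=armv7-a -mfloat-abi=softfp -mfpu=neon \
        -fpic -fsigned-char -fdata-sections -funwind-tables -fstack-protector \
        -ffunction-sections -fdiagnostics-show-option -fPIC \
        -fno-strict-aliasing -fno-omit-frame-pointer -fno-rtti \
        -lOpenCL \
        -o plasma_disk_gpu plasma_disk.cpp
}
```

A call such as `DRY_RUN=1 build_plasma_disk` prints the full command line, which is a convenient way to check the paths before compiling.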
- Copy the executable to the data directory of your phone to be able to run it. This can be done without rooting the phone with the nice SSHDroid app, which by default transfers to /data. Don’t forget to copy the kernel .cl files:
scp -P 2222 integrate_eom_kernel.cl firstname.lastname@example.org.NNN:
scp -P 2222 plasma_disk_gpu email@example.com.NNN:
- ssh into your phone and run the GPU program:
ssh -p 2222 firstname.lastname@example.org.NNN
./plasma_disk_gpu 64 16
- Check the resulting data files. You can copy them, for example, to the Download path of the storage and plot them with gnuplot (the droidplot app).
A short note about runtimes: on the Nexus 4 the program runs for about 12 seconds, while on a MacBook Pro with an NVIDIA GT650M it completes in 2 seconds (in the example above the equations of motion for 16*64=1024 interacting particles are integrated). For larger particle numbers the phone often locks up.
An alternative way to transfer files to the device is to connect via USB cable and to install the Android Terminal Emulator app. Next, create a world-writable directory from within the terminal emulator:
cd /data/data/jackpal.androidterm
mkdir gpu
chmod 777 gpu
On the host computer, use adb to transfer the compiled program and the .cl kernel:
adb push integrate_eom_kernel.cl /data/data/jackpal.androidterm/gpu/
adb push plasma_disk_gpu /data/data/jackpal.androidterm/gpu/
You can either run the program within the terminal emulator or use the adb shell:
adb shell
cd /data/data/jackpal.androidterm/gpu/
./plasma_disk_gpu 64 16
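The interactive session can also be collapsed into a single call from the host, since adb shell accepts a command string. A minimal sketch, where `run_on_device` is a hypothetical helper name and the directory is the one created above:

```shell
# Sketch: run the program non-interactively through adb.
# run_on_device is a hypothetical helper; the two arguments are the same
# numbers passed to plasma_disk_gpu in the interactive example.
run_on_device() {
    dir="/data/data/jackpal.androidterm/gpu"
    # cd and the program start are passed as one remote command line,
    # so the kernel .cl file is found in the working directory.
    adb shell "cd $dir && ./plasma_disk_gpu $1 $2"
}
```

For the example above this would be `run_on_device 64 16`.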
Let’s see in how many years today’s desktop GPUs can be found in smartphones and which computational physics codes can be run!