Zhixun Tan edited this page Nov 26, 2017 · 1 revision

TLDR

Currently, the following demo program runs and produces the correct result.

from __future__ import absolute_import, print_function

import tvm
import numpy as np

n = tvm.var("n")
A = tvm.placeholder((n, n, n), name='A')
B = tvm.placeholder((n, n, n), name='B')
C = tvm.compute(A.shape, lambda i, j, k: A[i, j, k] + B[i, j, k], name="C")

s = tvm.create_schedule(C.op)
s[C].opengl()

fadd_gl = tvm.build(s, [A, B, C], "opengl", name="myadd")
print("------opengl code------")
print(fadd_gl.imported_modules[0].get_source(fmt="gl"))

ctx = tvm.opengl(0)
n = 10
a = tvm.nd.array(np.random.uniform(size=(n, n, n)).astype(A.dtype), ctx)
b = tvm.nd.array(np.random.uniform(size=(n, n, n)).astype(B.dtype), ctx)
c = tvm.nd.array(np.zeros((n, n, n), dtype=C.dtype), ctx)
fadd_gl(a, b, c)

np.testing.assert_allclose(c.asnumpy(), a.asnumpy() + b.asnumpy())

The corresponding fragment shader is:

#version 330 core
uniform sampler2D A;
uniform sampler2D B;
out float C;
void main() {
  ivec2 threadIdx = ivec2(gl_FragCoord.xy);
  C = (texelFetch(A, ivec2(threadIdx.x, 0), 0).r + texelFetch(B, ivec2(threadIdx.x, 0), 0).r);
}

OpenGL Tensor Storage - Current Status

  • We store tensors in OpenGL textures.
  • No matter what dimensions a tensor has, we always store it in a 2D texture with height=1.
  • The reason we are not using 1D textures is that texelFetch in GLSL only supports 2D textures.
  • We currently only support float textures.
  • We store textures as GL_R32F, i.e. only red color channel, float type.
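The storage scheme above can be sketched as an index mapping. This is only an illustration in plain Python, not the backend's code: the helper names `texture_size`, `flat_index`, and `texel_coord` are hypothetical, and the row-major element order is an assumption that the notes above do not state explicitly.

```python
def texture_size(shape):
    """(width, height) of the texture that holds a tensor of this shape."""
    width = 1
    for extent in shape:
        width *= extent
    return (width, 1)  # height is always 1 in the current scheme


def flat_index(index, shape):
    """Offset of a multi-dimensional index, assuming row-major order."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    offset = 0
    for i, extent in zip(index, shape):
        if not 0 <= i < extent:
            raise IndexError("index out of bounds")
        offset = offset * extent + i
    return offset


def texel_coord(index, shape):
    """Texel (x, y) that stores the element; y is 0 because height is 1."""
    return (flat_index(index, shape), 0)
```

Under these assumptions, every element of an (n, n, n) tensor lands in row 0, which matches the `ivec2(threadIdx.x, 0)` coordinates in the shader above.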

OpenGL Tensor Storage - Possible Future Improvements

  • Support types other than float.
  • Use all four RGBA channels. This might give us SIMD-style parallelism.
  • Don't always fix the height at 1. OpenGL limits each texture dimension separately (the limit can be queried with GL_MAX_TEXTURE_SIZE), so a 2500x2500 2D texture may be allowed while a 6250000x1 2D texture with the same number of texels is not. One possible approach is to set the height to the maximum supported value.
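One way the last item could work is to wrap the flattened tensor across several rows. The sketch below is a variant that fixes the row width at the size limit (fixing the height instead, as suggested above, is symmetric). The function names and the `max_size` parameter are hypothetical; a real implementation would obtain the limit by querying GL_MAX_TEXTURE_SIZE.

```python
def texture_size_2d(num_elements, max_size):
    """Pick a (width, height) that holds num_elements texels within max_size."""
    if num_elements < 1 or max_size < 1:
        raise ValueError("num_elements and max_size must be positive")
    width = min(num_elements, max_size)
    height = -(-num_elements // width)  # ceiling division
    if height > max_size:
        raise ValueError("tensor does not fit in a single 2D texture")
    return (width, height)


def texel_coord_2d(flat, width):
    """Texel (x, y) for a flattened element offset, filling rows first."""
    return (flat % width, flat // width)
```

With this layout the shader would need to compute both texel coordinates from the fused index, instead of always using 0 for y.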

OpenGL Schedule - Current Status

  • We added an opengl schedule, which fuses all dimensions into one and binds the fused dimension to threadIdx.x.
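The effect of the schedule can be emulated in plain Python. The sketch below is not TVM code: `add_nested` and `add_fused` are hypothetical helpers that operate on flat lists, and row-major order is assumed. It shows that a single loop over the fused index, with one iteration standing in for one fragment, computes the same result as the original nested loops.

```python
from itertools import product


def add_nested(a, b, shape):
    """Reference version: one loop level per tensor dimension."""
    total = 1
    for extent in shape:
        total *= extent
    c = [0.0] * total
    for index in product(*(range(extent) for extent in shape)):
        offset = 0
        for i, extent in zip(index, shape):
            offset = offset * extent + i  # row-major offset (assumed layout)
        c[offset] = a[offset] + b[offset]
    return c


def add_fused(a, b, shape):
    """Fused version: a single dimension, bound to thread_idx_x."""
    total = 1
    for extent in shape:
        total *= extent
    c = [0.0] * total
    for thread_idx_x in range(total):  # each iteration plays one fragment
        c[thread_idx_x] = a[thread_idx_x] + b[thread_idx_x]
    return c
```

Because element-wise addition touches the same offset in every buffer, the fused index can be used directly as the texel coordinate, which is why the generated shader never has to recover i, j, and k.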

Codegen - Current Status

  • We haven't changed lowering at all. When the IR contains Store(buffer, index) and the buffer is the output texture, we check that the index is threadIdx.x and emit code that writes the output pixel.
  • Codegen work has only just started. Our next step is to go through all the AST nodes.
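The Store check described above can be illustrated with a toy emitter. Everything in this sketch is hypothetical: the `Store` tuple is a stand-in for the real IR node, and `emit_store` is not the actual codegen function. The check is needed because a fragment shader can only write to the pixel it is currently shading.

```python
from collections import namedtuple

# Toy stand-in for the IR node; the value is an already-printed GLSL expression.
Store = namedtuple("Store", ["buffer", "index", "value"])


def emit_store(store, output_buffer):
    """Emit GLSL for a store to the output texture, or reject it."""
    if store.buffer != output_buffer:
        raise NotImplementedError("only stores to the output texture are sketched")
    if store.index != "threadIdx.x":
        # A fragment cannot write to any location other than its own pixel.
        raise ValueError("output index must be threadIdx.x")
    return "{} = {};".format(store.buffer, store.value)
```

For example, `emit_store(Store("C", "threadIdx.x", "(a + b)"), "C")` yields an assignment to the shader's `out` variable, while a store to any other index is rejected.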
