forked from apache/tvm
Current Status
Zhixun Tan edited this page Nov 26, 2017 · 1 revision
Currently, the following demo program runs and produces the correct result:
from __future__ import absolute_import, print_function
import tvm
import numpy as np

# Declare a symbolic size n and two n x n x n input tensors.
n = tvm.var("n")
A = tvm.placeholder((n, n, n), name='A')
B = tvm.placeholder((n, n, n), name='B')
# C is the element-wise sum of A and B.
C = tvm.compute(A.shape, lambda i, j, k: A[i, j, k] + B[i, j, k], name="C")

# Apply the opengl schedule and build for the "opengl" target.
s = tvm.create_schedule(C.op)
s[C].opengl()
fadd_gl = tvm.build(s, [A, B, C], "opengl", name="myadd")
print("------opengl code------")
print(fadd_gl.imported_modules[0].get_source(fmt="gl"))

# Run on the OpenGL device with concrete 10 x 10 x 10 inputs.
ctx = tvm.opengl(0)
n = 10
a = tvm.nd.array(np.random.uniform(size=(n, n, n)).astype(A.dtype), ctx)
b = tvm.nd.array(np.random.uniform(size=(n, n, n)).astype(B.dtype), ctx)
c = tvm.nd.array(np.zeros((n, n, n), dtype=C.dtype), ctx)
fadd_gl(a, b, c)
# Compare the result against NumPy.
np.testing.assert_allclose(c.asnumpy(), a.asnumpy() + b.asnumpy())

The corresponding fragment shader is:
#version 330 core
uniform sampler2D A;
uniform sampler2D B;
out float C;
void main() {
  ivec2 threadIdx = ivec2(gl_FragCoord.xy);
  C = (texelFetch(A, ivec2(threadIdx.x, 0), 0).r + texelFetch(B, ivec2(threadIdx.x, 0), 0).r);
}

- We store tensors in OpenGL textures.
- No matter what dimensions a tensor has, we always store it in a 2D texture with height=1.
- The reason we are not using 1D textures is that `texelFetch` in GLSL only supports 2D textures.
- We currently only support float textures.
- We store textures as `GL_R32F`, i.e. only the red color channel, float type.
- Support types other than float.
- Use all four RGBA channels. This might give us SIMD-like vectorization.
- Don't let height always be 1. The reason is that OpenGL limits the size of each texture dimension. For example, you can have a 2500x2500 2D texture but not a 6250000x1 2D texture. One option is to set the height to the maximum supported value.
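
The size limitation in the last item can be illustrated with a small index-mapping sketch. This is plain Python, not code from this repository. The names `texture_shape` and `texel_coord` and the limit of 4096 are hypothetical (a real implementation would query the driver for its limit). The sketch uses a row-major variant that caps the width and grows the height as needed, which is one possible layout among several.

```python
def texture_shape(num_elements, max_size):
    """Return (width, height) of a 2D texture that can hold num_elements texels.

    max_size is an assumed per-dimension limit; it is a parameter here
    rather than a value queried from OpenGL.
    """
    if num_elements <= 0:
        raise ValueError("num_elements must be positive")
    width = min(num_elements, max_size)
    height = -(-num_elements // width)  # ceiling division
    if height > max_size:
        raise ValueError("tensor does not fit in a single texture")
    return width, height


def texel_coord(flat_index, width):
    """Map a flat element index to the (x, y) texel that stores it (row-major)."""
    return flat_index % width, flat_index // width


# A 6,250,000-element tensor no longer needs a 6250000x1 texture.
width, height = texture_shape(6250000, 4096)
last_x, last_y = texel_coord(6250000 - 1, width)
```

With this layout the shader would have to fetch from `(x, y)` instead of `(threadIdx.x, 0)`, so the schedule and codegen would need to agree on the same mapping.
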
- We added an `opengl` schedule, which fuses all dimensions into one and binds that single dimension to threadIdx.x.
- We haven't changed lowering at all. When the IR says `Store(buffer, index)` and the buffer happens to be the output texture, we check that the index is `threadIdx.x`, and emit code to output a pixel.
- The codegen part has only been started. We will go through all the AST nodes as our next step.
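
To make the effect of fusing all dimensions concrete, here is a minimal sketch of the index arithmetic, assuming row-major order. It is plain Python rather than TVM code, and `fuse_index` and `split_index` are hypothetical helper names used only for illustration.

```python
def fuse_index(indices, shape):
    """Collapse a multi-dimensional index into one flat index (row-major)."""
    flat = 0
    for idx, extent in zip(indices, shape):
        flat = flat * extent + idx
    return flat


def split_index(flat, shape):
    """Recover the multi-dimensional index from a flat index (row-major)."""
    indices = []
    for extent in reversed(shape):
        indices.append(flat % extent)
        flat //= extent
    return tuple(reversed(indices))


# For the 10 x 10 x 10 tensors in the demo, element (1, 2, 3) corresponds
# to flat position 123, which is the value a single fused thread index
# would take for that element.
shape = (10, 10, 10)
flat = fuse_index((1, 2, 3), shape)
roundtrip = split_index(flat, shape)
```

Because every element maps to exactly one flat position, one fragment per position is enough to cover the whole output tensor.
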