Skip to content

Glow#262

Merged
iceiix merged 146 commits intomasterfrom
glow
Dec 25, 2020
Merged

Glow#262
iceiix merged 146 commits intomasterfrom
glow

Conversation

@iceiix
Copy link
Copy Markdown
Owner

@iceiix iceiix commented Jan 6, 2020

Replace the use of raw OpenGL with the https://github.com/grovesNL/glow wrapper

#34 Investigate rendering backends, such as gfx-rs or wgpu-rs or glow
#171 wasm broken: failed to resolve: could not find Context in platform_impl

@iceiix
Copy link
Copy Markdown
Owner Author

iceiix commented Jan 6, 2020

Down to 36 compile errors, 31 are methods not found in glow::native::Context, will have to find glow equivalents, others are related to parameters. Current list:

error[E0061]: this function takes 1 parameter but 2 parameters were supplied
error[E0061]: this function takes 1 parameter but 4 parameters were supplied
error[E0061]: this function takes 2 parameters but 4 parameters were supplied
error[E0433]: failed to resolve: could not find `types` in `gl`
error[E0599]: no method named `buffer_data` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `buffer_sub_data` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `clear_bufferfv` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `delete_buffers` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `delete_framebuffers` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `delete_textures` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `delete_vertex_arrays` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `gen_buffers` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `gen_framebuffers` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `gen_textures` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `gen_vertex_arrays` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `get_integerv` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `get_shaderiv` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `get_tex_image` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `map_buffer` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `multi_draw_elements` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `read_buffer` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `tex_image_2d_multisample` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `tex_parameteri` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `tex_sub_image_2d` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `tex_sub_image_3d` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform1f` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform1i` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform2f` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform3f` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform3i` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform4f` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform4fv` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform_matrix4fv` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `uniform_matrix4fv` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `vertex_attrib_i_pointer` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `vertex_attrib_pointer` found for type `glow::native::Context` in the current scope

reference: https://docs.rs/glow/0.4.0/glow/struct.Context.html -> https://docs.rs/glow/0.4.0/glow/trait.HasContext.html and https://github.com/grovesNL/glow/blob/master/src/native.rs

@iceiix
Copy link
Copy Markdown
Owner Author

iceiix commented Jan 6, 2020

Found most glow method equivalents, except for these four:

error[E0599]: no method named `map_buffer` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `multi_draw_elements` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `read_buffer` found for type `glow::native::Context` in the current scope
error[E0599]: no method named `tex_image_2d_multisample` found for type `glow::native::Context` in the current scope

The underlying OpenGL method names would correspond to gl + camelCase:

in glow, they would be gl. + camelCase (gl.MapBuffer, gl.MultiDrawElements, gl.ReadBuffer, gl.TexImage2DMultisample). Searched through all glow backends, not found. They may not be wrapped due to WebGL incompatibility, may have to find alternatives or only use on native, or if possible enhance glow to wrap these for both OpenGL and WebGL?

In addition to missing methods, there are 8 parameter count mismatches and 7 parameter type mismatches, 15 other errors total, will have to adapt to glow's typing. There is an slight impedance mismatch: glow has its own OpenGL wrapper objects, but so does our src/gl/mod.rs (not actually steven_gl, in gl/src, as that is merely a generated gl_generator/khronos_api, which glow does itself): Texture, Program, Shader, Uniform, Attribute, VertexArray, Buffer, Framebuffer.

…n a loop and it has no WebGL equivalent (19)
@iceiix
Copy link
Copy Markdown
Owner Author

iceiix commented Jan 7, 2020

glMapBuffer - looks like we can use glMapBufferRange with offset 0 and length len, through map_buffer_range. But it intentionally isn't supported by WebGL, see: https://www.khronos.org/registry/webgl/specs/latest/2.0/#5.14

Instead of using MapBufferRange, buffer data may be read by using the getBufferSubData entry point.

What do we use it for? Not easy to grep (Buffer map method in src/gl). But could at least change to glMapBufferRange (map_buffer_range) to get to compile against glow, will throw runtime error on WebGL.


glMultiDrawElements - https://stackoverflow.com/questions/20035395/webgl-why-no-gl-multidrawelements says:

glMultiDrawArrays and glMultiDrawElements are essentially nothing more than additions to the OpenGL (1.4) to reduce the number of API calls. Since the same thing can be achieved using multiple back-to-back draw calls, you can see why it is not a good fit for ES. It is very much like RISC vs. CISC in the CPU architecture world, simpler is better particularly in embedded applications.

so could replace with a loop. Maybe have a multi_draw_elements wrapper where on native it uses glMultiDrawElements for speed, and on web it iterates? But, it appears nothing actually calls this method, so it can be removed.


glReadBuffer - this one actually does appear in the WebGL 2.0 spec: https://www.khronos.org/registry/webgl/specs/latest/2.0/:

  /* Framebuffer objects */
...
  void readBuffer(GLenum src);

void readBuffer(GLenum src) (OpenGL ES 3.0.4 §4.3.1, man page)
Specify a color buffer of the read framebuffer as the read buffer.

but curiously lacks a wrapper in glow. Add it?


glTexImage2DMultisample - apparently multisample renderbuffers are a WebGL2 feature? mrdoob/three.js#8120, but there is this comment, with no follow up:

  • There is no glTexImage2DMultisample in WebGL2, I'm not sure if that's a problem

image_2d_sample is used in src/render/mod.rs for:

        fb_color.image_2d_sample(gl::TEXTURE_2D_MULTISAMPLE, NUM_SAMPLES, width, height, gl::RGBA8, false);

        fb_depth.image_2d_sample(gl::TEXTURE_2D_MULTISAMPLE, NUM_SAMPLES, width, height, gl::DEPTH_COMPONENT24, false);

would using glTexImage2D (non-multisample) conditionally on WebGL work acceptably?

@iceiix
Copy link
Copy Markdown
Owner Author

iceiix commented Jan 7, 2020

https://stackoverflow.com/questions/47934444/webgl-framebuffer-multisampling

WebGL2 does support multisampling for framebuffers. You can call renderbufferStorageMultisample to create a multisampled renderbufffer and you can call blitFramebuffer to resolve it into the canvas

https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glRenderbufferStorageMultisample.xhtml

glRenderbufferStorageMultisample, glNamedRenderbufferStorageMultisample — establish data storage, format, dimensions and sample count of a renderbuffer object's image

vs

https://www.khronos.org/opengl/wiki/GLAPI/glTexImage2DMultisample

glTexImage2DMultisample: establish the data storage, format, dimensions, and number of samples of a multisample texture's image

glow does define glow::TEXTURE_2D_MULTISAMPLE, equivalent to gl::TEXTURE_2D_MULTISAMPLE (GL_TEXTURE_2D_MULTISAMPLE), but doesn't use it.

A hint from https://bugs.chromium.org/p/chromium/issues/detail?id=361931:

In ppapi, I can use OES_texture_float to implement floating point texture. But my problem is how to implement floating point texture with multi-sampling. In GLES 2.0, we don't have a function like glteximage2dmultisample, so the way to implement multi-sampling in GLES 2.0 is to use:
void RenderbufferStorageMultisampleANGLE(
enum target, sizei samples,
enum internalformat,
sizei width, sizei height);
And then use BlitFramebufferANGLE to a texture.
But the "internalformat" parameter here can't support GL_RGBA_16F in the standard ES 2.0. So I don't know if there is another way to implement it.

change from glTexImage2DMultisample(GL_TEXTURE_2D_MULTISAMPLE) to glRenderbufferStorageMultisample(GL_RENDERBUFFER), in native, too?

I had previously changed this code in ed5d48f, which links to the mysteriously-deleted issue https://github.com/iceiix/steven/issues/5#issuecomment-427669426, recovered partially at #5 (comment) - superseded by #25, one of the first bugs I fixed in this project, may as well change it again…

@iceiix
Copy link
Copy Markdown
Owner Author

iceiix commented Jan 7, 2020

PR'd glReadBuffer to glow in grovesNL/glow#72, accessible on both platforms, but looking closer, we don't actually use it anywhere. Originally added in e9631f0#diff-dd4fbaefaaa8ed7c96f70be3c784eba7R754

@iceiix
Copy link
Copy Markdown
Owner Author

iceiix commented Dec 25, 2020

Current status: now gets far enough to draw the blue sky:

Screen Shot 2020-12-24 at 7 10 24 PM

but crashes soon after thread '<unknown>' has overflowed its stack, same #262 (comment)

To unblock porting to glow while not changing non-web compatible code yet (incremental progress), branched to iceiix/glow#1 for adding wrappers around native-only (OpenGL) functions

  • glRenderbufferStorageMultisample
  • get_pixels (glGetTexImage)
  • sub_image_2d (glTexSubImage2D)
  • set_float_multi_raw (glUniform4fv) with raw pointer - Vec[] color to f32
  • set_matrix4 (glUniformMatrix4fv) - uniform_matrix_4_f32_slice with cgmath::Matrix<f32> into &[f32]
  • set_matrix4_multi (glUniformMatrix4fv) - uniform_matrix_4_f32_slice, like set_matrix4 but convert multiple (slice)

@iceiix
Copy link
Copy Markdown
Owner Author

iceiix commented Dec 25, 2020

Screen Shot 2020-12-24 at 7 50 48 PM

@iceiix
Copy link
Copy Markdown
Owner Author

iceiix commented Dec 25, 2020

Now that rendering seems to basically work, back to debugging the stack overflow. While carefully stepping through in the debugger, failed to trigger the crash:

Screen Shot 2020-12-24 at 8 23 35 PM

apparently because authentication timed out (since I was pausing the program while debugging), failing to login and show the server list. This further suggests the problem is in the server list. Removing auth_token in conf.cfg to force showing the login screen, no crash, confirmed. Working hypothesis was the crash was due to incomplete porting to glow but now that most functions are bridged this seems unlikely. Looking at the backtrace again, reload_server_list frame:

UIImage Image draw()
Process 94174 stopped
* thread #22, stop reason = EXC_BAD_ACCESS (code=2, address=0x70000fac1a98)
    frame #0: 0x00000001011925b7 stevenarella`__rust_probestack + 23
stevenarella`__rust_probestack:
->  0x1011925b7 <+23>: testq  %rsp, 0x8(%rsp)
    0x1011925bc <+28>: subq   $0x1000, %r11             ; imm = 0x1000 
    0x1011925c3 <+35>: cmpq   $0x1000, %r11             ; imm = 0x1000 
    0x1011925ca <+42>: ja     0x1011925b0               ; <+16>
Target 0: (stevenarella) stopped.
(lldb) bt
error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0
error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0
* thread #22, stop reason = EXC_BAD_ACCESS (code=2, address=0x70000fac1a98)
  * frame #0: 0x00000001011925b7 stevenarella`__rust_probestack + 23
    frame #1: 0x0000000100b2e52e stevenarella`steven_protocol::protocol::packet::packet_by_id::h4978885aa0193b70(version=0, state=<unavailable>, dir=<unavailable>, id=0, buf=<unavailable>) at mod.rs:140
    frame #2: 0x0000000100a7b0fd stevenarella`steven_protocol::protocol::Conn::read_packet::ha802bac8d24a7eaa(self=0x000070000fcbd0f8) at mod.rs:1148:22
    frame #3: 0x0000000100a7c0a0 stevenarella`steven_protocol::protocol::Conn::do_status::h80fad33fbbfaec98(self=Conn @ 0x000070000fcbd0f8) at mod.rs:1200:59
    frame #4: 0x000000010027c50d stevenarella`stevenarella::screen::server_list::ServerList::reload_server_list::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h44c8b89134f91a91((null)=closure-0 @ 0x000070000fcbdd98, conn=<unavailable>) at server_list.rs:272:38
    frame #5: 0x000000010026518d stevenarella`core::result::Result$LT$T$C$E$GT$::and_then::h9750178b9c146fc8(self=Result<steven_protocol::protocol::Conn, steven_protocol::protocol::Error> @ 0x000070000fcc0648, op=closure-0 @ 0x000070000fcc0498) at result.rs:708:22
    frame #6: 0x000000010027c5da stevenarella`stevenarella::screen::server_list::ServerList::reload_server_list::_$u7b$$u7b$closure$u7d$$u7d$::hc616dd881b588187 at server_list.rs:271:23
    frame #7: 0x000000010003b5b1 stevenarella`std::sys_common::backtrace::__rust_begin_short_backtrace::hee984852b4d275b5(f=<unavailable>) at backtrace.rs:137:18
    frame #8: 0x000000010027a5e1 stevenarella`std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::hb58890e5228a4a2c at mod.rs:464:17
    frame #9: 0x000000010004c1f1 stevenarella`_$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::hf944feb85a12545d(self=<unavailable>, _args=<unavailable>) at panic.rs:308:9
    frame #10: 0x000000010014a7f9 stevenarella`std::panicking::try::do_call::h4d35e0219f41f9de(data="") at panicking.rs:381:40
    frame #11: 0x000000010014aebd stevenarella`__rust_try + 29
    frame #12: 0x000000010014a3ee stevenarella`std::panicking::try::h87bf4f691700d6c0(f=<unavailable>) at panicking.rs:345:19
    frame #13: 0x000000010004c391 stevenarella`std::panic::catch_unwind::hc6c9d8434ee244da(f=<unavailable>) at panic.rs:382:14
    frame #14: 0x00000001002798a1 stevenarella`std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h63bdad4e7a2dd187 at mod.rs:463:30
    frame #15: 0x00000001000c1a31 stevenarella`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h68f44f53c40952df((null)=0x0000000126d40980, (null)=<unavailable>) at function.rs:227:5
    frame #16: 0x000000010117128d stevenarella`std::sys::unix::thread::Thread::new::thread_start::he3e6719579180a65 [inlined] _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h761c0c24cc66dea8 at boxed.rs:1042:9 [opt]
    frame #17: 0x0000000101171287 stevenarella`std::sys::unix::thread::Thread::new::thread_start::he3e6719579180a65 [inlined] _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h15cdc23ec4ed7bf4 at boxed.rs:1042 [opt]
    frame #18: 0x000000010117127e stevenarella`std::sys::unix::thread::Thread::new::thread_start::he3e6719579180a65 at thread.rs:87 [opt]
    frame #19: 0x00007fff2031a950 libsystem_pthread.dylib`_pthread_start + 224
    frame #20: 0x00007fff2031647b libsystem_pthread.dylib`thread_start + 15
(lldb) f 4
frame #4: 0x000000010027c50d stevenarella`stevenarella::screen::server_list::ServerList::reload_server_list::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h44c8b89134f91a91((null)=closure-0 @ 0x000070000fcbdd98, conn=<unavailable>) at server_list.rs:272:38
   269 	            // Don't block the main thread whilst pinging the server
   270 	            thread::spawn(move || {
   271 	                match protocol::Conn::new(&address, protocol::SUPPORTED_PROTOCOLS[0])
-> 272 	                    .and_then(|conn| conn.do_status())
   273 	                {
   274 	                    Ok(res) => {
   275 	                        let mut desc = res.0.description;
(lldb) thread list
Process 94174 stopped
  thread #1: tid = 0x338497, 0x0000000100811723 stevenarella`core::ptr::write::h9ff55e52aa26cbb7(dst=0x00007ffeefbe6bf0, src=54051) at mod.rs:876:2, queue = 'com.apple.main-thread'
  thread #6: tid = 0x3384c3, 0x00007fff202e5e7e libsystem_kernel.dylib`mach_msg_trap + 10, name = 'OGL Profiler'
  thread #7: tid = 0x3384ca, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #8: tid = 0x3384cb, 0x00007fff202ea7e2 libsystem_kernel.dylib`kevent + 10, name = 'reqwest-internal-sync-runtime'
  thread #9: tid = 0x3384f5, 0x00007fff202e753e libsystem_kernel.dylib`__workq_kernreturn + 10
  thread #10: tid = 0x338514, 0x00007fff202e753e libsystem_kernel.dylib`__workq_kernreturn + 10
  thread #11: tid = 0x338660, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #12: tid = 0x338661, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #13: tid = 0x338662, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #14: tid = 0x338663, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #15: tid = 0x338664, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #16: tid = 0x338665, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #17: tid = 0x338666, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #18: tid = 0x338667, 0x00007fff202e88e2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #19: tid = 0x33866c, 0x00007fff202e753e libsystem_kernel.dylib`__workq_kernreturn + 10
  thread #20: tid = 0x33866d, 0x0000000000000000
  thread #21: tid = 0x33866e, 0x00007fff202e5e7e libsystem_kernel.dylib`mach_msg_trap + 10, name = 'com.apple.NSEventThread'
* thread #22: tid = 0x3386a5, 0x00000001011925b7 stevenarella`__rust_probestack + 23, stop reason = EXC_BAD_ACCESS (code=2, address=0x70000fac1a98)
((lldb)

Repro: set a breakpoint at b protocol/src/protocol/mod.rs:1148, immediately crashes after stepping into the call

So what exactly is __rust_probestack?

https://github.com/rust-lang/rust/blob/master/src/doc/rustc/src/exploit-mitigations.md#stack-clashing-protection

Stack clashing protection protects the stack from overlapping with another memory region—allowing arbitrary data in both to be overwritten using each other—by reading from the stack pages as the stack grows to cause a page fault when attempting to read from the guard page/region. This is also referred to as “stack probes” or “stack probing”.

The Rust compiler supports stack clashing protection via stack probing, and enables it by default since version 1.20.0 (2017-08-31)[26]–[29].

https://github.com/rust-lang/compiler-builtins
https://www.reddit.com/r/rust/comments/6ig2v3/pcwalton_revives_the_patch_to_land_stack_probes/

This crash only seems to occur in debug builds. In release builds, I can join and connect to a server. But it does indicate a memory corruption issue...somewhere

@iceiix iceiix merged commit 0aa062f into master Dec 25, 2020
@iceiix iceiix deleted the glow branch December 25, 2020 18:00
iceiix added a commit that referenced this pull request Jan 18, 2021
Regression introduced in #262 by the glow port, correct the raw size to
account for the matrix. See #141 (comment)
iceiix added a commit that referenced this pull request Jan 18, 2021
Regression introduced in #262 by the glow port, correct the raw size to
include all elements in set_float_multi().

See #141 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant