EGL BadNativeWindow Error: Surface API Architecture Mismatch
Motivation
Someone asked in Zed's Discord (#GPUI) whether Zed (and other GPUI applications) can run with OpenGL ES instead of Vulkan on Linux. blade-graphics does support GLES, but running with the gles config flag crashes immediately:
thread 'main' (881883) panicked at blade-graphics-0.7.0/src/gles/egl.rs:431:26:
called `Result::unwrap()` on an `Err` value: BadNativeWindow
This issue affects any windowed application using blade's GLES backend on Linux with Wayland or X11.
Root Cause Analysis
The crash stems from an architectural mismatch between blade's Surface API design and EGL's initialization requirements:
The Problem
Blade's Surface API (works fine in Vulkan, Metal, headless and ANGLE EGL):
Context::init() → create platform-agnostic context
create_surface(window) → create surface for specific window
EGL's Requirements:
Context::init(display) → requires platform-specific display (X11/Wayland/etc.) or compositor (Mutter/Sway)
create_surface(window) → create surface on that display
What's Happening
- During context initialization ([egl.rs:154](https://github.com/kvark/blade/blob/main/blade-graphics/src/gles/egl.rs#L154)), EGL selects a display platform based solely on available extensions, without any window system information:
pub unsafe fn init(desc: crate::ContextDesc) -> Result<Self, crate::NotSupportedError> {
// No window handle available here!
let display = if let Some(egl1_5) = egl.upcast::<egl::EGL1_5>() {
// ...
} else if client_extensions.contains("EGL_MESA_platform_surfaceless") {
// ❌ Selects surfaceless even for windowed apps!
egl1_5.get_platform_display(EGL_PLATFORM_SURFACELESS_MESA, ...)
}
// ...
}
The surfaceless platform is designed for headless/offscreen rendering and cannot create window surfaces.
- During surface creation ([egl.rs:322](https://github.com/kvark/blade/blob/main/blade-graphics/src/gles/egl.rs#L322)), the code attempts to create a window surface on the incompatible surfaceless display:
egl.create_platform_window_surface(
inner.egl.display, // ❌ Surfaceless display!
inner.egl.config,
native_window_ptr, // Wayland/X11 window
&attributes_usize,
)
.unwrap() // 💥 Panics with BadNativeWindow
Why This Regression Occurred
The platform-specific display initialization code was removed during the Surface API refactoring (see [PR #203](https://github.com/kvark/blade/pull/203/files#diff-fd9eb1747f951b0069bfdd20edb28c3c90a0366a3ebd6898f2f24c7383fbb899)). The old init_windowed function properly detected X11/Wayland displays, but the unified init function lost this logic.
Evidence of the regression:
- Platform constants are defined but unused ([egl.rs:11](https://github.com/kvark/blade/blob/main/blade-graphics/src/gles/egl.rs#L11)):
const _EGL_PLATFORM_WAYLAND_KHR: u32 = 0x31D8;
const _EGL_PLATFORM_X11_KHR: u32 = 0x31D5;
const _EGL_PLATFORM_XCB_EXT: u32 = 0x31DC;
- Window handle type detection exists only in surface creation ([egl.rs:327](https://github.com/kvark/blade/blob/main/blade-graphics/src/gles/egl.rs#L327)), after the display is already selected
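To make the missing step concrete, here is a minimal, self-contained sketch of the mapping that display selection would need: from the kind of window system to an EGL platform enum. `DisplayKind` and `egl_platform_for` are hypothetical names used only for illustration (real code would presumably match on `raw_window_handle::RawDisplayHandle`). The constant values are the ones quoted above, written without the leading underscore.

```rust
// EGL platform enums, with the values quoted above.
const EGL_PLATFORM_X11_KHR: u32 = 0x31D5;
const EGL_PLATFORM_WAYLAND_KHR: u32 = 0x31D8;
const EGL_PLATFORM_XCB_EXT: u32 = 0x31DC;

/// Hypothetical stand-in for `raw_window_handle::RawDisplayHandle`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DisplayKind {
    Xlib,
    Xcb,
    Wayland,
    /// Any window system this sketch does not cover.
    Other,
}

/// Returns the EGL platform enum for a display kind,
/// or `None` when this sketch has no mapping for it.
fn egl_platform_for(kind: DisplayKind) -> Option<u32> {
    match kind {
        DisplayKind::Xlib => Some(EGL_PLATFORM_X11_KHR),
        DisplayKind::Xcb => Some(EGL_PLATFORM_XCB_EXT),
        DisplayKind::Wayland => Some(EGL_PLATFORM_WAYLAND_KHR),
        DisplayKind::Other => None,
    }
}
```

Returning `None` for unknown kinds, rather than silently falling back to surfaceless, is a choice made for this sketch so that a windowed request can never end up on a display that cannot create window surfaces.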
Proposed Solutions
I've evaluated three approaches to fix this architectural mismatch:
Option 1: Lazy EGL Initialization (Recommended) ⭐
Defer EGL context creation until the first surface is created, when proper display information is available.
Advantages:
- ✅ Maintains API compatibility (no breaking changes)
- ✅ Fits EGL's natural model (context tied to actual display)
- ✅ Handles multi-window correctly (all surfaces from same display)
- ✅ Follows the pattern already working in the Vulkan backend
Implementation sketch:
struct ContextInner {
egl: Option<EglContext>, // Uninitialized until first surface
glow: Option<glow::Context>,
// ...
}
impl super::Context {
fn ensure_egl_initialized(
&self,
display_handle: raw_window_handle::RawDisplayHandle
) -> Result<()> {
let mut inner = self.platform.inner.lock().unwrap();
if inner.egl.is_none() {
// Initialize EGL with proper platform display
let egl_context = match display_handle {
RawDisplayHandle::Xlib(h) => {
EglContext::init_with_platform(EGL_PLATFORM_X11_KHR, h.display)?
}
RawDisplayHandle::Wayland(h) => {
EglContext::init_with_platform(EGL_PLATFORM_WAYLAND_KHR, h.display)?
}
// ... other platforms
};
inner.egl = Some(egl_context);
inner.glow = Some(/* load GL functions */);
}
Ok(())
}
pub fn create_surface<I: HasWindowHandle + HasDisplayHandle>(
&self,
window: I,
) -> Result<super::Surface, NotSupportedError> {
// Initialize EGL with actual display information
self.ensure_egl_initialized(window.display_handle()?.as_raw())?;
// ... existing surface creation code
}
}
Option 2: Platform Display in ContextDesc
Add platform-specific display information to ContextDesc:
pub struct ContextDesc {
pub presentation: bool,
// ...
#[cfg(gles)]
pub platform_display: Option<PlatformDisplay>,
}
Disadvantages:
- Requires breaking API changes
- Forces applications to handle platform detection
- Less ergonomic for common cases
Option 3: Extension Trait for GLES
Create platform-specific initialization API:
#[cfg(gles)]
pub trait GlesContextExt: Sized {
fn init_with_display<I: HasDisplayHandle>(
display: I,
desc: ContextDesc,
) -> Result<Self, NotSupportedError>;
}
Disadvantages:
- Requires different initialization path for GLES vs Vulkan
- More complex API surface
- Harder to maintain backend parity
Recommendation
Option 1 (Lazy Initialization) best addresses the architectural mismatch while maintaining API compatibility. It acknowledges that EGL and Vulkan have fundamentally different initialization models and adapts the EGL backend accordingly.
The Vulkan backend already demonstrates the correct pattern of extracting display handles during surface creation ([surface.rs:69-72](https://github.com/kvark/blade/blob/main/blade-render/src/surface.rs#L69-L72)). EGL should follow this same timing, just with deferred context initialization.
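The deferred-initialization timing can be shown in isolation with a small, self-contained sketch. Everything here is hypothetical and stands in for the real blade types: `LazyContext`, `EglState`, `DisplayKind`, and `SurfaceError` are illustration-only names, and no EGL calls are made. The display-mismatch check is an assumption of this sketch, not something the proposal above specifies.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the window system a surface belongs to.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DisplayKind {
    X11,
    Wayland,
}

/// Hypothetical stand-in for EGL state that needs a real display to be created.
#[derive(Debug)]
struct EglState {
    display: DisplayKind,
}

#[derive(Debug, PartialEq, Eq)]
enum SurfaceError {
    /// A later surface asked for a different display than the one EGL was set up with.
    DisplayMismatch,
}

/// Context whose EGL state stays empty until the first surface is requested.
struct LazyContext {
    egl: Mutex<Option<EglState>>,
}

impl LazyContext {
    /// Creates the context without touching EGL (no display is known yet).
    fn new() -> Self {
        Self { egl: Mutex::new(None) }
    }

    /// Initializes the EGL state on first use with the caller's display;
    /// later calls reuse that state and only check that the display matches.
    fn create_surface(&self, display: DisplayKind) -> Result<DisplayKind, SurfaceError> {
        let mut guard = self.egl.lock().unwrap();
        let state = guard.get_or_insert_with(|| EglState { display });
        if state.display == display {
            Ok(state.display)
        } else {
            Err(SurfaceError::DisplayMismatch)
        }
    }

    /// Reports whether the deferred initialization has happened yet.
    fn is_initialized(&self) -> bool {
        self.egl.lock().unwrap().is_some()
    }
}
```

In this sketch, a context created with `LazyContext::new()` reports `is_initialized() == false` until the first `create_surface` call. Every later surface then shares the display chosen at that point, which is the multi-window behavior Option 1 relies on.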
Additional Context
- Running on: Linux with GNOME (Mutter/Wayland compositor)
- The ContextDesc.presentation flag already exists but is currently unused during EGL platform selection
WDYT?