In the world of Linux graphical environments, X11 and Wayland are the two major display server protocols that serve as the backbone for rendering graphical user interfaces (GUIs). Each has its own architecture, design philosophy, and approach to window management and compositing. In this article, we will explore the differences between X11 and Wayland, using Qt and GTK applications as examples, and explain how rendering, window management, and compositing are handled in each system.
What Are X11 and Wayland?
Before we dive into the technical details, let’s briefly define X11 and Wayland.
- X11 (X Window System): X11 is a display server protocol that has been the standard on Unix-like systems (including Linux) since the mid-1980s. It operates on a client-server model, where the X server manages the display hardware (monitor, input devices) and decides what ends up on the screen. Applications (clients) send requests to the X server, either drawing commands for the server to carry out or images they have already rendered, and the server displays the result.
- Wayland: Wayland is a newer protocol designed as a simpler, more modern replacement for X11. It aims to improve performance, security, and reliability. Unlike X11, Wayland uses a direct communication model between clients and the compositor, bypassing the need for an X server to manage graphical output.
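A quick way to see which of the two a running session uses is to look at the environment. The sketch below assumes the conventional variables XDG_SESSION_TYPE, WAYLAND_DISPLAY, and DISPLAY, which are not part of either protocol and may be unset or overridden on a given system, so treat the result as a guess rather than a guarantee.

```python
import os


def detect_session(env=None):
    """Guess the display protocol from common environment variables."""
    env = os.environ if env is None else env
    session_type = env.get("XDG_SESSION_TYPE", "").lower()
    if session_type in ("wayland", "x11"):
        return session_type
    # Fall back to the socket variables each system conventionally sets.
    # WAYLAND_DISPLAY is checked first because a Wayland session that also
    # runs Xwayland usually sets DISPLAY as well.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


if __name__ == "__main__":
    print(detect_session())
```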
Let’s now dive deeper into the rendering process, architecture, role of the window manager, and compositor by using Qt and GTK applications as examples.
Architecture and Rendering Process
X11 Architecture
In the X11 model, the X server is the central component responsible for managing the screen and input devices (mouse, keyboard, etc.). Applications interact with the X server through a client-server model, where they send requests (such as drawing shapes, displaying windows, or handling input events) to the X server.
In the context of Qt or GTK applications, here’s how rendering works:
- Qt or GTK Application: In the X11 model, graphical applications (such as those written using Qt, GTK, or other libraries) are the ones that issue requests. These applications use libraries like Xlib or XCB (X C Binding) to communicate with the X server. The core protocol lets an application send drawing instructions (lines, rectangles, text) for the server to execute, but modern Qt and GTK mostly draw their buttons, text, and images themselves, on the CPU or through OpenGL, and then hand the finished pixels to the X server.
- X Server: The X server receives these requests and coordinates the display of the graphical content on the screen. For core drawing requests, the server performs the drawing itself, usually with help from the graphics driver, and on modern systems it often accelerates this work through OpenGL. For applications that render with OpenGL directly, the commands go from the application to the GPU driver without passing through the server, which then only manages the window the result appears in. The X server also manages input events (e.g., mouse clicks or keypresses) and sends them to the correct application.
- Graphics Driver: The X server interacts with the graphics driver, which communicates directly with the GPU. Technologies such as the Direct Rendering Infrastructure (DRI) also let applications reach the GPU through the driver without routing every command through the X server. Either way, the computationally expensive work of rendering is offloaded to the GPU, which is much faster than drawing on the CPU.
- Window Manager: X11 relies on a window manager to handle window placement, resizing, and decoration. The window manager may be integrated into a desktop environment (e.g., GNOME, KDE) or run as a standalone component (e.g., i3, Openbox). It interacts with the X server to control the placement and focus of application windows.
- Compositor: While window managers traditionally handle window management (placement, focus, etc.), compositors in the X11 world are responsible for composing the final image that is shown on the screen. Compositing can involve special effects such as transparency, window shadows, and smooth transitions.
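To make the client-server model a little more concrete, here is a minimal sketch of the very first message a client sends after connecting to an X server: the connection setup request. It only builds the bytes and does not open a socket. The helper name x11_setup_request is made up for this example; in practice Xlib or XCB does this for you.

```python
import struct


def x11_setup_request(auth_name=b"", auth_data=b""):
    """Build the connection setup message a client sends to an X server."""

    def pad4(data):
        # Variable-length fields are padded to a multiple of 4 bytes.
        return data + b"\x00" * (-len(data) % 4)

    header = struct.pack(
        "<BxHHHHxx",
        0x6C,            # 'l': the client uses little-endian byte order
        11,              # protocol major version (the "11" in X11)
        0,               # protocol minor version
        len(auth_name),  # length of the authorization protocol name
        len(auth_data),  # length of the authorization data
    )
    return header + pad4(auth_name) + pad4(auth_data)
```

Everything after this handshake follows the same pattern: the client writes requests to the connection, and the server answers with replies, events, or errors.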

What Is Rendering?
Rendering is the process of generating a visual representation of data, typically in the context of computer graphics. In simpler terms, it refers to the act of converting raw data (such as 3D models, textures, or graphical instructions) into a final image or animation that can be displayed on a screen or saved in a file. Rendering is crucial in fields like video games, animation, architecture visualization, and graphical user interfaces (GUIs).
Types of Rendering
There are different types of rendering, depending on the application:
- 2D Rendering: Displaying 2D graphics, such as images, shapes, or text, on a screen. The rendering process takes input (like vector data or bitmap images) and outputs a 2D image.
- 3D Rendering: In 3D graphics, rendering involves creating a 2D image from a 3D model or scene. This can involve calculating lighting, shadows, textures, and the perspective of objects. 3D rendering is more complex, as it involves simulating how light interacts with objects in a 3D space.
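As a tiny illustration of 2D rendering, the sketch below turns a description of a rectangle (the input data) into a grid of pixels (the output image). It is a deliberately simplified software rasterizer with made-up function names, not how Qt, GTK, or a GPU actually draws.

```python
def render_rect(width, height, rect):
    """Rasterize one filled rectangle into a width x height grid of 0/1 pixels."""
    x0, y0, w, h = rect
    grid = [[0] * width for _ in range(height)]
    # Clip the rectangle to the image so out-of-bounds parts are ignored.
    for y in range(max(y0, 0), min(y0 + h, height)):
        for x in range(max(x0, 0), min(x0 + w, width)):
            grid[y][x] = 1
    return grid


def to_text(grid):
    """Show the pixel grid as text: '#' for a set pixel, '.' for an empty one."""
    return "\n".join("".join("#" if p else "." for p in row) for row in grid)


if __name__ == "__main__":
    print(to_text(render_rect(6, 4, (1, 1, 3, 2))))
```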
OpenGL Rendering
- If the application uses OpenGL for rendering (whether for 2D or 3D graphics), the X server does not directly render the content. Instead, the application talks to the OpenGL driver, which in turn talks to the GPU.
- The OpenGL API allows the application to submit rendering commands to the GPU via the driver. The driver, in turn, uses the GPU to process the rendering tasks (e.g., drawing 3D objects, textures, lighting effects, etc.).
- This involves the GPU handling the actual drawing of pixels or buffers in memory. For 3D applications, this involves processing vertex and fragment data, and for 2D rendering, it involves manipulating pixel data or textures.
Framebuffer and Offscreen Rendering
- The rendered output is typically stored in a framebuffer in video memory (GPU memory). This framebuffer contains the image that will eventually be displayed on the screen.
- The X server is not directly involved in the rendering process itself but plays a role in managing window buffers and offscreen rendering. If the application uses OpenGL, the X server still tracks where the window is and which buffer backs it, so that the rendered result ends up in the right place.
- In the case of compositing, the X server keeps the contents of each window in a separate offscreen buffer and makes these buffers available to a compositor for final composition.
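A framebuffer is, at bottom, a flat block of memory, and a pixel is found by simple arithmetic on its coordinates. The sketch below assumes a 32-bit XRGB8888 layout (a common but not universal format) and uses a Python bytearray in place of video memory; the function names are invented for this example.

```python
import struct

BYTES_PER_PIXEL = 4  # XRGB8888: one unused byte plus 8 bits each of R, G, B


def make_framebuffer(width, height, stride=None):
    """Allocate a zeroed buffer; stride is the number of bytes per row."""
    stride = width * BYTES_PER_PIXEL if stride is None else stride
    return bytearray(stride * height), stride


def put_pixel(fb, stride, x, y, rgb):
    r, g, b = rgb
    offset = y * stride + x * BYTES_PER_PIXEL
    # Stored as the little-endian 32-bit word 0x00RRGGBB,
    # so the bytes in memory are B, G, R, X.
    struct.pack_into("<I", fb, offset, (r << 16) | (g << 8) | b)


def get_pixel(fb, stride, x, y):
    (value,) = struct.unpack_from("<I", fb, y * stride + x * BYTES_PER_PIXEL)
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF
```

The stride is kept separate from the width because real buffers often pad each row for alignment, so the next row does not always start exactly width * 4 bytes later.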
Passing the Rendered Output to the Display Driver
- Once rendering is complete, the X server’s role is to manage the output to the display.
- If no compositor is used, the X server simply tells the display driver to show the contents of the framebuffer on the screen. This may involve copying the framebuffer contents to the screen via the graphics driver.
- If compositing is involved, a compositor takes the individual window buffers and combines them into a final image. This can be a standalone program (such as picom, the successor to Compton) or part of the desktop environment’s window manager. The compositor then hands off this final image, via the X server, to the display driver to be shown on the screen.
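The compositing step can be sketched in a few lines: paint each window buffer onto the screen from bottom to top, blending by alpha. This is a CPU-only toy with invented names, assuming straight (non-premultiplied) alpha; real compositors such as picom do the equivalent work on the GPU.

```python
def blend_over(src, dst):
    """Blend a straight-alpha (r, g, b, a) pixel over an opaque (r, g, b) pixel."""
    sr, sg, sb, sa = src
    dr, dg, db = dst
    a = sa / 255.0
    return tuple(
        round(s * a + d * (1.0 - a)) for s, d in ((sr, dr), (sg, dg), (sb, db))
    )


def composite(screen_w, screen_h, windows, background=(0, 0, 0)):
    """Combine windows into one image.

    windows is a list of (x, y, pixels) in bottom-to-top order, where pixels
    is a list of rows of (r, g, b, a) tuples.
    """
    screen = [[background] * screen_w for _ in range(screen_h)]
    for wx, wy, pixels in windows:
        for row_idx, row in enumerate(pixels):
            for col_idx, px in enumerate(row):
                x, y = wx + col_idx, wy + row_idx
                if 0 <= x < screen_w and 0 <= y < screen_h:
                    screen[y][x] = blend_over(px, screen[y][x])
    return screen
```

Transparency and shadows fall out of the same operation: a shadow is just another partially transparent layer painted underneath the window.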
Display Driver
- The display driver is responsible for interfacing with the hardware to output the final image on the screen. The driver will take the contents of the framebuffer (or composed final image) and manage the actual drawing of pixels on the monitor.
- The display driver can perform additional operations such as synchronization, vertical blanking (V-Sync), and other low-level optimizations to ensure smooth rendering.
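The synchronization mentioned above is often implemented with two buffers: the display scans out one while the next frame is drawn into the other, and the two are swapped only during vertical blanking. The class below is a toy model of that idea with made-up names; it does not talk to any real driver.

```python
class DoubleBufferedDisplay:
    """Toy model: the monitor shows `front` while the renderer draws into `back`."""

    def __init__(self):
        self.front = "frame 0"
        self.back = None
        self.flip_pending = False

    def render(self, frame):
        # Drawing never touches the buffer that is currently on screen.
        self.back = frame
        self.flip_pending = True  # ask for a swap at the next vertical blank

    def vblank(self):
        """Called once per refresh; swapping only here avoids showing a half-drawn frame."""
        if self.flip_pending:
            self.front, self.back = self.back, self.front
            self.flip_pending = False
        return self.front
```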
Recap: Key Steps Involved
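- Application sends requests to the X server (creating windows, drawing, presenting finished images). When it renders with OpenGL, the rendering commands go to the GPU through the graphics driver.
- X server manages window positioning and buffers, and routes input events to the right application.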
- OpenGL rendering occurs on the GPU, and the resulting image is stored in a framebuffer (video memory).
- The X server or compositor collects the individual window buffers (if compositing) and combines them.
- The display driver takes the final image and displays it on the screen.
Wayland Architecture
Wayland differs in that it eliminates the need for a separate X server. Instead, Wayland uses a compositor that manages both the windowing system and the compositing of graphical elements. Here’s how this works in the context of a Qt or GTK application:
- Qt or GTK Application: Just like in the X11 case, a Qt or GTK application makes API calls to the respective framework. However, instead of sending requests to an X server, the application directly communicates with the Wayland compositor.
- Wayland Compositor: The Wayland compositor is responsible for handling both window management and composition. When an application wants to display content, it renders that content into a buffer itself and hands the finished buffer to the compositor. The compositor is in charge of deciding where and how each window should appear, as well as how to render special effects like shadows or animations.
- Window Management: Unlike X11, in Wayland, the compositor handles window management directly. The window manager and compositor are combined into a single component. It determines the position and size of windows and also handles events like moving or resizing windows.
- Rendering: One of the key differences with Wayland is that it eliminates the need for a server to handle rendering requests. Each application renders its own window contents, and the compositor is responsible only for composing the image that gets displayed on the screen. It does so by directly using GPU resources, often through technologies like OpenGL or Vulkan.
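The direct communication between application and compositor happens over a Unix socket, using small binary messages addressed to numbered objects. The sketch below builds one such message, the get_registry request that clients typically send first, without opening a socket. The helper names are invented for this example; real applications rely on libwayland (usually via Qt or GTK) for this.

```python
import struct


def wl_message(object_id, opcode, payload=b""):
    """Build a Wayland wire message: object id, (size << 16 | opcode), arguments."""
    if len(payload) % 4:
        raise ValueError("arguments are padded to 32-bit words")
    size = 8 + len(payload)  # the size includes the 8-byte header
    return struct.pack("=II", object_id, (size << 16) | opcode) + payload


def wl_parse_header(data):
    """Return (object id, opcode, size) from the start of a message."""
    object_id, word = struct.unpack_from("=II", data)
    return object_id, word & 0xFFFF, word >> 16


# wl_display is always object 1. Its get_registry request has opcode 1 and
# takes one new_id argument (here 2, the id the client picks for the registry).
GET_REGISTRY = wl_message(1, 1, struct.pack("=I", 2))
```

Note that nothing in this message describes drawing. The protocol carries object management, buffer hand-off, and input events, while the pixels themselves are produced entirely on the application side.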

Direct Rendering in Wayland
In a Wayland-based system, each application renders directly into its own buffers using hardware acceleration, with no X server in the path. This can lower latency and avoids extra hand-offs between processes. It contrasts with the traditional X11 model, where applications submit rendering requests to the X server, which then carries them out with the help of the GPU.
Here’s how Direct Rendering in Wayland works:
Application to OpenGL Driver (GPU Rendering)
In a Wayland system, when an application wants to render graphics (e.g., using Qt, GTK, or directly using OpenGL or Vulkan), it directly communicates with the OpenGL driver (or Vulkan driver). This step works similarly to other systems where the application issues rendering commands.
- The application creates a rendering context (e.g., through OpenGL or Vulkan).
- It then issues OpenGL (or Vulkan) rendering commands that describe what it wants to draw (e.g., shapes, textures, images).
- These rendering commands are processed by the OpenGL driver, which translates them into instructions that the GPU can execute.
GPU Rendering and Memory Management
Once the OpenGL driver receives the commands, it sends them to the GPU for processing:
- The GPU executes the rendering commands. If the application is rendering a 3D scene or using OpenGL for 2D graphics, the GPU performs all the necessary operations, like drawing pixels, applying shaders, and handling textures.
- The result of this rendering process is stored in GPU memory, typically in a buffer owned by the application. This buffer is where the rendered image is kept, ready to be handed to the compositor.
Wayland Compositor’s Role
Now, here’s where the Wayland compositor comes into play:
- Wayland clients (the applications) create and manage buffers to hold the rendered images. These buffers live in memory that can be shared with the compositor: ordinary shared memory for content drawn on the CPU, or GPU memory for content rendered with OpenGL or Vulkan. After the application renders its content to the buffer, it submits this buffer to the Wayland compositor.
- The Wayland compositor is responsible for managing the overall screen layout and window composition. It collects these buffers from multiple applications (clients), composites them together into a single image (if needed), and prepares the final image that will be displayed on the screen.
- The compositor may use GPU-based compositing (via OpenGL or Vulkan) to combine the individual buffers into a final composed image. This means the compositor itself can also leverage hardware acceleration to efficiently perform this task.
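For content drawn on the CPU, handing a buffer to the compositor comes down to two mappings of the same memory: the application writes pixels through one, and the compositor reads them through the other without any copy. The sketch below imitates this inside a single process, using a temporary file as the backing store. A real client would pass a file descriptor to the compositor over the Wayland socket instead, and the function names here are invented for the example.

```python
import mmap
import struct
import tempfile


def create_shared_buffer(width, height):
    """Back a width x height, 4-bytes-per-pixel buffer with an unnamed temp file."""
    stride = width * 4
    backing = tempfile.TemporaryFile()
    backing.truncate(stride * height)
    return backing, stride


def map_buffer(backing, writable):
    access = mmap.ACCESS_WRITE if writable else mmap.ACCESS_READ
    return mmap.mmap(backing.fileno(), 0, access=access)


def demo():
    backing, stride = create_shared_buffer(4, 4)
    client_view = map_buffer(backing, writable=True)       # the application's mapping
    compositor_view = map_buffer(backing, writable=False)  # the compositor's mapping
    offset = 2 * stride + 1 * 4                            # pixel at x=1, y=2
    struct.pack_into("<I", client_view, offset, 0xFF336699)
    (seen,) = struct.unpack_from("<I", compositor_view, offset)
    client_view.close()
    compositor_view.close()
    backing.close()
    return seen
```

Calling demo() returns the value the "compositor" mapping sees, which is the same pixel the "application" mapping wrote.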
Display Driver: Final Output to the Screen
Once the compositor has prepared the final image (which could be composed of the application buffers and possibly other graphical effects), it passes this image to the display driver:
- The display driver is responsible for interfacing with the actual hardware (the monitor or display) and ensuring the final image is shown correctly.
- The display driver communicates with the GPU to handle framebuffer swaps (if multiple framebuffers are involved) and to update the screen. This involves sending the rendered image to the screen via the GPU’s display pipeline, ensuring that the final composed image is output to the monitor.
Recap: Key Steps Involved
- Application renders content (e.g., using OpenGL or Vulkan).
- The OpenGL/Vulkan driver processes the rendering commands and sends them to the GPU, which stores the rendered content in GPU memory (framebuffer).
- The Wayland compositor collects the rendered buffers from applications (clients) and combines them using hardware acceleration (OpenGL/Vulkan).
- The compositor sends the final image to the display driver, which outputs the content to the screen.
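The hand-off between application and compositor in these steps can be modeled as a small state machine. In the Wayland protocol a client attaches a buffer to a surface and then commits it, and the compositor only ever uses the committed state. The classes below are a toy model of that behavior, not bindings to a real Wayland library.

```python
class Surface:
    """Toy model of a Wayland surface: state is staged, then applied on commit."""

    def __init__(self):
        self.pending = None  # buffer attached but not yet committed
        self.current = None  # buffer the compositor will use

    def attach(self, buffer):
        self.pending = buffer

    def commit(self):
        # The new buffer becomes visible to the compositor in one atomic step.
        if self.pending is not None:
            self.current, self.pending = self.pending, None


class Compositor:
    def __init__(self):
        self.surfaces = []  # bottom-to-top stacking order

    def repaint(self):
        """Collect the committed buffer of every surface for the next frame."""
        return [s.current for s in self.surfaces if s.current is not None]
```

Because an attached buffer stays invisible until commit, the compositor never picks up a frame the application is still preparing.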
In this architecture, direct rendering is the normal path: applications (clients) interact directly with the OpenGL/Vulkan drivers, and the compositor uses hardware-accelerated compositing. With no intermediate server between a client and the compositor, the system can be more efficient and responsive than a traditional X11 setup.
How X11 Handles Screen Sharing and Remote Desktop?
In X11, the display server has direct access to all windows, input devices, and screen content. The basic X11 protocol itself doesn’t include built-in screen-sharing functionality, but the client-server model enables communication across a network. A remote client can request to display its window on another X server using the X11 protocol, even if the client and server are on different machines. So you can run applications on a remote machine and display them locally using the DISPLAY environment variable and tools like ssh -X or ssh -Y. Remote desktop solutions like Xvnc and X2Go build on this client-server design for full desktop sharing.
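The DISPLAY variable tells a client which X server to connect to, in the form host:display.screen. The sketch below splits such a value into its parts; the function name is made up, and the port calculation reflects the usual convention of TCP port 6000 plus the display number.

```python
def parse_display(value):
    """Split an X11 DISPLAY string of the form [host]:display[.screen]."""
    host, _, rest = value.rpartition(":")
    display, _, screen = rest.partition(".")
    return {
        "host": host or None,             # an empty host means a local connection
        "display": int(display),
        "screen": int(screen) if screen else 0,
        "tcp_port": 6000 + int(display),  # conventional TCP port for that display
    }
```

A local session usually has a value like ":0", while a session forwarded with ssh -X typically sees something like "localhost:10.0", which points at a proxy display that ssh tunnels back to the local X server.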
How Wayland Handles Screen Sharing and Remote Desktop?
Wayland does not allow direct access to the screen buffer or input events for security reasons. Unlike X11, Wayland does not have built-in support for running remote applications over the network, and applications cannot snoop on or capture the screen without explicit permission. Screen sharing is instead handled through xdg-desktop-portal and PipeWire: an application requests screen sharing via a portal, the user explicitly approves or denies it, and the approved content is then streamed to the application through PipeWire. The compositor mediates screen access throughout, ensuring only authorized applications can capture content.
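The permission model can be pictured as a gatekeeper that refuses to hand out screen contents until the user has said yes. The class below is only a toy stand-in with invented names. The real portal is a D-Bus service, and frames are delivered as a PipeWire stream rather than returned from a method call.

```python
class ScreenCastPortal:
    """Toy stand-in for the screen-sharing gatekeeper; not the real D-Bus API."""

    def __init__(self, ask_user):
        self.ask_user = ask_user  # callback returning True (allow) or False (deny)
        self.granted = set()

    def request(self, app_id):
        """Ask the user whether app_id may capture the screen."""
        if self.ask_user(app_id):
            self.granted.add(app_id)
            return True
        return False

    def capture_frame(self, app_id, screen_contents):
        """Hand out screen contents only to applications the user approved."""
        if app_id not in self.granted:
            raise PermissionError(f"{app_id} has no screen-sharing grant")
        return screen_contents
```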
Xwayland
Xwayland is an X server that runs under Wayland. It provides compatibility for X11 applications that do not yet support Wayland. This means that even though you’re using a Wayland compositor, you can still run older X11 applications that haven’t been updated to work with Wayland. Xwayland speaks X11 to those applications and acts as a Wayland client toward the compositor, so their windows appear on your Wayland display alongside native ones.
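Qt and GTK applications that support both systems can usually be steered toward one backend or the other through environment variables, which is a handy way to compare native Wayland against Xwayland. The sketch below assumes the variables QT_QPA_PLATFORM and GDK_BACKEND, which belong to the toolkits rather than to X11 or Wayland; individual applications may ignore or override them, and the helper name is made up.

```python
import os

# Values each toolkit conventionally accepts for the two backends.
BACKENDS = {
    "wayland": {"QT_QPA_PLATFORM": "wayland", "GDK_BACKEND": "wayland"},
    "x11": {"QT_QPA_PLATFORM": "xcb", "GDK_BACKEND": "x11"},
}


def toolkit_env(backend, base_env=None):
    """Return a copy of the environment that asks Qt and GTK for one backend."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(BACKENDS[backend])
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when launching an application. With the "x11" settings inside a Wayland session, the application would connect through Xwayland.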
Conclusion
In summary, both X11 and Wayland are display server protocols, but Wayland represents a modern and streamlined approach to rendering and window management. X11 has been the standard for decades, with its X server, window manager, and compositor being separate entities. Wayland, on the other hand, combines these components into a single compositor, which simplifies the stack and can improve the performance of window management and rendering.
For Qt and GTK applications, the difference is primarily in how the graphics reach the screen. On X11, the applications communicate with the X server and window manager, while on Wayland, they communicate directly with the Wayland compositor. This can lead to better performance and more efficient use of system resources on Wayland.
As Linux desktop environments continue to evolve, Wayland is poised to replace X11 as the default display server in most distributions, providing a smoother, more modern experience for both developers and users.