NuttX RTOS for PinePhone: Display Engine

📝 23 Dec 2022

Rendering graphics on PinePhone with Apache NuttX RTOS

Apache NuttX RTOS for Pine64 PinePhone (pic above) now supports Allwinner A64 Display Engine!

We’re one step closer to completing our NuttX Display Driver for PinePhone.

Let’s find out how our NuttX Display Driver will call A64 Display Engine to render graphics on PinePhone’s LCD Display.

Complete Display Driver for PinePhone

§1 Allwinner A64 Display Engine

Inside PinePhone’s Allwinner A64 SoC (pic above) is the A64 Display Engine that…

Previously we talked about the A64 Display Engine and coding it with Zig…

Today we’ll program it with the NuttX Kernel Driver for the Display Engine.

3 Framebuffers for 3 UI Channels

§2 UI Channels

A64 Display Engine supports up to 3 Framebuffers in RAM (pic above). Each pixel has 32-bit ARGB 8888 format.

The Display Engine renders the 3 Framebuffers as 3 UI Channels, blended together into the displayed image…
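To get a feel for what "blended together" means, here is a small software model in C. It's only a sketch: we assume conventional (non-premultiplied) source-over alpha blending, and the helper names (argb8888, blend_over, blend_channels) are made up for this illustration. The A64 hardware blender may well be configured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Pack 8-bit alpha, red, green and blue into one ARGB 8888 pixel
static uint32_t argb8888(uint8_t a, uint8_t r, uint8_t g, uint8_t b) {
  return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
         ((uint32_t)g << 8)  |  (uint32_t)b;
}

// Blend one 8-bit colour component: src over dst, weighted by alpha
// (rounded to the nearest integer)
static uint8_t blend_component(uint8_t src, uint8_t dst, uint8_t alpha) {
  return (uint8_t)((src * alpha + dst * (255 - alpha) + 127) / 255);
}

// Blend an ARGB 8888 overlay pixel over an opaque pixel.
// Assumption: plain source-over blending, result is fully opaque.
static uint32_t blend_over(uint32_t overlay, uint32_t base) {
  const uint8_t a = (uint8_t)(overlay >> 24);
  return argb8888(
    0xFF,
    blend_component((uint8_t)(overlay >> 16), (uint8_t)(base >> 16), a),
    blend_component((uint8_t)(overlay >> 8),  (uint8_t)(base >> 8),  a),
    blend_component((uint8_t)overlay,         (uint8_t)base,         a));
}

// Blend `count` channel pixels (bottom channel first) into one pixel
static uint32_t blend_channels(const uint32_t *pixels, int count) {
  assert(pixels != NULL && count >= 1);
  uint32_t out = pixels[0] | 0xFF000000u;  // Treat the base channel as opaque
  for (int i = 1; i < count; i++) {
    out = blend_over(pixels[i], out);
  }
  return out;
}
```

With this model, a fully transparent overlay pixel (alpha 0) leaves the pixel below unchanged, and a fully opaque one (alpha 0xFF) replaces it.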

Blending the UI Channels

Let’s start with the 3 Framebuffers: test_a64_de.c
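The Framebuffers are plain arrays of 32-bit pixels. Here is a minimal sketch of how they might be declared, assuming the sizes quoted elsewhere in this article (720 x 1440 for the panel, 600 x 600 for Framebuffer 1). The actual declarations in test_a64_de.c may differ.

```c
#include <assert.h>
#include <stdint.h>

#define PANEL_WIDTH  720   // PinePhone screen width in pixels
#define PANEL_HEIGHT 1440  // PinePhone screen height in pixels
#define FB1_WIDTH    600   // Width of the square overlay
#define FB1_HEIGHT   600   // Height of the square overlay

// Framebuffer 0 (UI Channel 1): Fullscreen, 4 bytes per pixel
uint32_t fb0[PANEL_WIDTH * PANEL_HEIGHT];

// Framebuffer 1 (UI Channel 2): Square 600 x 600
uint32_t fb1[FB1_WIDTH * FB1_HEIGHT];

// Framebuffer 2 (UI Channel 3): Fullscreen
uint32_t fb2[PANEL_WIDTH * PANEL_HEIGHT];

// Each pixel must occupy exactly 4 bytes (32 bits)
static_assert(sizeof(fb0[0]) == 4, "Pixels must be 32-bit");
```

Declared this way, sizeof(fb0) gives the Framebuffer length in bytes and sizeof(fb0) / sizeof(fb0[0]) gives the number of pixels.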

Let’s wrap the 3 Framebuffers (fb0, fb1 and fb2) with the NuttX Framebuffer Interface…

PinePhone Framebuffer

§3 NuttX Framebuffer

NuttX expects our PinePhone Display Driver to provide a Framebuffer Interface for rendering graphics.

Let’s define the NuttX Framebuffer: test_a64_de.c

// TODO: Run `make menuconfig`
// Select "System Type > Allwinner A64 Peripheral Selection > DE"
// Select "System Type > Allwinner A64 Peripheral Selection > RSB"
// Select "Build Setup > Debug Options > Graphics Debug Features > Error + Warnings + Info"
// Select "Build Setup > Debug Options > Battery-related Debug Features > Error + Warnings + Info"
// Select "Device Drivers > Framebuffer Overlay Support"
// Save config and exit menuconfig

// NuttX Framebuffer Interface
#include <nuttx/video/fb.h>

// 3 UI Channels: 1 Base Channel + 2 Overlay Channels
#define CHANNELS 3

// NuttX Video Controller for PinePhone (3 UI Channels)
static struct fb_videoinfo_s videoInfo = {
  .fmt       = FB_FMT_RGBA32,  // Pixel format (XRGB 8888)
  .xres      = PANEL_WIDTH,    // Horizontal resolution in pixel columns
  .yres      = PANEL_HEIGHT,   // Vertical resolution in pixel rows
  .nplanes   = 1,     // Number of color planes supported (Base UI Channel)
  .noverlays = 2      // Number of overlays supported (2 Overlay UI Channels)
};

The fb_videoinfo_s struct defines the overall PinePhone Display Interface…

This is how we define Framebuffer 0 (UI Channel 1): test_a64_de.c

// NuttX Color Plane for PinePhone (Base UI Channel):
// Fullscreen 720 x 1440 (4 bytes per XRGB 8888 pixel)
static struct fb_planeinfo_s planeInfo = {
  .fbmem   = &fb0,         // Start of frame buffer memory
  .fblen   = sizeof(fb0),  // Length of frame buffer memory in bytes
  .stride  = PANEL_WIDTH * 4,  // Length of a line in bytes (4 bytes per pixel)
  .display = 0,   // Display number (Unused)
  .bpp     = 32,  // Bits per pixel (XRGB 8888)
  .xres_virtual = PANEL_WIDTH,   // Virtual Horizontal resolution in pixel columns
  .yres_virtual = PANEL_HEIGHT,  // Virtual Vertical resolution in pixel rows
  .xoffset      = 0,  // Offset from virtual to visible resolution
  .yoffset      = 0   // Offset from virtual to visible resolution
};

(fb_planeinfo_s is defined here)
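The stride and fblen fields follow directly from the panel size and the pixel size. Here is a quick sketch of the arithmetic (the helper names are ours, purely for illustration; they are not part of NuttX):

```c
#include <assert.h>
#include <stddef.h>

#define BYTES_PER_PIXEL 4  // 32 bits per pixel

// Length of one pixel row in bytes (the `stride` field)
static size_t fb_stride(size_t width) {
  return width * BYTES_PER_PIXEL;
}

// Total length of the Framebuffer in bytes (the `fblen` field)
static size_t fb_length(size_t width, size_t height) {
  return fb_stride(width) * height;
}

// Byte offset of pixel (x, y) from the start of the Framebuffer
static size_t fb_pixel_offset(size_t width, size_t height,
                              size_t x, size_t y) {
  assert(x < width && y < height);
  return y * fb_stride(width) + x * BYTES_PER_PIXEL;
}
```

For the fullscreen 720 x 1440 Framebuffer, that works out to a stride of 2,880 bytes and a length of 4,147,200 bytes.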

And Framebuffers 1 and 2 (UI Channels 2 and 3): test_a64_de.c

// NuttX Overlays for PinePhone (2 Overlay UI Channels)
static struct fb_overlayinfo_s overlayInfo[2] = {
  // First Overlay UI Channel:
  // Square 600 x 600 (4 bytes per ARGB 8888 pixel)
  {
    .fbmem     = &fb1,  // Start of frame buffer memory
    .fblen     = sizeof(fb1),  // Length of frame buffer memory in bytes
    .stride    = FB1_WIDTH * 4,  // Length of a line in bytes
    .overlay   = 0,     // Overlay number (First Overlay)
    .bpp       = 32,    // Bits per pixel (ARGB 8888)
    .blank     = 0,     // TODO: Blank or unblank
    .chromakey = 0,     // TODO: Chroma key argb8888 formatted
    .color     = 0,     // TODO: Color argb8888 formatted
    .transp    = { .transp = 0, .transp_mode = 0 },  // TODO: Transparency
    .sarea     = { .x = 52, .y = 52, .w = FB1_WIDTH, .h = FB1_HEIGHT },  // Selected area within the overlay
    .accl      = 0      // TODO: Supported hardware acceleration
  },
  // Second Overlay UI Channel:
  // Fullscreen 720 x 1440 (4 bytes per ARGB 8888 pixel)
  {
    .fbmem     = &fb2,  // Start of frame buffer memory
    .fblen     = sizeof(fb2),  // Length of frame buffer memory in bytes
    .stride    = PANEL_WIDTH * 4,  // Length of a line in bytes
    .overlay   = 1,     // Overlay number (Second Overlay)
    .bpp       = 32,    // Bits per pixel (ARGB 8888)
    .blank     = 0,     // TODO: Blank or unblank
    .chromakey = 0,     // TODO: Chroma key argb8888 formatted
    .color     = 0,     // TODO: Color argb8888 formatted
    .transp    = { .transp = 0, .transp_mode = 0 },  // TODO: Transparency
    .sarea     = { .x = 0, .y = 0, .w = PANEL_WIDTH, .h = PANEL_HEIGHT },  // Selected area within the overlay
    .accl      = 0      // TODO: Supported hardware acceleration
  },
};

(fb_overlayinfo_s is defined here)

What’s sarea?

.sarea = {
  .x = 52,
  .y = 52,
  .w = FB1_WIDTH,  // Width is 600
  .h = FB1_HEIGHT  // Height is 600
}

Remember that Framebuffer 1 is 600 pixels wide… But the PinePhone Screen is 720 pixels wide.

We use sarea to specify that Framebuffer 1 will be rendered 52 pixels from the left (X Offset), 52 pixels from the top (Y Offset).

(So it will be roughly centered horizontally: 52 pixels of margin on the left, 68 on the right)
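To see exactly where the square lands, we can compute the margins ourselves. Here is a small sketch (helper names are ours, for illustration only):

```c
#include <assert.h>

// Offset that would centre a Framebuffer of `fb_size` pixels
// on a panel of `panel_size` pixels
static int centred_offset(int panel_size, int fb_size) {
  assert(fb_size <= panel_size);
  return (panel_size - fb_size) / 2;
}

// Margin left over on the far side (right or bottom),
// given the offset on the near side (left or top)
static int far_margin(int panel_size, int fb_size, int offset) {
  assert(offset >= 0 && offset + fb_size <= panel_size);
  return panel_size - offset - fb_size;
}
```

For the 720-pixel width, an X Offset of 52 leaves 68 pixels on the right, while an exactly centred square would use an offset of 60. Vertically, a Y Offset of 52 puts the square near the top of the 1440-pixel screen.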

§4 Render Framebuffers

We’ve defined the NuttX Framebuffers… Let’s render them with the Display Engine!

We’ll walk through the steps…

  1. Initialise Display Engine

  2. Initialise UI Blender

  3. Initialise UI Channels

  4. Enable Display Engine
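Before we dive in, here is the overall call order as a sketch that runs on a regular computer. The stub_ functions are hypothetical stand-ins for the Display Engine Driver (they only record that they were called), and we return error codes instead of asserting, which is our own simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OK        0
#define CHANNELS  3
#define MAX_CALLS 8

// The 4 kinds of steps, recorded in the order they are called
enum step_e { STEP_DE_INIT, STEP_BLENDER_INIT, STEP_CHANNEL_INIT, STEP_ENABLE };

static enum step_e call_log[MAX_CALLS];
static int call_count = 0;
static uint32_t dummy_fb[CHANNELS][4];  // Tiny placeholder Framebuffers

// Remember that a step was called
static int record(enum step_e step) {
  assert(call_count < MAX_CALLS);
  call_log[call_count++] = step;
  return OK;
}

// Hypothetical stubs: the real driver functions program the hardware
static int stub_de_init(void)         { return record(STEP_DE_INIT); }
static int stub_de_blender_init(void) { return record(STEP_BLENDER_INIT); }
static int stub_de_ui_channel_init(int channel, const void *fbmem) {
  assert(channel >= 1 && channel <= CHANNELS && fbmem != NULL);
  return record(STEP_CHANNEL_INIT);
}
static int stub_de_enable(int channels) {
  assert(channels >= 1 && channels <= CHANNELS);
  return record(STEP_ENABLE);
}

// Run the 4 steps in order, stopping at the first failure.
// (The 160 millisecond wait after step 1 is omitted here.)
static int render_sequence(void) {
  int ret = stub_de_init();                 // 1. Initialise Display Engine
  if (ret != OK) { return ret; }
  ret = stub_de_blender_init();             // 2. Initialise UI Blender
  if (ret != OK) { return ret; }
  for (int ch = 1; ch <= CHANNELS; ch++) {  // 3. Initialise UI Channels
    ret = stub_de_ui_channel_init(ch, dummy_fb[ch - 1]);
    if (ret != OK) { return ret; }
  }
  return stub_de_enable(CHANNELS);          // 4. Enable Display Engine
}
```

Running render_sequence records 6 calls: one init, one blender init, three channel inits and one enable.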

§4.1 Initialise Display Engine

We begin by initialising the Display Engine…

// Init Display Engine
int ret = a64_de_init();
DEBUGASSERT(ret == OK);

// Wait 160 milliseconds
up_mdelay(160);

// Render Graphics with Display Engine
ret = pinephone_render_graphics();
DEBUGASSERT(ret == OK);

(Source)

a64_de_init comes from our NuttX Kernel Driver for Display Engine.

(How it works)

We call up_mdelay to wait 160 milliseconds. (Explained here)

Then we call pinephone_render_graphics…

§4.2 Initialise UI Blender

Inside pinephone_render_graphics, we initialise the UI Blender that will blend our UI Channels into a single image: test_a64_de.c

// Render graphics with A64 Display Engine
int pinephone_render_graphics(void) {

  // Init the UI Blender for A64 Display Engine
  int ret = a64_de_blender_init();
  DEBUGASSERT(ret == OK);

(a64_de_blender_init comes from our Display Engine Driver)

(How it works)

§4.3 Initialise UI Channels

Next we initialise UI Channel 1 with Framebuffer 0…

  // Init the Base UI Channel (Channel 1)
  ret = a64_de_ui_channel_init(
    1,  // UI Channel Number (1 for Base UI Channel)
    planeInfo.fbmem,    // Start of Frame Buffer Memory (address should be 32-bit)
    planeInfo.fblen,    // Length of Frame Buffer Memory in bytes
    planeInfo.xres_virtual,  // Horizontal resolution in pixel columns
    planeInfo.yres_virtual,  // Vertical resolution in pixel rows
    planeInfo.xoffset,  // Horizontal offset in pixel columns
    planeInfo.yoffset   // Vertical offset in pixel rows
  );
  DEBUGASSERT(ret == OK);

(a64_de_ui_channel_init comes from our Display Engine Driver)

(How it works)

Then we initialise UI Channels 2 and 3 (with Framebuffers 1 and 2)…

  // For each of the 2 Overlay UI Channels (Channels 2 and 3)...
  for (int i = 0; i < sizeof(overlayInfo) / sizeof(overlayInfo[0]); i++) {

    // Get the NuttX Framebuffer for the UI Channel
    const struct fb_overlayinfo_s *ov = &overlayInfo[i];

    // Init the UI Channel.
    // We pass NULL if the UI Channel should be disabled.
    ret = a64_de_ui_channel_init(
      i + 2,  // UI Channel Number (2 and 3 for Overlay UI Channels)
      (CHANNELS == 3) ? ov->fbmem : NULL,  // Start of Frame Buffer Memory (address should be 32-bit)
      ov->fblen,    // Length of Frame Buffer Memory in bytes
      ov->sarea.w,  // Horizontal resolution in pixel columns
      ov->sarea.h,  // Vertical resolution in pixel rows
      ov->sarea.x,  // Horizontal offset in pixel columns
      ov->sarea.y   // Vertical offset in pixel rows
    );
    DEBUGASSERT(ret == OK);
  }

(a64_de_ui_channel_init comes from our Display Engine Driver)

(How it works)

§4.4 Enable Display Engine

Finally we enable the Display Engine…

  // Set UI Blender Route, enable Blender Pipes
  // and apply the settings
  ret = a64_de_enable(CHANNELS);
  DEBUGASSERT(ret == OK);

(a64_de_enable comes from our Display Engine Driver)

(How it works)

The Display Engine starts pulling pixels from our Framebuffers over Direct Memory Access (DMA), and pushes the rendered image to PinePhone’s LCD Display.

But we won’t see anything until we populate our 3 Framebuffers with a Test Pattern…

  // Fill Framebuffer with Test Pattern.
  // Must be called after Display Engine is Enabled,
  // or missing rows will appear.
  test_pattern();
  return OK;
}

Let’s do a simple Test Pattern…

3 Framebuffers for 3 UI Channels

§5 Test Pattern

We fill our 3 Framebuffers with a simple Test Pattern (pic above)…

Note that Framebuffers 1 and 2 are Semi-Transparent, to show that the UI Blender works correctly.

This is how we populate our 3 Framebuffers: test_a64_de.c

// Fill the Framebuffers with a Test Pattern.
// Must be called after Display Engine is Enabled,
// or missing rows will appear.
static void test_pattern(void) {

  // Zero the Framebuffers
  memset(fb0, 0, sizeof(fb0));
  memset(fb1, 0, sizeof(fb1));
  memset(fb2, 0, sizeof(fb2));

Framebuffer 0 (UI Channel 1) will have Blue, Green and Red Blocks…

  // Init Framebuffer 0:
  // Fill with Blue, Green and Red
  const int fb0_len = sizeof(fb0) / sizeof(fb0[0]);

  // For every pixel...
  for (int i = 0; i < fb0_len; i++) {

    // Colours are in XRGB 8888 format
    if (i < fb0_len / 4) {
      // Blue for top quarter
      fb0[i] = 0x80000080;
    } else if (i < fb0_len / 2) {
      // Green for next quarter
      fb0[i] = 0x80008000;
    } else {
      // Red for lower half
      fb0[i] = 0x80800000;
    }

    // Fixes the missing rows, not sure why
    ARM64_DMB(); ARM64_DSB(); ARM64_ISB();
  }

(We’ll talk about ARM64_DMB later)

Framebuffer 1 (UI Channel 2) will be Semi-Transparent White…

  // Init Framebuffer 1:
  // Fill with Semi-Transparent White
  const int fb1_len = sizeof(fb1) / sizeof(fb1[0]);

  // For every pixel...
  for (int i = 0; i < fb1_len; i++) {

    // Set the pixel to Semi-Transparent White
    fb1[i] = 0x40FFFFFF;  // ARGB 8888 format

    // Fixes the missing rows, not sure why
    ARM64_DMB(); ARM64_DSB(); ARM64_ISB();
  }

And Framebuffer 2 (UI Channel 3) will have a Semi-Transparent Green Circle…

  // Init Framebuffer 2:
  // Fill with Semi-Transparent Green Circle
  const int fb2_len = sizeof(fb2) / sizeof(fb2[0]);

  // For every pixel row...
  for (int y = 0; y < PANEL_HEIGHT; y++) {

    // For every pixel column...
    for (int x = 0; x < PANEL_WIDTH; x++) {

      // Get pixel index
      const int p = (y * PANEL_WIDTH) + x;
      DEBUGASSERT(p < fb2_len);

      // Shift coordinates so that centre of screen is (0,0)
      const int half_width  = PANEL_WIDTH  / 2;
      const int half_height = PANEL_HEIGHT / 2;
      const int x_shift = x - half_width;
      const int y_shift = y - half_height;

      // If x^2 + y^2 < radius^2, set the pixel to Semi-Transparent Green
      if (x_shift*x_shift + y_shift*y_shift < half_width*half_width) {
        fb2[p] = 0x80008000;  // Semi-Transparent Green in ARGB 8888 Format
      } else {  // Otherwise set to Transparent Black
        fb2[p] = 0x00000000;  // Transparent Black in ARGB 8888 Format
      }

      // Fixes the missing rows, not sure why
      ARM64_DMB(); ARM64_DSB(); ARM64_ISB();
    }
  }
}
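If we'd like to check the Test Pattern without a PinePhone, the colour rules can be pulled out into pure functions. This is just a sketch: fb0_colour and fb2_colour are names we made up, but the thresholds and colour values are the same as in the loops above.

```c
#include <assert.h>
#include <stdint.h>

#define PANEL_WIDTH  720
#define PANEL_HEIGHT 1440

// Colour of pixel `i` in Framebuffer 0, which has `len` pixels
static uint32_t fb0_colour(int i, int len) {
  assert(i >= 0 && i < len);
  if (i < len / 4) { return 0x80000080; }  // Blue for top quarter
  if (i < len / 2) { return 0x80008000; }  // Green for next quarter
  return 0x80800000;                       // Red for lower half
}

// Colour of pixel (x, y) in Framebuffer 2
static uint32_t fb2_colour(int x, int y) {
  assert(x >= 0 && x < PANEL_WIDTH && y >= 0 && y < PANEL_HEIGHT);

  // Shift coordinates so that centre of screen is (0,0)
  const int half_width = PANEL_WIDTH / 2;
  const int x_shift = x - half_width;
  const int y_shift = y - PANEL_HEIGHT / 2;

  // Inside the circle: Semi-Transparent Green.
  // Outside the circle: Transparent Black.
  return (x_shift * x_shift + y_shift * y_shift < half_width * half_width)
    ? 0x80008000
    : 0x00000000;
}
```

The centre of the screen (360, 720) falls inside the circle, and the corner (0, 0) falls outside.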

We’re done with our Test Pattern! Let’s talk about ARM64_DMB…

Missing Rows

Why the Arm Barriers?

// Fixes the missing rows, not sure why
ARM64_DMB(); ARM64_DSB(); ARM64_ISB();

These are Arm64 Barrier Instructions that enforce the ordering of memory accesses and instruction execution, so that earlier writes complete before later instructions run. (See this)

If we omit these Barrier Instructions, the rendered image will have missing rows. (Pic above)

We’re not sure why this happens. Maybe it’s the CPU Cache? DMA? Framebuffer Alignment? Memory Corruption?

(Doesn’t happen in the original Zig version)
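One experiment we might try: issue a single barrier after the whole fill loop, instead of three barriers per pixel. The sketch below uses the portable C11 atomic_thread_fence, which Arm64 compilers typically lower to a DMB instruction. This is untested on PinePhone. Also, a fence only orders memory accesses; it doesn't clean the CPU Cache, so it might not be enough if the missing rows are caused by cached writes that the DMA can't see.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

// Fill `len` pixels with `colour`, then issue one full memory fence.
// `volatile` stops the compiler from dropping or merging the stores.
static void fill_then_fence(volatile uint32_t *fb, size_t len,
                            uint32_t colour) {
  assert(fb != NULL);

  // For every pixel...
  for (size_t i = 0; i < len; i++) {
    fb[i] = colour;
  }

  // Order all the stores above before any memory access that follows
  atomic_thread_fence(memory_order_seq_cst);
}
```

If the missing rows come back with this version, that would hint that the per-pixel barriers are doing something other than ordering, for example slowing down the writes.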

Why do we fill the Framebuffers after enabling the Display Engine?

Since the Display Engine reads our Framebuffers over DMA (Direct Memory Access), in principle we should be able to fill the Framebuffers (with our Test Pattern) before enabling the Display Engine…

But this creates mysterious missing rows (pic above). So we fill the Framebuffers after enabling the Display Engine.

Let’s run our Test Code…

(We’re still missing a row at the bottom of the circle)

Complete Display Driver for PinePhone

§6 Complete Display Driver

Are we done yet with our Display Driver for PinePhone?

Not quite! PinePhone needs a super complex Display Driver that will handle 11 steps (pic above)…

We’ve implemented most of this in the NuttX Kernel, and we’re now converting the remaining bits from Zig to C.

So how do we test this hodgepodge of Zig and C?

We created a Zig Test Program that glues together the Zig and C bits for testing.

Here are all 11 steps of our upcoming Display Driver, hodgepodged with Zig: render.zig

// Zig Test Program that renders 3 UI Channels in Zig and C...
// Turn on PinePhone Display Backlight (in Zig)
backlight.backlight_enable(90);  // 90% brightness

// Init A64 Timing Controller TCON0 (in C)
// PANEL_WIDTH is 720, PANEL_HEIGHT is 1440
_ = a64_tcon0_init(PANEL_WIDTH, PANEL_HEIGHT);

// Init PinePhone Power Management Integrated Circuit (in C)
_ = pinephone_pmic_init();

// Wait 15 milliseconds for power supply and power-on init
up_mdelay(15);

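In the code above, we turn on the Display Backlight, initialise the Timing Controller TCON0 and the Power Management Integrated Circuit (PMIC), then wait 15 milliseconds.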

// Enable A64 MIPI Display Serial Interface (in C)
_ = a64_mipi_dsi_enable();

// Enable A64 MIPI Display Physical Layer (in C)
_ = a64_mipi_dphy_enable();

Here we enable the A64 MIPI Display Serial Interface and MIPI Display Physical Layer.

(a64_mipi_dsi_enable comes from our NuttX Driver for MIPI Display Serial Interface)

(a64_mipi_dphy_enable too)

// Reset LCD Panel (in Zig)
panel.panel_reset();

// Wait 15 milliseconds for LCD Panel
up_mdelay(15);

// Init LCD Panel (in C)
_ = pinephone_panel_init();

Next we reset the LCD Panel, wait 15 milliseconds and send the Initialisation Commands to the LCD Controller.

(panel_reset is in Zig)

(pinephone_panel_init will be added to NuttX Kernel)

(Which calls a64_mipi_dsi_write from our NuttX Driver for MIPI Display Serial Interface)

// Start A64 MIPI Display Serial Interface (in C)
_ = a64_mipi_dsi_start();

We start A64’s MIPI Display Serial Interface.

(a64_mipi_dsi_start comes from our NuttX Driver for MIPI Display Serial Interface)

// Init A64 Display Engine (in C)
_ = a64_de_init();

// Wait 160 milliseconds for Display Engine
up_mdelay(160);

We initialise the Display Engine.

(We’ve seen a64_de_init earlier)

// Render Graphics with Display Engine (in C)
_ = pinephone_render_graphics();

Finally we render the framebuffers with the Display Engine.

(We’ve seen pinephone_render_graphics earlier)

This is how we compile our Zig Test Program…

We boot NuttX on PinePhone (with a microSD Card) and run our Zig Test Program…

NuttShell (NSH) NuttX-11.0.0-pinephone

nsh> uname -a
NuttX 11.0.0-pinephone 64a54d2-dirty
Dec 21 2022 21:48:25 arm64 pinephone

nsh> hello 0

(Source)

PinePhone renders our Test Pattern on the LCD Display (pic below). Yep our (work-in-progress) PinePhone Display Driver has been tested successfully!

Here’s the Debug Log from our Zig Test Program…

Won’t the Debug Logging create extra latency that might affect the driver?

That’s why we also test with Debug Logging disabled.

Let’s talk about the upcoming drivers that we’re adding to NuttX Kernel…

Rendering graphics on PinePhone with Apache NuttX RTOS

§7 Upcoming Drivers

Which bits of our NuttX Display Driver are still in Zig?

These parts are still in Zig, pending conversion to C

These have just been converted from Zig to C, now adding to NuttX Kernel…

Where will the new drivers live inside the NuttX Kernel?

The drivers for Display Backlight, LCD Panel and PMIC will go into the new PinePhone LCD Driver.

Which will follow the design of the STM32F7 LCD Driver in NuttX…

  1. At startup, stm32_bringup calls fb_register

    (stm32_bringup.c)

  2. To initialise the Framebuffer, fb_register calls up_fbinitialize

    (fb.c)

  3. To initialise the Display Driver, up_fbinitialize calls stm32_ltdcinitialize

    (stm32_lcd.c)

  4. Inside the Display Driver, stm32_ltdcinitialize creates the NuttX Framebuffer

    (stm32_ltdc.c)

  5. NuttX Framebuffer is here: stm32_ltdc.c

Our new PinePhone LCD Driver shall execute all 11 steps as described earlier…

Probably inside our new implementation of up_fbinitialize. Work-in-progress…
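Here is a rough sketch of the shape this might take. NuttX Framebuffer Drivers expose up_fbinitialize and up_fbgetvplane, and return a table of callbacks such as getvideoinfo. To keep the sketch runnable on a regular computer, we use simplified stand-in structs and sketch_ function names; the real definitions live in nuttx/video/fb.h and have more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OK           0
#define PANEL_WIDTH  720
#define PANEL_HEIGHT 1440

// Simplified stand-in for fb_videoinfo_s (assumption: fewer fields)
struct videoinfo_s {
  uint16_t xres;     // Horizontal resolution in pixel columns
  uint16_t yres;     // Vertical resolution in pixel rows
  uint8_t  nplanes;  // Number of color planes supported
};

// Simplified stand-in for fb_vtable_s (assumption: one callback only)
struct vtable_s {
  int (*getvideoinfo)(struct vtable_s *vtable, struct videoinfo_s *vinfo);
};

// Callback that reports the PinePhone display resolution
static int pinephone_getvideoinfo(struct vtable_s *vtable,
                                  struct videoinfo_s *vinfo) {
  assert(vtable != NULL && vinfo != NULL);
  vinfo->xres    = PANEL_WIDTH;
  vinfo->yres    = PANEL_HEIGHT;
  vinfo->nplanes = 1;
  return OK;
}

static struct vtable_s g_vtable = { .getvideoinfo = pinephone_getvideoinfo };
static int g_initialised = 0;

// Stand-in for up_fbinitialize: the 11 steps would run here
static int sketch_fbinitialize(int display) {
  (void)display;
  g_initialised = 1;
  return OK;
}

// Stand-in for up_fbgetvplane: returns the callbacks for Display 0, Plane 0
static struct vtable_s *sketch_fbgetvplane(int display, int vplane) {
  if (!g_initialised || display != 0 || vplane != 0) { return NULL; }
  return &g_vtable;
}
```

The caller first initialises the display, then fetches the callback table and asks it for the video info.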

Zig Test Program running on Apache NuttX RTOS for PinePhone

§8 What’s Next

Very soon the official NuttX Kernel will be rendering graphics on PinePhone’s LCD Display… Stay tuned for updates!

Please check out the other articles on NuttX for PinePhone…

Many Thanks to my GitHub Sponsors for supporting my work! This article wouldn’t have been possible without your support.

Got a question, comment or suggestion? Create an Issue or submit a Pull Request here…

lupyuen.github.io/src/de3.md

§9 Appendix: Calibrate NuttX Delay

Can we call sleep() or usleep() in our NuttX Display Driver?

Sorry Nope! Most of our Display Driver code runs in the NuttX Kernel at startup.

Calling sleep() or usleep() at that point will crash the kernel, because the kernel is still starting up!

So how do we wait a while in our NuttX Display Driver?

We call up_mdelay() like so…

// Wait 160 milliseconds
up_mdelay(160);

How does up_mdelay() work?

It’s a very simple loop: arm64_assert.c

// Wait for the specified milliseconds
void up_mdelay(unsigned int milliseconds) {
  volatile unsigned int i;
  volatile unsigned int j;

  for (i = 0; i < milliseconds; i++) {
    for (j = 0; j < CONFIG_BOARD_LOOPSPERMSEC; j++) {
    }
  }
}

Huh? Won’t the compiler optimise the code and remove the loop?

It won’t, because we declared the loop counters as volatile. So the compiler must perform every read and write of i and j.

The NuttX Disassembly shows that the loop is still intact: nuttx.S

arm64_assert.c:69 (discriminator 2)
  for (i = 0; i < milliseconds; i++)
    40081830:	b9400be1 	ldr	w1, [sp, #8]
    40081834:	11000421 	add	w1, w1, #0x1
    40081838:	b9000be1 	str	w1, [sp, #8]
    4008183c:	17fffff4 	b	4008180c <up_mdelay+0x10>
arm64_assert.c:71 (discriminator 3)
      for (j = 0; j < CONFIG_BOARD_LOOPSPERMSEC; j++)
    40081840:	b9400fe1 	ldr	w1, [sp, #12]
    40081844:	11000421 	add	w1, w1, #0x1
    40081848:	b9000fe1 	str	w1, [sp, #12]
    4008184c:	17fffff6 	b	40081824 <up_mdelay+0x28>

What’s CONFIG_BOARD_LOOPSPERMSEC?

That’s a magic constant computed by the NuttX Calibration Tool For udelay.

To install the calibration tool…

make menuconfig

Then select…

Application Configuration > Examples > Calibration Tool For udelay 

And rebuild NuttX.

Boot NuttX on PinePhone and run calib_udelay

nsh> calib_udelay

Calibrating timer for main calibration...
Performing main calibration for udelay.This will take approx. 17.280 seconds.
Calibration slope for udelay:
  Y = m*X + b, where
    X is loop iterations,
    Y is time in nanoseconds,
    b is base overhead,
    m is nanoseconds per loop iteration.

  m = 8.58195489 nsec/iter
  b = -347067.66917297 nsec

  Correlation coefficient, R² = 1.0000

Without overhead, 0.11652356 iterations per nanosecond and 116523.57 iterations per millisecond.

Recommended setting for CONFIG_BOARD_LOOPSPERMSEC:
   CONFIG_BOARD_LOOPSPERMSEC=116524

(See the Complete Log)

We update the NuttX Board Configuration for PinePhone with the computed value: pinephone/configs/nsh/defconfig

CONFIG_BOARD_LOOPSPERMSEC=116524
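We can double-check the recommended value with a little arithmetic: one millisecond is 1,000,000 nanoseconds, divided by the measured nanoseconds per loop iteration. The sketch below assumes the tool rounds to the nearest integer, which matches the log above; the function names are ours.

```c
#include <assert.h>

// Convert measured nanoseconds per loop iteration into
// loop iterations per millisecond, rounded to the nearest integer
static unsigned long loops_per_msec(double nsec_per_iter) {
  assert(nsec_per_iter > 0.0);
  const double nsec_per_msec = 1000000.0;
  return (unsigned long)(nsec_per_msec / nsec_per_iter + 0.5);
}

// Total inner-loop iterations that up_mdelay would spin through
static unsigned long long delay_iterations(unsigned int milliseconds,
                                           unsigned long loops) {
  return (unsigned long long)milliseconds * loops;
}
```

With m = 8.58195489 nanoseconds per iteration, we get 116,524 loops per millisecond. So a 160-millisecond wait spins through roughly 18.6 million iterations of the inner loop.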

(PinePhone is probably the fastest NuttX Board ever!)

What if our driver needs to wait a while AFTER the NuttX Kernel has been started?

Call nxsig_usleep() instead.

It suspends the current thread, instead of doing a busy-wait loop.