element packing

what is packing and why

packing is the form of storing multiple short-sized values as one long-sized value.

element packing is well mapped with the underlying simd register, which usually use one very wide register to store different types of values.

C elemsize elempack
double 8 1
float 4 1
int 4 1
short 2 1
signed char 1 1
arm neon elemsize elempack
float64x2_t 16 2
float32x4_t 16 4
int32x4_t 16 4
float16x4_t 8 4
int8x8_t 8 8

Though the real count of values doubles when elempack is two, the wide-sized value is still treated as one value in the view of Mat structure. For example, we want to store 40 float values in Mat object, if elempack 1 is used, Mat width is then 40, while 10 if elempack 4 is used.

dims w h c cstep elemsize elempack
1 40 1 1 40 4 1
1 10 1 1 10 16 4

packing style convention

In practice, elempack 1, 4, 8 are the most common cases. It is possible to use any other packing style in theory.

The following table show the packing axis used in ncnn for different dimension.

dims packing axis shape before packing shape after packing
1 w w w/elempack
2 h w, h w, h/elempack
3 c w, h, c w, h, c/elempack

If the packing axis dim is not evenly divisible by elempack, zero padding may be used.

outw = (w + elempack - 1) / elempack;

The following snippet shows the memory layout after elempack=4 on 3-dim Mat

// w=2 h=3 c=4 elempack=1
0 1
2 3
4 5

6 7
8 9
10 11

12 13
14 15
16 17

18 19
20 21
22 23

// w=2 h=3 c=1 elempack=4
(0,6,12,18) (1,7,13,19)
(2,8,14,20) (3,9,15,21)
(4,10,16,22) (5,11,17,23)

how to convert elempack

There is a convenient wrapper function provided

// convert to elempack 4 if packing axis dim is evenly divisible by elempack
// return the identity Mat otherwise
ncnn::Mat a;
ncnn::Mat a_packed;
ncnn::convert_packing(a, a_packed, 4);
if (a_packed.elempack == 4)
{
    // check if packing is successful
}

// convert to packing 1, aka unpacking, shall be always successful
ncnn::Mat b;
ncnn::Mat b_unpacked;
ncnn::convert_packing(b, b_unpacked, 1);

handle general interleaved data

Here is an example of using convert packing to convert RGB interleaved data to planar

NOTE: The following code is just presented to explain what packing is and the conversion process. Do not use it in production due to its poor performance. Do use ncnn::Mat::from_pixels()

// rgb_interleaved_u8 is RGB RGB RGB ...
// rgb_interleaved_u8.w = w;
// rgb_interleaved_u8.h = h;
// rgb_interleaved_u8.c = 1;
// rgb_interleaved_u8.elemsize = 3;
// rgb_interleaved_u8.elempack = 3;

ncnn::Mat rgb_interleaved_u8(w, h, 1, 3, 3);
ncnn::Mat rgb_planar_u8;

ncnn::convert_packing(rgb_interleaved_u8, rgb_planar_u8, 1);

// rgb_planar_u8 is now RRR ... GGG ... BBB ...
// rgb_planar_u8.w = w;
// rgb_planar_u8.h = h;
// rgb_planar_u8.c = 3;
// rgb_planar_u8.elemsize = 1;
// rgb_planar_u8.elempack = 1;