CedarX/Reverse Engineering

= Initial information =
On June 15 2012 Iain Bullard started reverse engineering the proprietary libraries.

Unfortunately, the kernel driver is only a proxy that maps the hardware registers into userspace using mmap; all of the actual logic is contained in libvecore.so.


 * open_cdxalloc, a free reimplementation of Allwinner's libcedarxalloc.a.
 * CedarXWrapper, an LD_PRELOADed wrapper that helps in understanding the proprietary libraries.
 * CedarXPlayerTest, a basic player to use when testing.
 * ReCedro (on gitorious) has tools similar to those from IanB above, but takes a different angle and works really well.

= Object file observations =
While Android and Linux are different beasts from a userspace point of view, the code may have been written in such a way that it compiles for both targets. That would mean the object files could be similar enough.

From android
The android-linux libvecore.a (md5sum 1c347a9ad3072ce3288bd6dba625b2a4) static lib contains the following files: android functions

From linux-armhf
The linux-armhf libvecore.so (md5sum a026d27307e5204db191878651cc6394) shared object contains the following functions: linux-armhf functions

Likely both
To generate this list, libvecore.a was used, as it separates things quite nicely and appears to be identical to libvecore.so. That said, this obviously does not have to be true. There is one major difference: libvecore.so also has functions from crtstuff.c linked in. Although nothing found during the investigation indicated that crtstuff.c from gcc was used directly, it is still likely that some functions were taken from there.

sunxi-bsp/cedarx-libs/libcedarv/android/libvecore $ arm-pc-linux-gnueabi-ar t libvecore.a | sort

The rest of the bits are all open source, see the linux-sunxi github. The exception is libcedarxalloc.a, but as mentioned above, we have open_cdxalloc.

Function references
So far, the following references can easily be observed with readelf -W -s. This is just an indication of some functions and is by far not complete, as a full list would take way too long and is not really needed.

FFmpeg huffman tree builder: ff_huff_build_tree http://ffmpeg.org/doxygen/trunk/huffman_8c.html

libjpeg: get_soi http://sourceforge.net/p/libjpeg-turbo/code/HEAD/tree/trunk/jdmarker.c

libvp62: VP62_InitCoeffScaleFactors http://en.verysource.com/code/5378534_1/libvp62.h.html

H264/AVC Reference encoder/decoder: remove_frame_from_dpb http://iphome.hhi.de/suehring/tml/doc/lenc/html/mbuffer_8c.html#901bd781eb9aef8b79e98b8e10fbc2aa

VC1 Reference decoder: vc1_eResult vc1DECPIC_UnpackInterlaceMVModeParams http://wiki.multimedia.cx/index.php?title=Understanding_VC-1#vc1DECPIC_UnpackInterlaceMVModeParams

MPEG2: Some function renaming seems to have happened; the one that Google did find was: ParseQuantMatrixExtension http://sources.team-mediaportal.com/svn/public/tags/Release%201.0.2/DirectShowFilters/TsReader/source/MpegPesParser.cpp

= Driver IOCTL guide =
The blob mostly uses MMIO access, but CedarX must be gated on and the supporting PLLs must be configured beforehand.

INIT SEQUENCE
This sequence is required to bring the CedarX hardware registers into a workable state.

IOCTL_ENABLE_VE -> IOCTL_SET_VE_FREQ -> IOCTL_ENGINE_REQ -> IOCTL_RESET_VE

Then the state of the engine should be checked by reading the "VE Revision register", which contains the chip version: 0x1623 for the A10, 0x1625 for the A13.
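The init sequence above can be sketched in C as follows. This is only a minimal sketch under stated assumptions: the helper names (`cedar_init_sequence`, `cedar_ioctl_fn`, `cedar_dry_run_ioctl`) are hypothetical, the ioctl arguments are not documented here and are passed as 0, and the meaning of the IOCTL_SET_VE_FREQ argument is a guess. The driver call is passed in as a function pointer so that the ordering can be tried without hardware.

```c
#include <stddef.h>

/* Request numbers as listed in this guide. */
enum {
    IOCTL_RESET_VE    = 0x103,
    IOCTL_ENABLE_VE   = 0x104,
    IOCTL_SET_VE_FREQ = 0x106,
    IOCTL_ENGINE_REQ  = 0x206
};

/* Function that actually talks to the driver. On a real system this would
 * be a thin wrapper around ioctl(2) on the cedar device node. */
typedef int (*cedar_ioctl_fn)(int fd, unsigned long request, unsigned long arg);

/* Hypothetical helper: issues the four init steps in order and stops at the
 * first failure. Returns 0 on success, or the 1-based index of the step
 * that failed. */
int cedar_init_sequence(int fd, cedar_ioctl_fn do_ioctl, unsigned long ve_freq_arg)
{
    const struct {
        unsigned long request;
        unsigned long arg;
    } steps[] = {
        { IOCTL_ENABLE_VE,   0 },
        { IOCTL_SET_VE_FREQ, ve_freq_arg }, /* argument meaning is an assumption */
        { IOCTL_ENGINE_REQ,  0 },
        { IOCTL_RESET_VE,    0 },
    };
    size_t i;

    for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++) {
        if (do_ioctl(fd, steps[i].request, steps[i].arg) < 0)
            return (int)(i + 1);
    }
    return 0;
}

/* Dry-run backend: records the requests instead of touching a device. */
static unsigned long cedar_dry_run_log[8];
static size_t cedar_dry_run_count;

int cedar_dry_run_ioctl(int fd, unsigned long request, unsigned long arg)
{
    (void)fd;
    (void)arg;
    if (cedar_dry_run_count < sizeof(cedar_dry_run_log) / sizeof(cedar_dry_run_log[0]))
        cedar_dry_run_log[cedar_dry_run_count++] = request;
    return 0;
}
```

Calling `cedar_init_sequence(-1, cedar_dry_run_ioctl, 0)` fills the log with the four requests in the documented order; on real hardware the second argument would be replaced by a wrapper around ioctl(2).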

CORE IOCTL
IOCTL_GET_ENV_INFO = 0x101

Returns some configuration info, such as the reserved memory address for Cedar.

IOCTL_WAIT_VE = 0x102

IOCTL_RESET_VE = 0x103

Resets the CedarX engine.

IOCTL_ENABLE_VE = 0x104

Starts the base clocks for CedarX.

IOCTL_DISABLE_VE = 0x105

Disables the base clocks for CedarX.

IOCTL_SET_VE_FREQ = 0x106

Configures the CedarX PLLs.

AVS2 IOCTL
IOCTL_CONFIG_AVS2 = 0x200

IOCTL_GETVALUE_AVS2 = 0x201

IOCTL_PAUSE_AVS2 = 0x202

IOCTL_START_AVS2 = 0x203

IOCTL_RESET_AVS2 = 0x204

IOCTL_ADJUST_AVS2 = 0x205

ENGINE IOCTL
IOCTL_ENGINE_REQ = 0x206

Counts references to the Cedar hardware and, more importantly, starts some clocks that are required for Cedar init.

IOCTL_ENGINE_REL = 0x207

Decrements the reference count.

IOCTL_ENGINE_CHECK_DELAY = 0x208

IOCTL_GET_IC_VER = 0x209

IOCTL_ADJUST_AVS2_ABS = 0x20a

IOCTL_FLUSH_CACHE = 0x20b

Invalidates the CPU cache for Cedar's internal DMA.
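When reading ioctl traces (for example from an LD_PRELOADed wrapper), it helps to translate the raw request numbers back into the names listed above. The following is a small sketch of such a lookup; the table contents come from this guide, while the struct and function names (`cedar_ioctl_name`, `cedar_ioctl_to_name`) are made up for illustration.

```c
#include <stddef.h>

struct cedar_ioctl_name {
    unsigned int request;
    const char *name;
};

/* Request numbers and names exactly as listed in this guide. */
static const struct cedar_ioctl_name cedar_ioctl_names[] = {
    { 0x101, "IOCTL_GET_ENV_INFO" },
    { 0x102, "IOCTL_WAIT_VE" },
    { 0x103, "IOCTL_RESET_VE" },
    { 0x104, "IOCTL_ENABLE_VE" },
    { 0x105, "IOCTL_DISABLE_VE" },
    { 0x106, "IOCTL_SET_VE_FREQ" },
    { 0x200, "IOCTL_CONFIG_AVS2" },
    { 0x201, "IOCTL_GETVALUE_AVS2" },
    { 0x202, "IOCTL_PAUSE_AVS2" },
    { 0x203, "IOCTL_START_AVS2" },
    { 0x204, "IOCTL_RESET_AVS2" },
    { 0x205, "IOCTL_ADJUST_AVS2" },
    { 0x206, "IOCTL_ENGINE_REQ" },
    { 0x207, "IOCTL_ENGINE_REL" },
    { 0x208, "IOCTL_ENGINE_CHECK_DELAY" },
    { 0x209, "IOCTL_GET_IC_VER" },
    { 0x20a, "IOCTL_ADJUST_AVS2_ABS" },
    { 0x20b, "IOCTL_FLUSH_CACHE" },
};

/* Returns the symbolic name for a request number, or NULL if the number is
 * not one of the requests documented in this guide. */
const char *cedar_ioctl_to_name(unsigned int request)
{
    size_t i;

    for (i = 0; i < sizeof(cedar_ioctl_names) / sizeof(cedar_ioctl_names[0]); i++) {
        if (cedar_ioctl_names[i].request == request)
            return cedar_ioctl_names[i].name;
    }
    return NULL;
}
```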

= HW Registers guide =
REGS_BASE = 0x01C00000 A10 IO register base address

MACC_REGS_BASE = (REGS_BASE + 0x0E000) media acceleration (VE) IO space (4 KB)

Reset/Clock register
MACC_REGS_BASE + 0x00

In some cases the reset logic differs between Cedar revisions.

VE Ready register
MACC_REGS_BASE + 0x1c

Reads 0 when ready.

Reads 0x3f00 when not ready.

VE Revision register
MACC_REGS_BASE + 0xF2

Can be read after the IOCTL init sequence.

Contains the SoC ID, which serves as the VE version.

Possible cases:

0x1625 - a13

0x1623 - a10

0x1620 - ???

0x1619 - ???
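A check of the revision value might look like the sketch below. It only encodes the two values identified above; the function name `ve_revision_to_soc` is hypothetical, and the two values marked "???" are deliberately reported as unknown rather than guessed.

```c
#include <stdint.h>

/* Maps the 16-bit value read from the VE Revision register to an SoC name.
 * Only the values identified in this guide are known; 0x1620 and 0x1619
 * have been seen but not attributed, so they fall through to "unknown". */
const char *ve_revision_to_soc(uint16_t revision)
{
    switch (revision) {
    case 0x1625:
        return "A13";
    case 0x1623:
        return "A10";
    default:
        return "unknown";
    }
}
```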

MPEG Engine
Base address

MPEG_REGS_BASE = (MACC_REGS_BASE + 0x100)

Media File Header Register(mphr)
MPEG_REGS_BASE + 0x00

Video Object Plane Header Register (vophr)
MPEG_REGS_BASE + 0x04

Video file size(fsize)
MPEG_REGS_BASE + 0x08

Contains the video frame Width:Height in word:word format.

Frame Size Register
MPEG_REGS_BASE + 0x0c

Contains the video frame size. For example, for a 320x240 media file this register must be set to 0x014000f0, which means Width (bits 31-16):Height (bits 15-0) format.

0x0140 = 320, 0x00f0 = 240
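The packing in the 320x240 example can be written out as a few lines of C. This is a sketch that assumes the width sits in the upper 16 bits and the height in the lower 16 bits, which is what the example value 0x014000f0 implies; the function names are made up for illustration.

```c
#include <stdint.h>

/* Packs width into bits 31-16 and height into bits 15-0. */
uint32_t mpeg_pack_frame_size(uint16_t width, uint16_t height)
{
    return ((uint32_t)width << 16) | (uint32_t)height;
}

/* Extracts the width field (upper 16 bits). */
uint16_t mpeg_frame_width(uint32_t reg)
{
    return (uint16_t)(reg >> 16);
}

/* Extracts the height field (lower 16 bits). */
uint16_t mpeg_frame_height(uint32_t reg)
{
    return (uint16_t)(reg & 0xffffu);
}
```

With these helpers, `mpeg_pack_frame_size(320, 240)` yields 0x014000f0, matching the value given above.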

Macro Block Address Register(mbaddr)
MPEG_REGS_BASE + 0x10

Control register(vectrl)
MPEG_REGS_BASE + 0x14

Contains the IRQ enable bit.

VE Trigger Register(vetrigger)(??)
MPEG_REGS_BASE + 0x18

Status register (vestat)
MPEG_REGS_BASE + 0x1c

Busy statuses:

bit 14 (not sure) - mc free (may be Macrocell)

bit 13 (not sure) - Busy status

bit 12 (not sure) - idct in empty (Inverse Discrete Cosine Transform)

bit 11 (not sure) - iqis in empty (Inverse Quantization and Inverse Scan)
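The status bits listed above translate into the following masks. Since the bit positions themselves are marked as uncertain, this is a sketch of how they would be tested if the guesses are right, and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Bit positions as guessed in this guide; all four are unconfirmed. */
#define VESTAT_IQIS_IN_EMPTY (1u << 11)
#define VESTAT_IDCT_IN_EMPTY (1u << 12)
#define VESTAT_BUSY          (1u << 13)
#define VESTAT_MC_FREE       (1u << 14)

/* Returns 1 if the (assumed) busy bit is set in a vestat value, else 0. */
int vestat_is_busy(uint32_t vestat)
{
    return (vestat & VESTAT_BUSY) != 0;
}
```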

VE ??(trbtrdfld)(??)
MPEG_REGS_BASE + 0x20

VE ??(trbtrdfrm)(??)
MPEG_REGS_BASE + 0x24

Variable-Length Decoding(VLD) Address Register (vldbaddr)
MPEG_REGS_BASE + 0x28

Variable-Length Decoding(VLD) Offset Register(vldboffset)
MPEG_REGS_BASE + 0x2c

Variable-Length Decoding(VLD) Length Register(vldlen)
MPEG_REGS_BASE + 0x30

Video Buffer Verifier(VBV) Size Register(vbvsize)
MPEG_REGS_BASE + 0x34

Video Buffer Verifier(VBV) Offset Register(vldoffset) or ??
Has a SECOND usage

MPEG_REGS_BASE + 0x38

VLD length or ??(vldlen)(dcacaddr)(??)
Has a SECOND usage

MPEG_REGS_BASE + 0x3c

Block Address Register(blkaddr)(??)
MPEG_REGS_BASE + 0x40

?? Register(??)(ncfaddr)
MPEG_REGS_BASE + 0x44

Reconstruct Buffer Luma Address Register (rec_yframaddr)
YCbCr color space Y component buffer

MPEG_REGS_BASE + 0x48

Reconstruct Buffer Chroma Address Register(rec_cframaddr)
C component of YCbCr

MPEG_REGS_BASE + 0x4c

FOR Buffer Luma Address Register(for_yframaddr)
Y component of YCbCr

MPEG_REGS_BASE + 0x50

FOR Buffer Chroma Address Register(for_cframaddr)
C component of YCbCr

MPEG_REGS_BASE + 0x54

BACK Buffer Luma Address Register(back_yframaddr)
MPEG_REGS_BASE + 0x58

BACK Buffer Chroma Address Register(back_cframaddr)
MPEG_REGS_BASE + 0x5c

?? Register(??)(socx)
MPEG_REGS_BASE + 0x60

?? Register(??)(socy)
MPEG_REGS_BASE + 0x64

?? Register(??)(sol)
MPEG_REGS_BASE + 0x68

?? Register(??)(sdlx)
MPEG_REGS_BASE + 0x6c

?? Register(??)(sdly)
MPEG_REGS_BASE + 0x70

?? Register(??)(spriteshifter)
MPEG_REGS_BASE + 0x74

?? Register(??)(sdcx)
MPEG_REGS_BASE + 0x78

?? Register(??)(sdcy)
MPEG_REGS_BASE + 0x7c

Inverse Quantization Minimum Level Register(iqminput)
IQ minimum settings (video compression level)

MPEG_REGS_BASE + 0x80

Inverse Quantization Level Register(qcinput)
IQ settings (compression level)

MPEG_REGS_BASE + 0x84

MS-MPEG header(msmpeg4_pichdr)(??)
MPEG_REGS_BASE + 0x88

VP6 header(vp6_pichdr)(??)
MPEG_REGS_BASE + 0x8c

Inverse Quantization and Inverse Discrete Cosine Transform Input Register(iqidctinput)(??)
MPEG_REGS_BASE + 0x90

Macro Block Height Register(mbah)(??)
Looks like the macro cell size register.

MPEG_REGS_BASE + 0x94

Macro Block Vector 1(mbv1)(??)
MPEG_REGS_BASE + 0x98

Macro Block Vector 2(mbv2)(??)
MPEG_REGS_BASE + 0x9c

Macro Block Vector 3(mbv3)(??)
MPEG_REGS_BASE + 0xa0

Macro Block Vector 4(mbv4)(??)
MPEG_REGS_BASE + 0xa4

Macro Block Vector 5(mbv5)(??)
MPEG_REGS_BASE + 0xa8

Macro Block Vector 6(mbv6)(??)
MPEG_REGS_BASE + 0xac

Macro Block Vector 7(mbv7)(??)
MPEG_REGS_BASE + 0xb0

Macro Block Vector 8(mbv8)(??)
MPEG_REGS_BASE + 0xb4

?? (jpeg_sdctl)
MPEG_REGS_BASE + 0xb8

?? (jpeg_mcu)
MPEG_REGS_BASE + 0xbc

?? (jpeg_resint)
MPEG_REGS_BASE + 0xc0

Error Flag Register(errflag)
MPEG_REGS_BASE + 0xc4

?? (crtmb)
MPEG_REGS_BASE + 0xc8

Rotate-Scale Buffer Luma Address Register(rotf_yfrmaddr)
MPEG_REGS_BASE + 0xcc

Rotate-Scale Buffer Chroma Address Register(rotf_cfrmaddr)
MPEG_REGS_BASE + 0xd0

Extra Functions Control Register(extra_func_ctrl)
MPEG_REGS_BASE + 0xd4

JPEG MCU (Minimum Coded Unit) Start Address Register (Jpg_start_mcuco)
MPEG_REGS_BASE + 0xd8

JPEG MCU (Minimum Coded Unit) End Address Register (Jpg_end_mcuco)
MPEG_REGS_BASE + 0xdc

H264 Engine
Base address

H264_REGS_BASE = (MACC_REGS_BASE + 0x200)

Interrupt enable reg = H264_REGS_BASE + 0x20

Bitmask 0x0111 enables/disables IRQs from the H264 decoder.

VC1 Engine
Base address

VC1_REGS_BASE = (MACC_REGS_BASE + 0x300)

Interrupt enable reg = VC1_REGS_BASE + 0x24

RMVB Engine
Base address

RMVB_REGS_BASE = (MACC_REGS_BASE + 0x400)

Interrupt enable reg = RMVB_REGS_BASE + 0x14

ISP ??
Base address

ISP_REGS_BASE = (MACC_REGS_BASE + 0xa00)

Interrupt enable reg = ISP_REGS_BASE + 0x8

AVC Encoder engine
Base address

AVC_ENC_REGS_BASE = (MACC_REGS_BASE + 0xb00)

Interrupt enable reg = AVC_ENC_REGS_BASE + 0x8
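The engine layout described in this section can be summarised as a small table. The sketch below uses only the offsets given above; the struct, array and function names are hypothetical, the name used for the AVC encoder entry is a placeholder, and the MPEG entry assumes that the control register (vectrl, offset 0x14) is the one to use because it is said to contain the IRQ enable bit.

```c
#include <stdint.h>

#define REGS_BASE      0x01C00000u            /* A10 IO register base */
#define MACC_REGS_BASE (REGS_BASE + 0x0E000u) /* VE IO space */

struct ve_engine {
    const char *name;
    uint32_t base_offset;       /* offset of the engine from MACC_REGS_BASE */
    uint32_t irq_enable_offset; /* offset of the IRQ enable reg in the engine */
};

/* Offsets as listed in this guide. */
const struct ve_engine ve_engines[] = {
    { "MPEG",        0x100u, 0x14u }, /* vectrl, contains the IRQ enable bit */
    { "H264",        0x200u, 0x20u },
    { "VC1",         0x300u, 0x24u },
    { "RMVB",        0x400u, 0x14u },
    { "ISP",         0xa00u, 0x08u },
    { "AVC encoder", 0xb00u, 0x08u },
};

/* Physical address of the interrupt enable register of one engine. */
uint32_t ve_irq_enable_addr(const struct ve_engine *engine)
{
    return MACC_REGS_BASE + engine->base_offset + engine->irq_enable_offset;
}
```

For example, the H264 entry resolves to 0x01C0E220. These are physical addresses; userspace code would apply the same offsets to the pointer returned by mmap on the driver instead.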

JPEG Decoding process
!!!ALPHA VERSION!!!!

INIT:

1) Do the driver IOCTL init sequence and set up the "Reset/Clock register".

2) Set jpeg_resint, rotf_cfrmaddr(inp), rotf_yfrmaddr(inp), Jpg_start_mcuco(inp), Jpg_end_mcuco(inp), vldboffset(out source).

3) Write 0xE to the trigger register.

4) Wait for the result in the status register.
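Steps 3 and 4 could look roughly like the sketch below. It assumes that `mpeg_regs` points at MPEG_REGS_BASE as mapped into the process, that the trigger and status registers are the ones at offsets 0x18 and 0x1c, and that completion can be detected by bit 13 of the status register going low. That last point is a guess based on the uncertain "Busy status" bit and has not been confirmed. The function name and the poll limit are made up for illustration.

```c
#include <stdint.h>

#define MPEG_VETRIGGER     0x18u      /* VE Trigger Register offset */
#define MPEG_VESTAT        0x1cu      /* Status register offset */
#define VESTAT_BUSY        (1u << 13) /* unconfirmed busy bit */
#define JPEG_TRIGGER_VALUE 0xEu       /* value written in step 3 */

/* Writes the trigger value, then polls the status register until the
 * assumed busy bit clears. Returns 0 when the engine reports idle, or -1
 * if it is still busy after max_polls reads. */
int jpeg_trigger_and_wait(volatile uint32_t *mpeg_regs, unsigned int max_polls)
{
    unsigned int i;

    mpeg_regs[MPEG_VETRIGGER / 4] = JPEG_TRIGGER_VALUE;

    for (i = 0; i < max_polls; i++) {
        if ((mpeg_regs[MPEG_VESTAT / 4] & VESTAT_BUSY) == 0)
            return 0;
    }
    return -1;
}
```

Because the function only takes a pointer, it can be exercised against a plain array standing in for the register file before being pointed at real hardware. A real implementation would more likely sleep in IOCTL_WAIT_VE than spin on the register.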

TODO

MPEG4 Decoding process
!!!ALPHA VERSION!!!!

According to traces, the MPEG engine can accelerate several operations:

1) Huffman decoding (VLD) (Variable Length Decoding)

2) Inverse Quantization (IQ)

3) Inverse Discrete Cosine Transform (IDCT)

4) Inverse Scan (IS)

5) ....

MPEG decoding procedure
 * STREAM -> (DMUX), which splits the stream into a motion path and a texture path
 * motion -> (Motion Decoding) -> (Motion Compensation), which also takes the (Previous VOP) as input
 * textures -> (VLD) -> (IS) -> (Inverse AC and DC prediction) -> (IQ) -> (IDCT)
 * (Motion Compensation) and (IDCT) both feed (VOP Reconstruction), whose output becomes the (Previous VOP)

Blob exported functions description
libve_open - does the clock setup and initial configuration, then initializes the selected engine by calling the matching *_open.

MJPEG Engine API
mjpeg_setup_anaglagh_transform

mjpeg_set_vbv - Video Buffering Verifier config (CBR/VBR select). Calls the callbacks vbv_get_base_addr and vbv_get_size, which we have in source, and saves the information to an internal structure.

mjpeg_set_parent - saves a pointer to an internal structure.

mjpeg_set_minor_vbv - STUB

mjpeg_reset - does a reset using ve_reset_hardware and an internal reset function that touches the clock/reset register.

mjpeg_release - calls the callbacks fbm_release and ve_reset_hardware, which we have in source.

mjpeg_open - initializes the mjpeg decoder; called by libve_open.

mjpeg_io_control -

mjpeg_get_stream_info - uses a callback for memcopy.

mjpeg_get_minor_fbm - STUB

mjpeg_get_fbm_num - returns 1.

mjpeg_get_fbm - returns an int value.

mjpeg_flush - STUB

mjpeg_decode - the big general decoding function.

mjpeg_close_anaglagh_transform

mjpeg_close - calls the callbacks ve_reset_hardware and fbm_release, which we have in source.

Other findings

 * Old and new kernel drivers both offer a way to directly access registers. It looks like this approach was chosen for quite a few functions, probably to hide the gory details in the library.