CedarX/Reverse Engineering

= Initial information =
On June 15, 2012, Iain Bullard started reverse engineering the proprietary libraries.

Unfortunately, the kernel driver is only a proxy that maps the hardware registers into userspace using mmap; all of the actual logic is contained in libvecore.so.


 * open_cdxalloc, a free reimplementation of Allwinner's libcedarxalloc.a.
 * CedarXWrapper, an LD_PRELOADed wrapper that helps with understanding the proprietary libraries.
 * CedarXPlayerTest, a basic player to use when testing.
 * ReCedro (on Gitorious), which has tools similar to those from IanB above but takes a different angle; it works really well.

= Object file observations =
While Android and Linux are different beasts from a userspace point of view, the code may have been written in such a way that it compiles for both targets, meaning that the object files could be similar enough.

From android
The android-linux libvecore.a (md5sum 1c347a9ad3072ce3288bd6dba625b2a4) static lib contains the following files: android functions

From linux-armhf
The linux-armhf libvecore.so (md5sum a026d27307e5204db191878651cc6394) shared object contains the following functions: linux-armhf functions

The rest of the bits are all open source, see the linux-sunxi github. The exception is libcedarxalloc.a, but as mentioned above, we have open_cdxalloc.

Function references
So far the following references can easily be observed with readelf -W -s. This is just an indication of some functions and is by no means complete, as a full list would take way too long and is not really needed.

FFmpeg huffman tree builder: ff_huff_build_tree http://ffmpeg.org/doxygen/trunk/huffman_8c.html

libjpeg: get_soi http://sourceforge.net/p/libjpeg-turbo/code/HEAD/tree/trunk/jdmarker.c

libvp62: VP62_InitCoeffScaleFactors http://en.verysource.com/code/5378534_1/libvp62.h.html

H264/AVC Reference encoder/decoder: remove_frame_from_dpb http://iphome.hhi.de/suehring/tml/doc/lenc/html/mbuffer_8c.html#901bd781eb9aef8b79e98b8e10fbc2aa

VC1 Reference decoder: vc1_eResult vc1DECPIC_UnpackInterlaceMVModeParams http://wiki.multimedia.cx/index.php?title=Understanding_VC-1#vc1DECPIC_UnpackInterlaceMVModeParams

MPEG2: Some function renaming seems to have happened; the one that Google did find was: ParseQuantMatrixExtension http://sources.team-mediaportal.com/svn/public/tags/Release%201.0.2/DirectShowFilters/TsReader/source/MpegPesParser.cpp

= Memory Buffers =

CedarX requires linear memory for the decoding process.

MPEG-Engine Used Buffers
All buffers contain an intermediate or final image in YCbCr. The MPEG engine has separate registers for the Y and C components (Cb+Cr together have the same size as Y).

The Reconstruct buffer contains the finished frame from the previous step.

The Forward buffer provides space for the new frame.

BACK buffer (not used in my mp4 playback)

ROT (Rotate-Scale buffer): used when the frame needs to be rotated before it is shown (??)

 >>>>MEM ADDR>>>>
 1) (FOR)->(REC)          (ROT)
 2)        (FOR)->(REC) (ROT)
 3)              (FOR)->

= Driver IOCTL guide =
The blob mostly uses direct MMIO access, but the CedarX clock gate must be enabled and the supporting PLLs must be configured beforehand.

CORE IOCTL
IOCTL_GET_ENV_INFO = 0x101

Returns some configuration info, such as the reserved memory address for Cedar.

IOCTL_WAIT_VE = 0x102

IOCTL_RESET_VE = 0x103

Resets the CedarX engine.

IOCTL_ENABLE_VE = 0x104

Starts the base clocks for CedarX.

IOCTL_DISABLE_VE = 0x105

Disables the base clocks for CedarX.

IOCTL_SET_VE_FREQ = 0x106

Configures the CedarX PLLs.

AVS2 IOCTL
IOCTL_CONFIG_AVS2 = 0x200

IOCTL_GETVALUE_AVS2 = 0x201

IOCTL_PAUSE_AVS2 = 0x202

IOCTL_START_AVS2 = 0x203

IOCTL_RESET_AVS2 = 0x204

IOCTL_ADJUST_AVS2 = 0x205

ENGINE IOCTL
IOCTL_ENGINE_REQ = 0x206

Increments the reference count on the Cedar hardware and, more importantly, starts some clocks that are required for Cedar init.

IOCTL_ENGINE_REL = 0x207

Decrements the reference count.

IOCTL_ENGINE_CHECK_DELAY = 0x208

IOCTL_GET_IC_VER = 0x209

IOCTL_ADJUST_AVS2_ABS = 0x20a

IOCTL_FLUSH_CACHE = 0x20b

Invalidates the CPU cache for Cedar's internal DMA.
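
When tracing the blob (for example through an LD_PRELOADed wrapper), only the raw request numbers show up. A minimal sketch of a lookup table built from the values listed above; the helper name `ioctl_name` is hypothetical and not part of any existing tool.

```python
# Request numbers exactly as listed in this guide.
IOCTL_NAMES = {
    0x101: "IOCTL_GET_ENV_INFO",
    0x102: "IOCTL_WAIT_VE",
    0x103: "IOCTL_RESET_VE",
    0x104: "IOCTL_ENABLE_VE",
    0x105: "IOCTL_DISABLE_VE",
    0x106: "IOCTL_SET_VE_FREQ",
    0x200: "IOCTL_CONFIG_AVS2",
    0x201: "IOCTL_GETVALUE_AVS2",
    0x202: "IOCTL_PAUSE_AVS2",
    0x203: "IOCTL_START_AVS2",
    0x204: "IOCTL_RESET_AVS2",
    0x205: "IOCTL_ADJUST_AVS2",
    0x206: "IOCTL_ENGINE_REQ",
    0x207: "IOCTL_ENGINE_REL",
    0x208: "IOCTL_ENGINE_CHECK_DELAY",
    0x209: "IOCTL_GET_IC_VER",
    0x20A: "IOCTL_ADJUST_AVS2_ABS",
    0x20B: "IOCTL_FLUSH_CACHE",
}


def ioctl_name(request: int) -> str:
    """Return the symbolic name of a request number, or a hex fallback."""
    return IOCTL_NAMES.get(request, f"UNKNOWN(0x{request:x})")
```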

= HW Registers guide =
REGS_BASE = 0x01C00000 (A10 I/O register base address)

MACC_REGS_BASE = (REGS_BASE + 0x0E000) (media accelerator VE I/O space, 4 KB)

Reset/Clock register
MACC_REGS_BASE + 0x00

In some cases the reset logic differs between Cedar revisions.

VE Ready register
MACC_REGS_BASE + 0x1c

Reads 0 when ready.

Reads 0x3f00 when not ready.

VE Revision register
MACC_REGS_BASE + 0xF2

Can be read once the IOCTL init sequence has been done.

Contains the SoC ID, which serves as the VE version.

Possible cases:

0x1625 - A13

0x1623 - A10

0x1620 - ???

0x1619 - ???
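
The table above can be turned into a small lookup. This is only a sketch: the function name `soc_from_revision` is hypothetical, and the two values marked ??? are deliberately left unmapped because the source does not identify them.

```python
# Known VE revision values from the list above; 0x1620 and 0x1619 are
# listed as "???" in this guide, so they are intentionally not mapped.
VE_REVISIONS = {
    0x1625: "A13",
    0x1623: "A10",
}


def soc_from_revision(value: int) -> str:
    """Map a value read from the VE Revision register to a SoC name."""
    return VE_REVISIONS.get(value, "unknown")
```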

MPEG Engine
Base address

MPEG_REGS_BASE = (MACC_REGS_BASE + 0x100)
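
As a worked example of the base address arithmetic, the following sketch computes the absolute address of an MPEG engine register from the constants given above. The helper `mpeg_reg` is hypothetical, and the 0x100 window size is an assumption derived from the H264 engine starting at MACC_REGS_BASE + 0x200.

```python
REGS_BASE = 0x01C00000                   # A10 I/O register base
MACC_REGS_BASE = REGS_BASE + 0x0E000     # VE I/O space
MPEG_REGS_BASE = MACC_REGS_BASE + 0x100  # MPEG engine registers

# Assumption: the MPEG window ends where the H264 window begins (+0x200).
MPEG_WINDOW_SIZE = 0x100


def mpeg_reg(offset: int) -> int:
    """Return the absolute address of an MPEG engine register."""
    if not 0 <= offset < MPEG_WINDOW_SIZE:
        raise ValueError(f"offset 0x{offset:x} is outside the MPEG window")
    return MPEG_REGS_BASE + offset


if __name__ == "__main__":
    # Frame Size Register at MPEG_REGS_BASE + 0x0c
    print(hex(mpeg_reg(0x0C)))
```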

Media File Header Register(mphr)
MPEG_REGS_BASE + 0x00

Video Object Plane Header Register (vophr)
MPEG_REGS_BASE + 0x04

Video file size(fsize)
MPEG_REGS_BASE + 0x08

Contains the video frame Width:Height in word:word format.

Frame Size Register
MPEG_REGS_BASE + 0x0c

Contains the video frame size. For example, for a 320x240 media file this register must be set to 0x014000f0, which means Width (bits 31-16) : Height (bits 15-0).

0x0140 = 320

0x00f0 = 240
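
The Width:Height packing can be checked with a few lines of arithmetic. A minimal sketch, with hypothetical helper names, assuming both fields are plain unsigned 16-bit values:

```python
def pack_frame_size(width: int, height: int) -> int:
    """Pack width into bits 31-16 and height into bits 15-0."""
    if not (0 <= width <= 0xFFFF and 0 <= height <= 0xFFFF):
        raise ValueError("width and height must each fit in 16 bits")
    return (width << 16) | height


def unpack_frame_size(value: int) -> tuple:
    """Split a register value back into (width, height)."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


if __name__ == "__main__":
    print(f"0x{pack_frame_size(320, 240):08x}")
```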

Macro Block Address Register(mbaddr)
MPEG_REGS_BASE + 0x10

Control register(vectrl)
MPEG_REGS_BASE + 0x14

Contains the IRQ enable bit.

VE Trigger Register(vetrigger)(??)
MPEG_REGS_BASE + 0x18

Status register (vestat)
MPEG_REGS_BASE + 0x1c

Busy statuses:

bit 14 (not sure) - mc free (may be macrocell or motion compensation)

bit 13 (not sure) - busy status

bit 12 (not sure) - idct in empty (Inverse Discrete Cosine Transform)

bit 11 (not sure) - iqis in empty (Inverse Quantization and Inverse Scan)
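
For experiments it can help to print the status register as named flags. The sketch below uses the bit positions listed above, which this guide itself marks as "not sure"; the flag names and the function `decode_status` are hypothetical.

```python
# Bit positions as guessed in this guide; all of them are unconfirmed.
STATUS_BITS = {
    14: "mc_free",
    13: "busy",
    12: "idct_in_empty",
    11: "iqis_in_empty",
}


def decode_status(value: int) -> dict:
    """Return a {flag_name: bool} view of a vestat register value."""
    return {name: bool((value >> bit) & 1) for bit, name in STATUS_BITS.items()}
```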

VE ??(trbtrdfld)(??)
Distance in time to last B or P frame

TRB =display_time(B)-display_time(I)

TRD =display_time(P)-display_time(I)

MPEG_REGS_BASE + 0x20

VE ??(trbtrdfrm)(??)
MPEG_REGS_BASE + 0x24

Variable-Length Decoding(VLD) Block Address Register (vldbaddr)
MPEG_REGS_BASE + 0x28

Variable-Length Decoder(VLD) Block Offset Register(vldboffset)
MPEG_REGS_BASE + 0x2c

Variable-Length Decoding(VLD) Length Register(vldlen)
MPEG_REGS_BASE + 0x30

Video Buffer Verifier(VBV) Address Register(vbvsize)
Contains the maximum VBV buffer address.

MPEG_REGS_BASE + 0x34

Variable Length Decoder(VLD) Offset Register(vldoffset) or ??
Has a second usage.

MPEG_REGS_BASE + 0x38

VLD length or ??(vldlen)(dcacaddr)(??)
Has a second usage.

MPEG_REGS_BASE + 0x3c

Block Address Register(blkaddr)(??)
MPEG_REGS_BASE + 0x40

?? Address Register(??)(ncfaddr)
MPEG_REGS_BASE + 0x44

Reconstruct Buffer Luma Address Register (rec_yframaddr)
Y component buffer of the YCbCr color space.

Contains the previous frame, which the decoder needs for its work.

MPEG_REGS_BASE + 0x48

Reconstruct Buffer Chroma Address Register (rec_cframaddr)
C component buffer of the YCbCr color space.

Contains the previous frame, which the decoder needs for its work.

MPEG_REGS_BASE + 0x4c

Forward Buffer Luma Address Register (for_yframaddr)
Space for the Y component of the frame being decoded.

MPEG_REGS_BASE + 0x50

Forward Buffer Chroma Address Register (for_cframaddr)
Space for the chroma (C) component of the frame being decoded.

MPEG_REGS_BASE + 0x54

BACK Buffer Luma Address Register (back_yframaddr)
MPEG_REGS_BASE + 0x58

BACK Buffer Chroma Address Register (back_cframaddr)
MPEG_REGS_BASE + 0x5c

?? Register(??)(socx)
MPEG_REGS_BASE + 0x60

?? Register(??)(socy)
MPEG_REGS_BASE + 0x64

?? Register(??)(sol)
MPEG_REGS_BASE + 0x68

?? Register(??)(sdlx)
MPEG_REGS_BASE + 0x6c

?? Register(??)(sdly)
MPEG_REGS_BASE + 0x70

?? Register(??)(spriteshifter)
MPEG_REGS_BASE + 0x74

?? Register(??)(sdcx)
MPEG_REGS_BASE + 0x78

?? Register(??)(sdcy)
MPEG_REGS_BASE + 0x7c

Inverse Quantization Minimum Level Register(iqminput)
Minimum IQ setting (video compression level) for MPEG decoding.

For MJPEG decoding, the IQ table for the current frame must be loaded (pushed) into this register before the frame is decoded.

MPEG_REGS_BASE + 0x80

Inverse Quantization Level Register(qcinput)
IQ setting (compression level).

MPEG_REGS_BASE + 0x84

MS-MPEG header(msmpeg4_pichdr)(??)
MPEG_REGS_BASE + 0x88

VP6 header(vp6_pichdr)(??)
MPEG_REGS_BASE + 0x8c

Inverse Quantization and Inverse Discrete Cosine Transform Input Register(iqidctinput)(??)
MPEG_REGS_BASE + 0x90

Macro Block Height Register(mbah)(??)
Looks like a macrocell size register.

MPEG_REGS_BASE + 0x94

Macro Block Vector 1(mbv1)(??)
MPEG_REGS_BASE + 0x98

Macro Block Vector 2(mbv2)(??)
MPEG_REGS_BASE + 0x9c

Macro Block Vector 3(mbv3)(??)
MPEG_REGS_BASE + 0xa0

Macro Block Vector 4(mbv4)(??)
MPEG_REGS_BASE + 0xa4

Macro Block Vector 5(mbv5)(??)
MPEG_REGS_BASE + 0xa8

Macro Block Vector 6(mbv6)(??)
MPEG_REGS_BASE + 0xac

Macro Block Vector 7(mbv7)(??)
MPEG_REGS_BASE + 0xb0

Macro Block Vector 8(mbv8)(??)
MPEG_REGS_BASE + 0xb4

JPEG Decoder Control Register(jpeg_sdctl)
MPEG_REGS_BASE + 0xb8

Jpeg MCU Register (jpeg_mcu)
MPEG_REGS_BASE + 0xbc

?? (jpeg_resint)
MPEG_REGS_BASE + 0xc0

Error Flag Register(errflag)
MPEG_REGS_BASE + 0xc4

?? (crtmb)
MPEG_REGS_BASE + 0xc8

Rotate-Scale Buffer Luma Address Register (rotf_yfrmaddr)
Result buffer for the MJPEG decoder, luma component.

MPEG_REGS_BASE + 0xcc

Rotate-Scale Buffer Chroma Address Register (rotf_cfrmaddr)
MPEG_REGS_BASE + 0xd0

Extra Functions Control Register (extra_func_ctrl)
Controls rotation etc.

MPEG_REGS_BASE + 0xd4

JPEG MCU (macrocell) Start Address Register (Jpg_start_mcuco)
MPEG_REGS_BASE + 0xd8

JPEG MCU (macrocell) End Address Register (Jpg_end_mcuco)
MPEG_REGS_BASE + 0xdc

JPEG Huffman Coding Register (??)
The MJPEG decoder only writes "0" to it.

MPEG_REGS_BASE + 0xe0

JPEG Huffman Table Load Register
The MJPEG decoder pushes dwords to it in a batch; it looks like the values are automatically moved to shadow registers.

MPEG_REGS_BASE + 0xe4

H264 Engine
Base address

H264_REGS_BASE = (MACC_REGS_BASE + 0x200)

Interrupt enable reg = H264_REGS_BASE + 0x20

Bitmask 0x0111 enables/disables IRQs from the H264 decoder.

VC1 Engine
Base address

VC1_REGS_BASE = (MACC_REGS_BASE + 0x300)

Interrupt enable reg = VC1_REGS_BASE + 0x24

RMVB Engine
Base address

RMVB_REGS_BASE = (MACC_REGS_BASE + 0x400)

Interrupt enable reg = RMVB_REGS_BASE + 0x14

ISP ??
Base address

ISP_REGS_BASE = (MACC_REGS_BASE + 0xa00)

Interrupt enable reg = ISP_REGS_BASE + 0x8

AVC Encoder engine
Base address

ISP_REGS_BASE = (MACC_REGS_BASE + 0xb00 )

Interrupt enable reg = ISP_REGS_BASE + 0x8

= Decoding processes =

This part describes, step by step, how the CedarX decoder must be used for each file type.

Kernel driver init procedure
This is required to bring the CedarX hardware registers into a workable state:

IOCTL_ENABLE_VE -> IOCTL_SET_VE_FREQ -> IOCTL_ENGINE_REQ -> IOCTL_RESET_VE

After this step, the userspace library must mmap /dev/cedar_dev to get direct access to the hardware registers.

After that, the CedarX version should be checked by reading the "VE Revision register", which contains the chip version:

0x1623 for A10, 0x1625 for A13
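
The ioctl order above can be sketched as follows. This is a minimal illustration, not the blob's actual code: the function `init_cedar` is hypothetical, and the ioctl arguments are not documented here, so passing 0 (including for IOCTL_SET_VE_FREQ) is purely an assumption. The `ioctl` parameter defaults to Python's `fcntl.ioctl` and can be swapped for a recorder when no hardware is available.

```python
import fcntl

IOCTL_RESET_VE = 0x103
IOCTL_ENABLE_VE = 0x104
IOCTL_SET_VE_FREQ = 0x106
IOCTL_ENGINE_REQ = 0x206

# Order taken from the init procedure described above.
INIT_SEQUENCE = (
    IOCTL_ENABLE_VE,
    IOCTL_SET_VE_FREQ,
    IOCTL_ENGINE_REQ,
    IOCTL_RESET_VE,
)


def init_cedar(fd, ioctl=fcntl.ioctl, args=None):
    """Issue the init ioctls in order on an open /dev/cedar_dev descriptor.

    args optionally maps a request number to its argument; the real
    argument values are unknown, so everything defaults to 0 (assumption).
    """
    args = args or {}
    for request in INIT_SEQUENCE:
        ioctl(fd, request, args.get(request, 0))
```

On real hardware the descriptor would come from something like `os.open("/dev/cedar_dev", os.O_RDWR)`; the mmap of the register space and the revision check would follow this call.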

MPEG Engine reset/clock init procedure
Before the MPEG engine is used for MPEG/MJPEG/DIVX/MS-MPEG/VP6 files, it must be clocked and reset.

TODO

MJPEG/JPEG Decoding process
MJPEG is simply a sequence of JPEG files.

!!!ALPHA VERSION!!!!

The MPEG engine can decode JPEG.

JPEG decoding process: (Huffman (VLD) decode) | (Inverse Quantization (IQ)) | (Inverse Discrete Cosine Transform (IDCT)) | (YCrCb to RGB) (does disp do it???)

1) Do the driver IOCTL init sequence and set up the "Reset/Clock register". TODO: full description

2) (???)
[MPEG_BASE] <- 0xc0 (jpeg_resint)
[MPEG_BASE+0x1b] <- 0x13 (high part of the Trigger register)

3) Parse the IQ table from the JPEG and load it into MPEG_BASE + 0x80 (IQ Min Input register):
[MPEG_BASE+0x80] <- TABLE ... (whole table)
The table consists of TWO 8x8 matrices, the first named AC and the second named DC; they are sent in zigzag order: http://habrastorage.org/storage/7b9677db/68a7b3d0/4e95edea/36198004.png

4) Set the result buffer (Rotate-Scale buffer regs).

It must be in the reserved space and use MMIO addresses (??):
[MPEG_BASE+0x1cc] <- Chroma buffer address (0x0480c000)
[MPEG_BASE+0xd0] <- Luma buffer address (04812000)

5) (??)
[MPEG_BASE+0xb8] <- 0 (jpeg control register)
[MPEG_BASE+0xd4] <- 0 (extra functions control register)

6) Clean the Huffman table (??)

[MPEG_BASE+0xe4] <- 0 (huffman control register)

7) Parse the Huffman table from the JPEG and load it into MPEG_BASE + 0xe4 (Huffman table register):
[MPEG_BASE+0xe4] <- TABLE ... (whole table)

8) Set the VBV limit address to the end of the reserved space. This is used for the IRQ that is raised when more data is needed than has been reserved in memory for the new part:
[MPEG_BASE+0x34] <- SRC_BUFF + SRC_MAX_BUFF_SIZE - 1 (usually 0x047fffff)

9) Set the work mode in the Control register

Bit 3: [MPEG_BASE+0x14] <- [MPEG_BASE+0x14] | (1<<3)

10) Set the SRC buffer parameters

[MPEG_BASE+0x2c] <- Offset in SRC buffer (frame offset when there are many frames) (0x0 initially, for the first frame)
[MPEG_BASE+0x30] <- VLD LEN (??) 0x0000d4e8 here
[MPEG_BASE+0x28] <- 0x74000000 (??) I really don't know why the SRC address looks like that; it should be 0x0400000

11) Start
[MPEG_BASE+0x18] <- 0xe (trigger start)

12) Wait for the IRQ (or for completion by some other means), then check the MPEG_BASE + 0x1c register for finish (1st bit ??? unsure here...)
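
A few of the steps above involve small computations that are easy to get wrong, so here is a hedged sketch of them. All helper names are hypothetical. `zigzag_order` generates the standard JPEG zigzag scan (step 3); note that a table read straight from a JPEG DQT segment is already stored in zigzag order, so reordering is only needed for tables held in natural row-major order. `vbv_limit` is the end-address arithmetic of step 8, and `set_work_mode` assumes step 9 means setting bit 3 while preserving the other bits.

```python
def zigzag_order(n: int = 8) -> list:
    """Return natural (row-major) indices in JPEG zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col, one anti-diagonal at a time
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run bottom-left to top-right
        order.extend(r * n + (s - r) for r in rows)
    return order


def to_zigzag(table: list) -> list:
    """Reorder a 64-entry natural-order table into zigzag order."""
    if len(table) != 64:
        raise ValueError("expected an 8x8 table with 64 entries")
    return [table[i] for i in zigzag_order()]


def vbv_limit(src_base: int, src_max_size: int) -> int:
    """Step 8: last valid address of the source (VBV) buffer."""
    return src_base + src_max_size - 1


def set_work_mode(ctrl: int) -> int:
    """Step 9 (assumed intent): set bit 3, keep all other bits."""
    return ctrl | (1 << 3)
```

For instance, an assumed source buffer at 0x04000000 with a size of 0x00800000 gives the limit 0x047fffff mentioned in step 8; both input values are illustrative only.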

MPEG4 Decoding process
!!!ALPHA VERSION!!!!

MPEG decoding requires several operations:

1) Huffman decoding (VLD, Variable Length Decoding)

2) Inverse Quantization (IQ)

3) Inverse Discrete Cosine Transform (IDCT)

4) Inverse Scan (IS)

5) ....

The CedarX MPEG engine can do this in automatic or semi-automatic mode.

MPEG decoding procedure (diagram):

STREAM -> (DMUX), which splits into a motion path and a texture path.

Motion path: (Motion Decoding) -> (Motion Compensation), which also takes the (Previous VOP) as input.

Texture path: (VLD) -> (IS) -> (Inverse AC and DC prediction) -> (IQ) -> (IDCT).

Both paths end in (VOP Reconstruction), whose output becomes the (Previous VOP).

Blob exported functions description
libve_open - does the clock setup and initial configuration, then initializes the selected engine by calling the matching *_open

MJPEG Engine API
mjpeg_setup_anaglagh_transform

mjpeg_set_vbv - video buffering verifier config (CBR/VBR select); calls the callbacks vbv_get_base_addr and vbv_get_size, which we have in source, and saves the info to an internal structure.

mjpeg_set_parent - saves a pointer to an internal structure

mjpeg_set_minor_vbv - stub

mjpeg_reset - does a reset using ve_reset_hardware and an internal reset function that touches the clock/reset register

mjpeg_release - calls the callbacks fbm_release and ve_reset_hardware, which we have in source

mjpeg_open - initializes the MJPEG decoder; called by libve_open

mjpeg_io_control -

mjpeg_get_stream_info - uses a callback for memcopy

mjpeg_get_minor_fbm - stub

mjpeg_get_fbm_num - return 1;

mjpeg_get_fbm - returns an int value

mjpeg_flush - stub

mjpeg_decode - the big general decoding function

mjpeg_close_anaglagh_transform

mjpeg_close - calls the callbacks ve_reset_hardware and fbm_release, which we have in source

Other findings

 * Old and new kernel drivers both offer a way to directly access registers. It looks like for quite some functions, this approach was chosen. Probably to hide the gory details in the library.