ARM Platform-based JPEG Codec HW/SW Co-design

access.ee.ntu.edu.tw/course/SOC2004/version...

  • SOC Consortium Course Material, SoC Design Laboratory

    Case Study: ARM Platform-based JPEG Codec HW/SW Co-design

    Teaching Assistant: Yu-Ju Cho

    Advisor: Prof. An-Yeu Wu

  • Outline

    Introduction to JPEG Codec (current section)
    Lab Case Study
    Reference

  • ISO/IEC 10918-1 JPEG

    JPEG stands for Joint Photographic Experts Group.

    JPEG was approved as an international standard in 1994.

    The JPEG standard defines four compression methods:

    Baseline sequential DCT-based coding
    Progressive DCT-based coding
    Lossless coding (sampling and quantization are not applied in the lossless coding scheme)
    Hierarchical coding

  • Color Model in Video: YCbCr

    The YCbCr color model is used in JPEG and MPEG.

    CCIR-601 transform formula:

    $$Y = 0.299R + 0.587G + 0.114B$$

    $$C_b = -0.168R - 0.331G + 0.499B$$

    $$C_r = 0.5R - 0.419G - 0.081B$$

    With R, G, and B normalized to [0, 1], the formula gives chrominance values in about [-0.5, 0.5]; adding an offset of 0.5 keeps them in the range 0 to 1.

    The color space transform is lossless apart from rounding error.
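    As a minimal sketch of the transform above, the function below applies the three CCIR-601 equations to one pixel. The function name `rgb_to_ycbcr`, the assumption that inputs are normalized to [0, 1], and the 0.5 chrominance offset are illustrative choices, not taken from the course code.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one normalized RGB pixel (each channel in [0, 1]) to (Y, Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168 * r - 0.331 * g + 0.499 * b
    cr = 0.500 * r - 0.419 * g - 0.081 * b
    # Assumption: shift the signed chrominance values by 0.5 so they land in [0, 1].
    return y, cb + 0.5, cr + 0.5


if __name__ == "__main__":
    # A gray pixel carries no color information, so Cb and Cr sit at the midpoint.
    print(rgb_to_ycbcr(0.5, 0.5, 0.5))
```

    Because each chrominance row of coefficients sums to zero, any gray input (R = G = B) maps to the chrominance midpoint, which is why the chroma planes of most natural images compress well.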

  • Chroma Sub-sampling

    [Figure: sampling patterns for 4:4:4, 4:2:2, 4:2:0, and 4:1:1. Legend: pixel with only Y value; pixel with only Cr and Cb value; pixel with Y, Cr, and Cb value.]

    4:1:1 and 4:2:0 are the formats most often used in JPEG and MPEG.
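    To make the 4:2:0 pattern concrete, here is a small sketch that halves a chroma plane in both directions. Averaging each 2x2 neighborhood is an assumption made for illustration; the slides do not say which filter the lab code uses, and `subsample_420` is a hypothetical name.

```python
def subsample_420(plane):
    """Halve a chroma plane horizontally and vertically by averaging 2x2 blocks.

    `plane` is a list of equal-length rows; both dimensions are assumed even.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[r][c] + plane[r][c + 1] + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


if __name__ == "__main__":
    cb = [[1, 3, 5, 7], [1, 3, 5, 7], [2, 2, 2, 2], [4, 4, 4, 4]]
    print(subsample_420(cb))
```

    Each chroma plane keeps one sample per four pixels, so a 4:2:0 image stores half as many samples as the 4:4:4 original.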

  • Block Diagram of JPEG Encoder

    [Figure: encoder pipeline. f(i,j), 8x8 block -> DCT -> F(u,v) -> Quantization (using the Quantization Table) -> Fq(u,v). The DC term goes to DPCM; the AC terms go through the zig-zag scan to RLC. Both feed Entropy Coding (using the Coding Tables), which produces the bitstream 01001011101..., made up of Header, Tables, and Data.]

  • Block Diagram of JPEG Decoder

    [Figure: decoder pipeline. Bitstream 01001011101... (Header, Tables, Data) -> Entropy Decoder (using the Coding Tables) -> Fq(u,v) -> Inverse Quantization (using the Quantization Table) -> F(u,v) -> IDCT -> f(i,j), 8x8 block.]

  • 2-D DCT (Discrete Cosine Transform)

    $$X_{k_1,k_2} = \frac{2}{N}\, c(k_1)\, c(k_2) \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1} x_{n_1,n_2} \cos\frac{(2n_1+1)k_1\pi}{2N} \cos\frac{(2n_2+1)k_2\pi}{2N}, \quad k_1, k_2 = 0, 1, \ldots, N-1$$

    where $c(0) = 1/\sqrt{2}$ and $c(n) = 1$ for $n \neq 0$.
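    The definition can be evaluated directly with four nested loops. The sketch below does exactly that and is meant only as a slow reference for checking faster versions; `dct2_direct` is a hypothetical name, not a function from the lab sources.

```python
import math


def dct2_direct(x):
    """Evaluate the N x N 2-D DCT straight from its definition (O(N^4) operations)."""
    n = len(x)

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * n for _ in range(n)]
    for k1 in range(n):
        for k2 in range(n):
            total = 0.0
            for n1 in range(n):
                for n2 in range(n):
                    total += (
                        x[n1][n2]
                        * math.cos((2 * n1 + 1) * k1 * math.pi / (2 * n))
                        * math.cos((2 * n2 + 1) * k2 * math.pi / (2 * n))
                    )
            out[k1][k2] = (2 / n) * c(k1) * c(k2) * total
    return out


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    print(round(dct2_direct(flat)[0][0], 6))  # all energy of a flat block lands in the DC term
```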

  • Basis Image of 2-D DCT

    [Figure: basis images of the 2-D DCT.]

  • Frequency Distribution of 2-D DCT

    [Figure: two views of the 8x8 coefficient block. Left: regions labeled DC, low frequency, medium frequency, and high frequency. Right: regions labeled DC, vertical edges, horizontal edges, diagonal edges, and high frequency.]

  • 8 point 1-D DCT Algorithm (1/2)

    $$Y_k = C_k \sum_{i=0}^{L-1} x_i \cos\frac{(2i+1)k\pi}{2L}, \quad \text{for } k = 0, 1, \ldots, L-1$$

    $$\text{where } C_k = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & \text{otherwise} \end{cases}$$

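  • 8 point 1-D DCT Algorithm (2/2)

    $$Y_k = C_k \sum_{i=0}^{L-1} x_i \cos\frac{(2i+1)k\pi}{2L}, \quad \text{for } k = 0, 1, \ldots, L-1; \quad \text{where } C_k = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & \text{otherwise} \end{cases}$$

    In matrix form, $Y = CX$ with $c_i = \cos\frac{i\pi}{16}$:

    $$\begin{bmatrix} Y_0 \\ Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ Y_5 \\ Y_6 \\ Y_7 \end{bmatrix} = \begin{bmatrix} c_4 & c_4 & c_4 & c_4 & c_4 & c_4 & c_4 & c_4 \\ c_1 & c_3 & c_5 & c_7 & -c_7 & -c_5 & -c_3 & -c_1 \\ c_2 & c_6 & -c_6 & -c_2 & -c_2 & -c_6 & c_6 & c_2 \\ c_3 & -c_7 & -c_1 & -c_5 & c_5 & c_1 & c_7 & -c_3 \\ c_4 & -c_4 & -c_4 & c_4 & c_4 & -c_4 & -c_4 & c_4 \\ c_5 & -c_1 & c_7 & c_3 & -c_3 & -c_7 & c_1 & -c_5 \\ c_6 & -c_2 & c_2 & -c_6 & -c_6 & c_2 & -c_2 & c_6 \\ c_7 & -c_5 & c_3 & -c_1 & c_1 & -c_3 & c_5 & -c_7 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}$$

    The symmetry of the rows splits this into an even part and an odd part:

    $$\begin{bmatrix} Y_0 \\ Y_2 \\ Y_4 \\ Y_6 \end{bmatrix} = \begin{bmatrix} c_4 & c_4 & c_4 & c_4 \\ c_2 & c_6 & -c_6 & -c_2 \\ c_4 & -c_4 & -c_4 & c_4 \\ c_6 & -c_2 & c_2 & -c_6 \end{bmatrix} \begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix}$$

    $$\begin{bmatrix} Y_1 \\ Y_3 \\ Y_5 \\ Y_7 \end{bmatrix} = \begin{bmatrix} c_1 & c_3 & c_5 & c_7 \\ c_3 & -c_7 & -c_1 & -c_5 \\ c_5 & -c_1 & c_7 & c_3 \\ c_7 & -c_5 & c_3 & -c_1 \end{bmatrix} \begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix}$$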
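    The even/odd split can be checked numerically. The sketch below computes the 8-point DCT both from the summation and from the two 4x4 products, which need 32 multiplications instead of 64. The function names are illustrative, and this is only the first stage of the decomposition, not the full fast algorithm of Chen et al.

```python
import math

# c_i = cos(i * pi / 16), with c_4 equal to 1/sqrt(2)
C = [math.cos(i * math.pi / 16) for i in range(8)]


def dct8_direct(x):
    """8-point DCT straight from the summation (64 multiplications)."""
    return [
        (C[4] if k == 0 else 1.0)
        * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / 16) for i in range(8))
        for k in range(8)
    ]


def dct8_even_odd(x):
    """8-point DCT via the even/odd split into two 4x4 products."""
    s = [x[i] + x[7 - i] for i in range(4)]  # sums feed the even outputs
    d = [x[i] - x[7 - i] for i in range(4)]  # differences feed the odd outputs
    c1, c2, c3, c4, c5, c6, c7 = C[1:]
    even = [
        c4 * s[0] + c4 * s[1] + c4 * s[2] + c4 * s[3],
        c2 * s[0] + c6 * s[1] - c6 * s[2] - c2 * s[3],
        c4 * s[0] - c4 * s[1] - c4 * s[2] + c4 * s[3],
        c6 * s[0] - c2 * s[1] + c2 * s[2] - c6 * s[3],
    ]
    odd = [
        c1 * d[0] + c3 * d[1] + c5 * d[2] + c7 * d[3],
        c3 * d[0] - c7 * d[1] - c1 * d[2] - c5 * d[3],
        c5 * d[0] - c1 * d[1] + c7 * d[2] + c3 * d[3],
        c7 * d[0] - c5 * d[1] + c3 * d[2] - c1 * d[3],
    ]
    y = [0.0] * 8
    y[0::2] = even  # Y0, Y2, Y4, Y6
    y[1::2] = odd   # Y1, Y3, Y5, Y7
    return y


if __name__ == "__main__":
    row = [140, 144, 147, 140, 140, 155, 179, 179]
    print(all(abs(a - b) < 1e-9 for a, b in zip(dct8_direct(row), dct8_even_odd(row))))
```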

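  • Implementation of 2-D DCT, Example: Row-column Decomposition

    The 2-D DCT is separable, so it can be computed by row-column decomposition:

    $$Z = A X A^T$$

    where each 1-D pass is

    $$Y_k = \frac{1}{2} C_k \sum_{i=0}^{L-1} x_i \cos\frac{(2i+1)k\pi}{2L}, \quad \text{for } k = 0, 1, \ldots, L-1; \quad \text{where } C_k = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & \text{otherwise} \end{cases}$$

    [Figure: X -> 1-D DCT Unit -> Transpose Memory (Y) -> 1-D DCT Unit -> Z, with Y = AX and Z = YA^T.]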
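    A software sketch of the same data path follows: one routine that multiplies by A is used twice, with a transpose in between standing in for the transpose memory. The scale factor sqrt(2/L), which is 1/2 for L = 8, and all function names are assumptions made for illustration.

```python
import math


def dct_matrix(size=8):
    """A[k][i] = sqrt(2/L) * C_k * cos((2i+1) k pi / (2L)); the scale is 1/2 for L = 8."""
    scale = math.sqrt(2 / size)
    return [
        [
            scale
            * (1 / math.sqrt(2) if k == 0 else 1.0)
            * math.cos((2 * i + 1) * k * math.pi / (2 * size))
            for i in range(size)
        ]
        for k in range(size)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2_row_column(x):
    """Compute Z = A X A^T as two passes through the same 1-D unit."""
    a = dct_matrix(len(x))
    y = matmul(a, x)                # first pass: Y = A X
    z_t = matmul(a, transpose(y))   # second pass on the transposed data: A Y^T
    return transpose(z_t)           # (A Y^T)^T = Y A^T = Z


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    print(round(dct2_row_column(flat)[0][0], 6))
```

    Two passes of eight 8-point transforms replace the four nested loops of the direct definition, which is what makes a single shared 1-D unit practical in hardware.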

  • Quantization Table for Luminance

    16 11 10 16 24 40 51 61

    12 12 14 19 26 58 60 55

    14 13 16 24 40 57 69 56

    14 17 22 29 51 87 80 62

    18 22 37 56 68 109 103 77

    24 35 55 64 81 104 113 92

    49 64 78 87 103 121 120 101

    72 92 95 98 112 100 103 99

  • Quantization Table for Chrominance

    17 18 24 47 99 99 99 99
    18 21 26 66 99 99 99 99
    24 26 56 99 99 99 99 99
    47 66 99 99 99 99 99 99
    99 99 99 99 99 99 99 99
    99 99 99 99 99 99 99 99
    99 99 99 99 99 99 99 99
    99 99 99 99 99 99 99 99

  • Predictive Coding of DC Coefficients

    [Figure: adjacent blocks, block i-1 and block i, with DC terms $DC_{i-1}$ (previous sample) and $DC_i$ (current sample).]

    Only the difference between the current and previous DC terms is coded:

    $$\text{Difference} = DC_i - DC_{i-1}$$
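    A minimal sketch of this DC prediction, with the matching decoder, is shown below. The predictor is assumed to start at 0 for the first block; the function names are illustrative.

```python
def dpcm_encode(dc_values):
    """Replace each DC term by its difference from the previous block's DC term."""
    diffs, previous = [], 0  # assumption: the predictor starts at 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dpcm_decode(diffs):
    """Rebuild the DC terms by accumulating the differences."""
    values, previous = [], 0
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


if __name__ == "__main__":
    dc_terms = [61, 58, 60, 60]
    print(dpcm_encode(dc_terms))
```

    Neighboring blocks usually have similar average brightness, so the differences are small numbers that fall into short Huffman categories.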

  • Zig-zag Scan (AC Coefficients)

    [Figure: zig-zag scan path over the 8x8 coefficient block, starting next to the DC term.]

  • Run-Length Coding

    [Figure: 8x8 block of quantized coefficients with DC = 30 in the top-left corner. The few nonzero AC coefficients (values -3, -2, and -1) sit next to the DC term; all other entries are 0.]

    Each nonzero AC coefficient is coded as a pair (R, L), where R is the run of zeros that precedes it in zig-zag order and L is its level:

    (R,L) => (0,-3)(0,-2)(0,-2)(0,-1)(2,-1)(EOB)
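    The zig-zag scan and the (R, L) pairing can be sketched together. The code below is a simplified illustration: it always appends an end-of-block token and ignores the special handling that real JPEG applies to runs longer than 15. Function names are hypothetical.

```python
def zigzag_order(size=8):
    """Return (row, col) positions of a size x size block in zig-zag order."""
    order = []
    for diagonal in range(2 * size - 1):
        rows = range(max(0, diagonal - size + 1), min(diagonal, size - 1) + 1)
        if diagonal % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left to top-right
        order.extend((r, diagonal - r) for r in rows)
    return order


def run_length_encode(block):
    """Return (dc, pairs): the DC term and the (run, level) pairs of the AC terms."""
    coeffs = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for value in coeffs[1:]:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # trailing zeros collapse into one end-of-block token
    return coeffs[0], pairs


if __name__ == "__main__":
    print(zigzag_order()[:6])
```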

  • Huffman Coding for DC Coefficient

    Category    AC Coefficient Range
    1           -1, 1
    2           -3, -2, 2, 3
    3           -7, ..., -4, 4, ..., 7
    4           -15, ..., -8, 8, ..., 15
    5           -31, ..., -16, 16, ..., 31
    6           -63, ..., -32, 32, ..., 63
    7           -127, ..., -64, 64, ..., 127
    8           -255, ..., -128, 128, ..., 255
    9           -511, ..., -256, 256, ..., 511
    10          -1023, ..., -512, 512, ..., 1023
    11          -2047, ..., -1024, 1024, ..., 2047

    SSSS        Value
    0           0
    1           -1, 1
    2           -3, -2, 2, 3
    3           -7, ..., -4, 4, ..., 7
    4           -15, ..., -8, 8, ..., 15
    5           -31, ..., -16, 16, ..., 31
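    The category in the tables above is the number of bits needed for the magnitude of the value, so it can be computed rather than looked up. The sketch below also derives the extra amplitude bits that follow the Huffman code of the category; negative values use the usual JPEG convention of adding 2^size - 1. Function names are illustrative.

```python
def category(value):
    """Size category (SSSS): number of bits needed for the magnitude of value."""
    return abs(value).bit_length()


def amplitude_bits(value):
    """Extra bits appended after the category code, as a string of 0s and 1s."""
    size = category(value)
    if size == 0:
        return ""  # category 0 carries no extra bits
    if value < 0:
        value += (1 << size) - 1  # negatives map onto the lower half of the range
    return format(value, f"0{size}b")


if __name__ == "__main__":
    for v in (61, -3, 4, -1):
        print(v, category(v), amplitude_bits(v))
```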

  • An Example of Baseline DCT-based Coding

    Original 8x8 block:

    140 144 147 140 140 155 179 179

    144 152 140 147 140 148 167 179

    152 155 136 167 163 162 152 172

    168 145 156 160 152 155 136 160

    162 148 156 148 140 136 147 162

    147 167 140 155 155 140 136 162

    136 156 123 167 162 144 140 147

    148 155 136 155 152 147 147 136

    After level shift (subtract 128):

    12 16 19 12 11 27 51 47

    16 24 12 19 12 20 39 51

    24 27 8 39 35 34 24 44

    40 17 28 32 24 27 8 32

    34 20 28 20 12 8 19 34

    19 39 12 27 27 12 8 34

    3 28 -5 39 34 16 12 19

    20 27 8 27 24 19 19 8

    After FDCT:

    185 -17 14 -8 23 -9 -13 -18

    20 -34 26 -9 -10 10 13 6

    -10 -23 -1 6 -18 3 -20 0

    -8 -5 14 -14 -8 -2 -3 8

    -3 9 7 1 -11 17 18 15

    3 -2 -18 8 8 -3 0 -6

    8 0 -2 3 -1 -7 -1 -1

    0 -7 -2 1 1 4 -6 0

    Q Table:

    3 5 7 9 11 13 15 17

    5 7 9 11 13 15 17 19

    7 9 11 13 15 17 19 21

    9 11 13 15 17 19 21 23

    11 13 15 17 19 21 23 25

    13 15 17 19 21 23 25 27

    15 17 19 21 23 25 27 29

    17 19 21 23 25 27 29 31

    After quantization (Q):

    61 -3 2 0 2 0 0 -1

    4 -4 2 0 0 0 0 0

    -1 -2 0 0 -1 0 -1 0

    0 0 1 0 0 0 0 0

    0 0 0 0 0 0 0 0

    0 0 -1 0 0 0 0 0

    0 0 0 0 0 0 0 0

    0 0 0 0 0 0 0 0

    After zig-zag scan and run-length coding, written as (size)(amplitude) for the DC term, (run, size)(amplitude) for each AC term, and (0,0) for end of block:

    (6)(61), (0,2)(-3), (0,3)(4), (0,1)(-1), (0,3)(-4), (0,2)(2), (1,2)(2), (0,2)(-2), (0,2)(-2), (5,2)(2), (3,1)(1), (6,1)(-1), (2,1)(-1), (4,1)(-1), (7,1)(-1), (0,0)

    After Huffman coding:

    (110)(111101)(01)(00)(100)(100)(00)(0)(100)(001)(01)(10)(11011)(10)(01)(01)(01)(01)(11111110111)(10)(111010)(1)(1111011)(0)(11100)(0)(111011)(0)(11111010)(0)(1010)

    Total: 98 bits
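    The quantized values in this example agree with dividing each FDCT coefficient by the matching Q Table entry and truncating toward zero (185 / 3 gives 61, and -8 / 9 gives 0). The sketch below reproduces that step; note that it follows the example's truncation, whereas the JPEG standard rounds to the nearest integer. Function names are illustrative.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and truncate toward zero."""
    return [
        [int(f / q) for f, q in zip(coeff_row, table_row)]
        for coeff_row, table_row in zip(coeffs, table)
    ]


def dequantize(levels, table):
    """Decoder side: multiply each level by its table entry."""
    return [
        [v * q for v, q in zip(level_row, table_row)]
        for level_row, table_row in zip(levels, table)
    ]


if __name__ == "__main__":
    fdct_rows = [[185, -17, 14, -8, 23, -9, -13, -18], [20, -34, 26, -9, -10, 10, 13, 6]]
    q_rows = [[3, 5, 7, 9, 11, 13, 15, 17], [5, 7, 9, 11, 13, 15, 17, 19]]
    print(quantize(fdct_rows, q_rows))
```

    Dequantizing gives back multiples of the table entries, not the original coefficients; this step is where the baseline codec loses information.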

  • JPEG Bitstream

    [Figure: nested layout of the bitstream, from the outermost level inward.]

    Image:     Start_of_image | Frame | End_of_image
    Frame:     Tables, etc. | header | scan | ...... | scan
    Scan:      Tables, etc. | header | segment | Restart | segment | Restart | ......
    Segment:   block | block | ...... | block
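    As an illustration of this layout, the sketch below walks the marker segments of a JPEG byte string from Start_of_image up to the first scan header. It relies on the standard two-byte marker codes (for example 0xFFD8 for Start_of_image and 0xFFDA for start of scan) and the two-byte big-endian length that follows most markers. The function name and the short marker table are illustrative and unrelated to the lab's marker.h.

```python
import struct

# A few standard JPEG marker codes (second byte after 0xFF).
MARKER_NAMES = {
    0xD8: "SOI", 0xD9: "EOI", 0xC0: "SOF0",
    0xC4: "DHT", 0xDB: "DQT", 0xDA: "SOS", 0xDD: "DRI",
}


def list_markers(data):
    """Return the marker names from Start_of_image up to the first scan header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("data does not begin with Start_of_image (0xFFD8)")
    found = ["SOI"]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        code = data[pos + 1]
        # The length field counts itself but not the two marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        found.append(MARKER_NAMES.get(code, f"0x{code:02X}"))
        if code == 0xDA:
            break  # entropy-coded scan data follows the scan header
        pos += 2 + length
    return found


if __name__ == "__main__":
    sample = b"\xff\xd8" + b"\xff\xdb\x00\x04\x00\x00" + b"\xff\xda\x00\x02" + b"\xff\xd9"
    print(list_markers(sample))
```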

  • Outline

    Introduction to JPEG Codec
    Lab Case Study (current section)
    Reference

  • File Structure

    Final_project
    |---- sw.bat
    |---- hw.bat
    |---- Download.brd          - bit LM
    |---- sw
    |     |---- _dct.cpp        - DCT
    |     |---- bmp.cpp         - *.bmp
    |     |---- jpeg.cpp        - JPEG
    |     |---- jpeg.h          - jpeg.cpp
    |     |---- main.cpp        - JPEG Codec
    |     |---- marker.h        - JPEG
    |     |---- picture.cpp
    |     |---- stream.cpp      - bitstream
    |     |---- stream.h        - stream.cpp
    |     |---- type.h
    |---- hw
          |---- ahb2apb.v
          |---- ahbahbtop.bit   - IP Xilinx
          |---- ahbahbtop.v
          |---- ahbapbsys.v
          |---- ahbdecoder.v
          |---- ahbmuxs2m.v
          |---- ahbzbtram.v
          |---- apbintcon.v
          |---- apbregs.v
          |---- dct.v           - Chen's DCT/IDCT
          |---- LM_flash_load.bit
          |---- map.ucf
          |---- myip.v          - DCT/IDCT IP

  • Read & Write Address

    [Figure: system memory map. Regions shown: Core Module / Motherboard memory and peripherals, PCI, Core module alias memory, and Logic modules 0 to 3.]

    Logic module base addresses:

    Logic module 0    0xC0000000
    Logic module 1    0xD0000000
    Logic module 2    0xE0000000
    Logic module 3    0xF0000000

    Regions inside logic module 0 (0xC0000000 to 0xCFFFFFFF): LM registers, Interrupt, SSRAM, test_register, and Bus Error response. Boundary addresses shown in the figure: 0xC0000000, 0xC1000000, 0xC2000000, 0xC2100000, 0xC2100004, 0xCFFFFFFF.

    DCT buffer addresses (eight word addresses per buffer, 4 bytes apart):

    Unit    Buffer        First word    Last word
    FDCT    Write_head    0xcc000000    0xcc00001c
    FDCT    Read_head     0xcc000020    0xcc00003c
    IDCT    Write_head    0xcc000040    0xcc00005c
    IDCT    Read_head     0xcc000060    0xcc00007c
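    The buffer layout is regular enough to generate instead of listing by hand. The sketch below derives the word addresses from the four base addresses on this slide; the constant and function names are hypothetical, and the pairing of the first two buffers with the FDCT and the last two with the IDCT is read off the slide layout.

```python
# Base addresses taken from the slide; names are illustrative.
FDCT_WRITE_HEAD = 0xCC000000
FDCT_READ_HEAD = 0xCC000020
IDCT_WRITE_HEAD = 0xCC000040
IDCT_READ_HEAD = 0xCC000060

WORDS_PER_BUFFER = 8
BYTES_PER_WORD = 4


def buffer_addresses(base, words=WORDS_PER_BUFFER):
    """Return the word addresses of one buffer, starting at base."""
    return [base + BYTES_PER_WORD * i for i in range(words)]


if __name__ == "__main__":
    for name, base in (
        ("FDCT Write_head", FDCT_WRITE_HEAD),
        ("FDCT Read_head", FDCT_READ_HEAD),
        ("IDCT Write_head", IDCT_WRITE_HEAD),
        ("IDCT Read_head", IDCT_READ_HEAD),
    ):
        print(name, [hex(a) for a in buffer_addresses(base)])
```

    On the target board the ARM software would write and read these addresses through memory-mapped pointers; the Python version only reproduces the address arithmetic.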

  • Result for SW Simulation

    [Figure: original image, encoder output, and decoder output.]

  • Result for HW Simulation

    [Figure: original image, encoder output, and decoder output.]

  • Profiling Result of SW Simulation

  • Lab Case Study

    Goal
      Implement the JPEG codec system on the ARM platform.

    Principles
      Implement the ARM platform-based JPEG codec with HW/SW co-design.

    Requirement
      Analyze the profiling result of the pure software simulation.
      Explain how to partition the JPEG codec between HW and SW.
      Implement the JPEG codec with HW/SW co-design.

    Discussion
      Explain where the stack and heap are located, and who initializes them.

  • Outline

    Introduction to JPEG Codec
    Lab Case Study
    Reference (current section)

  • Reference

    Wen-Hsiung Chen, C. Harrison Smith, and S. C. Fralick, "A Fast Computational Algorithm for the Discrete Cosine Transform," IEEE Trans. Commun., vol. COM-25, pp. 1004-1009, Sept. 1977.

    William B. Pennebaker and Joan L. Mitchell, JPEG: Still Image Data Compression Standard, Kluwer Academic Publishers, ISBN: 0442012721.
