Soft Decoding of Reed-Solomon Codes -...

Advanced Hardware Architecture for Soft Decoding Reed-Solomon Codes

Stefan Scholl, Norbert Wehn

Microelectronic Systems Design Research Group

TU Kaiserslautern, Germany

Overview

• Soft decoding for the RS(255,239) code

• New hardware architecture

• Goal: large FER gain (over hard decision decoding)

• Algorithm based on information set decoding

• Complexity evaluation on a Virtex 5 FPGA

Motivation

RS / BCH Decoder Hardware

• NASA / CCSDS

• wireless

• wired

• storage

• Optical (G.709)

Widely used code: RS(255,239) or its shortened versions

Decoding Algorithms for Reed-Solomon

Hard Decoding

Algorithm:

• standard method

• algebraic decoding

• complexity very low: first chip implementations in the 1970s/80s

Soft Decoding

Progress in microelectronics allows for more complexity today!

Algorithms:

• Chase Decoding

• Information Set Decoding

• Adaptive Belief Propagation

• Kötter-Vardy

Improved error correction, possible gain: up to 3 dB (depends on length and code rate)


We consider the widely used RS(255,239)

but RS(255,239) seems to be challenging

“medium gain” hardware: 0.5 – 1 dB

State-of-the-art Soft Decoder Hardware

Real & complete hardware implementations for RS(255,239)

Paper                          Year   Algorithm              Gain (over HDD)
An (PhD thesis, MIT)           2010   Low complexity Chase   0.45 dB
Hsu et al. (ESSCIRC)           2011   Chase                  0.35 dB
Garcia-Herrero et al. (CSSP)   2011   Low complexity Chase   0.3 dB
Kan et al. (ISTC)              2008   Adaptive BP            0.75 dB
Heloir et al. (NEWCAS)         2012   Stochastic Chase       0.7 dB
Scholl et al. (DATE)           2014   Information set        0.75 dB

“low gain” hardware: < 0.5 dB

State-of-the-art Hardware Implementations

Hard decision decoding

“low gain”: < 0.5 dB

“medium gain”: 0.5 – 1 dB

“high gain”: > 1 dB

Not yet investigated!

Literature shows: up to 2 dB gain should be possible

Implemented Algorithm*

[Figure: binary image of the parity check matrix, shown above the received bits and their reliability, ordered from most reliable to least reliable]

*A. Ahmed, R. Koetter, and N. R. Shanbhag. Performance analysis of the adaptive parity check matrix based soft-decision decoding algorithm, 2004.

Information set decoding approach

[Figure: binary image of the parity check matrix, diagonalized by Gaussian elimination, shown with the received bits, their reliability, and the resulting syndrome]

Syndrome weight:

• Small: only errors in the least reliable part

• Large: at least 1 error in the most reliable part


Order 1 processing: tentatively flip each most reliable bit (here: 1912)

Order 2 processing: tentatively flip all combinations of 2 most reliable bits (~2 million cases)

Can be seen as a low complexity variant of ordered-statistics decoding
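The steps above can be sketched in a few lines of Python on a toy code. This is a minimal illustration, not the decoder from the slides: it uses a (7,4) Hamming code instead of the binary image of RS(255,239), stops at order 1, and picks the candidate with the lowest total reliability of flipped bits, which is an assumed selection rule. The names `H74`, `diagonalize` and `isd_decode` are hypothetical, and a parity check matrix of full row rank is assumed.

```python
# Toy parity check matrix of the (7,4) Hamming code (illustration only).
H74 = [[1, 0, 1, 0, 1, 0, 1],
       [0, 1, 1, 0, 0, 1, 1],
       [0, 0, 0, 1, 1, 1, 1]]

def diagonalize(H, order):
    """Gaussian elimination over GF(2), trying pivot columns in `order`.
    Returns the reduced matrix and the pivot column of each row."""
    H = [row[:] for row in H]
    m, row, pivots = len(H), 0, []
    for col in order:
        if row == m:
            break
        sel = next((r for r in range(row, m) if H[r][col]), None)
        if sel is None:
            continue  # column depends on earlier pivots, try the next one
        H[row], H[sel] = H[sel], H[row]
        for r in range(m):
            if r != row and H[r][col]:
                H[r] = [a ^ b for a, b in zip(H[r], H[row])]
        pivots.append(col)
        row += 1
    return H, pivots

def isd_decode(H, llrs):
    """Order 0 and order 1 processing; positive LLR means bit 0."""
    n = len(llrs)
    hard = [1 if l < 0 else 0 for l in llrs]
    rel = [abs(l) for l in llrs]
    order = sorted(range(n), key=lambda i: rel[i])  # least reliable first
    Hd, pivots = diagonalize(H, order)
    syndrome = [sum(h & b for h, b in zip(row, hard)) & 1 for row in Hd]
    most_reliable = [j for j in range(n) if j not in set(pivots)]

    def candidate(flips):
        s = syndrome[:]
        for j in flips:  # tentatively flip bits in the most reliable part
            s = [si ^ row[j] for si, row in zip(s, Hd)]
        # remaining syndrome ones are errors on the unit-column positions
        flipped = list(flips) + [pivots[r] for r, si in enumerate(s) if si]
        return sum(rel[j] for j in flipped), flipped

    best = candidate(())                  # order 0: no tentative flip
    for j in most_reliable:               # order 1: one tentative flip each
        best = min(best, candidate((j,)))
    word = hard[:]
    for j in best[1]:
        word[j] ^= 1
    return word

# Two unreliable bits are wrong; the all-zero codeword was sent.
print(isd_decode(H74, [2.5, 3.0, -0.4, 2.8, 3.1, -0.3, 2.9]))
```

In this example both errors fall into the least reliable part, so the order 0 candidate already recovers the codeword, although two errors exceed what hard decision decoding of this toy code can correct.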


Algorithm Improvements

We add further features for improvement (mostly from other literature):

• Use a hard decision decoder (counters potential error floor)

• Use three differently diagonalized parity check matrices (improves FER)

• Partial overlapping of diagonalized parts

• allows for sophisticated architecture (complexity reduction)

• Restrict order 2 processing to “fair” reliable bits (250 out of 1912)

• Need to determine additional group: fair reliable (besides least and most)

• Large reduction in the number of processing steps (about a factor of 60 fewer)

• Use approximate reliability sorting to enable parallelization (higher speed)

Overall loss due to complexity reduction: < 0.1 dB
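The numbers quoted on these slides can be cross-checked with a short calculation. The sketch below assumes that the binary image uses 8 bits per symbol and that the 1912 most reliable bits are the 2040 code bits minus the 128 parity check rows; both assumptions are consistent with the figures on the slides but are not stated there explicitly. The variable names are illustrative.

```python
from math import comb

n_symbols, k_symbols, bits_per_symbol = 255, 239, 8

n_bits = n_symbols * bits_per_symbol                       # 2040 code bits
parity_rows = (n_symbols - k_symbols) * bits_per_symbol    # 128 rows
most_reliable = n_bits - parity_rows                       # 1912 bits

order2_all = comb(most_reliable, 2)    # all pairs of most reliable bits
order2_fair = comb(250, 2)             # pairs among the 250 "fair" bits
reduction = order2_all / order2_fair   # roughly a factor of 60

print(n_bits, parity_rows, most_reliable)
print(order2_all, order2_fair, round(reduction, 1))
```

Under the same assumptions, 128 least reliable plus 250 fair reliable positions add up to 378, which matches the number of lowest LLRs that the sorting unit looks for.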

Our New Hardware Architecture

Implementation on Virtex 5 FPGA

Input: 2040 bit LLRs, 8 in parallel, quantization: 6 bits

Output: 2040 bits (hard out), 8 in parallel

Our Hardware Architecture

Sorting

Finds low and fair reliable bits

Finds 378 lowest out of 2040 LLRs

Shift register based insertion sort

8 sorters in parallel (approximate sorting)

Stores bit positions in four memories
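A software sketch can illustrate why splitting the sort over several sorters is only approximate. The model below is an assumption, not the hardware from the slides: each of 8 hypothetical lanes sees every 8th reliability value and keeps a fixed number of lowest entries with an insertion step similar to a shift register sorter. How the real design distributes the 378 positions over its sorters is not specified here.

```python
import random

def insert_lowest(buffer, item, depth):
    """Insert item in sorted order and drop the largest entry if the
    buffer grows beyond `depth` (software model of one sorter step)."""
    pos = 0
    while pos < len(buffer) and buffer[pos] <= item:
        pos += 1
    buffer.insert(pos, item)
    if len(buffer) > depth:
        buffer.pop()

def lowest_exact(reliabilities, count):
    """Exact selection of the `count` lowest values with one sorter."""
    buffer = []
    for position, value in enumerate(reliabilities):
        insert_lowest(buffer, (value, position), count)
    return [position for _, position in buffer]

def lowest_approx(reliabilities, per_lane, lanes=8):
    """Approximate selection: each lane keeps its own `per_lane` lowest
    values, so the union can differ from the exact global selection."""
    selected = []
    for lane in range(lanes):
        buffer = []
        for position in range(lane, len(reliabilities), lanes):
            insert_lowest(buffer, (reliabilities[position], position), per_lane)
        selected.extend(position for _, position in buffer)
    return selected

random.seed(0)
rel = [random.random() for _ in range(2040)]
exact = set(lowest_exact(rel, 378))
approx = set(lowest_approx(rel, 48))
print(len(exact & approx), "of", len(exact), "exact positions also found")
```

The approximate version misses a low value whenever one lane happens to receive more low values than it can hold, which is the price paid for running the lanes independently.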

Gaussian Elimination / Diagonalization:

Original matrix stored in memory

Diagonalization “on the fly”

Diagonalization “column wise”

2 phases: setup & elimination

Saves ~70% hardware over state-of-the-art diagonalizations (e.g. systolic arrays)

Three diagonalizations: exploit overlapping

P: Fixed pivot positions!

Column of original matrix

Column of eliminated matrix

Pipelined array eliminator

Correction Unit

Performs order 1 and 2 processing

Parallelized order 2 proc.

In 1 clock cycle: 1x order 1 and 6x order 2

3 instances (for 3 matrices)

Selects best results for output

Syndrome Calculation:

Required: syndrome of the diagonalized matrix

Strategy:

First: calculate syndrome using original matrix

Second: “diagonalize” syndrome in the Gaussian Elimination

Advantage: allows use of Galois field operations (much faster)
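The two-step strategy works because the row operations of the elimination can be applied to the syndrome just as they are applied to the matrix. The sketch below checks this identity on small random binary matrices. The matrix `T` is a hypothetical stand-in for the accumulated row operations, and the sizes are arbitrary. The first step is shown as a binary matrix product here, whereas the slides compute it with Galois field operations on the original code.

```python
import random

def mat_vec(M, v):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, v)) & 1 for row in M]

def mat_mul(A, B):
    """Matrix-matrix product over GF(2)."""
    cols = list(zip(*B))
    return [[sum(a & b for a, b in zip(row, col)) & 1 for col in cols]
            for row in A]

random.seed(1)
m, n = 4, 10
H = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]  # original matrix
T = [[random.randint(0, 1) for _ in range(m)] for _ in range(m)]  # row operations
r = [random.randint(0, 1) for _ in range(n)]                      # received bits

direct = mat_vec(mat_mul(T, H), r)     # syndrome of the transformed matrix
two_step = mat_vec(T, mat_vec(H, r))   # original syndrome, then transformed

print(direct == two_step)
```

The equality holds for any `T`, since it only relies on associativity of the matrix product over GF(2).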

FPGA Implementations

                               Kan et al.    Scholl et al.     Heloir et al.   THIS WORK
Algorithm                      Adaptive BP   Information Set   Stoch. Chase    Information Set
Chip                           Stratix II    Virtex 5          Virtex 5        Virtex 5
Flipflops                      n/a           42,000            143,000         70,200
Look-Up Tables                 43,700        13,700            117,000         32,400
Throughput                     4 Mbit/s      800 Mbit/s        50 Mbit/s       300 Mbit/s
Communications gain over HDD   0.75 dB       0.75 dB           0.7 dB          1.3 dB

Our new architecture

M. Kan et al., “Hardware implementation of soft-decision decoding for Reed-Solomon code,” in Proc. 5th Int. Turbo Codes and Related Topics Symp., 2008.

S. Scholl and N. Wehn, “Hardware Implementation of a Reed-Solomon Soft Decoder based on Information Set Decoding,” DATE ’14, 2014.

R. Heloir, C. Leroux, S. Hemati, M. Arzel, and W. J. Gross, “Stochastic Chase decoder for Reed-Solomon codes,” IEEE NEWCAS, 2012.

State-of-the-art soft decoder RS(255,239), gain > 0.5 dB


Comparison FER

[Figure: FER comparison plot, with one curve labeled “This work”]

Summary & Outlook

Summary

Proposed new RS soft decoder hardware for RS(255,239)

Based on information set decoding

Implementation with currently best FER: gain 1.3 dB over HDD

New “High gain” architecture, besides low & medium gain

Acceptable complexity

Future Challenges

Improving implementation efficiency

Architectures for specific application’s requirements

Approach applicable to every linear code

Thank you for your attention!

Questions?

Our new Binary Gaussian Elimination

• Basic operation: adding rows onto other rows to form unit columns

• For our hardware: Two Phase Approach

1. Setup: configures addition patterns

2. Elimination: performs actual elimination

• Architecture: Column by column processing with pipelined array

Columns from original matrix

Columns of eliminated matrix

P: Fixed pivot positions!

S. Scholl, C. Stumm, and N. Wehn. Hardware Implementations of Gaussian Elimination over GF(2) for Channel Decoding Algorithms. IEEE AFRICON 2013.

Comparison, 128 x 2040 matrix

Architecture     Look-Up Tables   Flipflops   Throughput
SMITH*           780k*            260k*
Systolic array   82k              99k         219k matrices / s
proposed         17k              33k         272k matrices / s

* estimated

Proposed vs. systolic array: 80% saving in Look-Up Tables, 67% saving in Flipflops, 25% increase in throughput

Design Example: Reed-Solomon (255,239) Code:

Binary Matrix Size: 128 x 2040

Implementation on a Xilinx FPGA Chip (Virtex 7)

Efficient Gaussian elimination is the key for efficient soft decoding!
