Low-Power Variation-Tolerant Nonvolatile Lookup...

5
1174 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 3, MARCH 2016 Low-Power Variation-Tolerant Nonvolatile Lookup Table Design Xiaoyong Xue, Jianguo Yang, Yinyin Lin, Ryan Huang, Qingtian Zou, and Jingang Wu Abstract—Emerging nonvolatile memories (NVMs), such as MRAM, PRAM, and RRAM, have been widely investigated to replace SRAM as the configuration bits in field-programmable gate arrays (FPGAs) for high security and instant power ON. However, the variations inherent in NVMs and advanced logic process bring reliability issue to FPGAs. This brief introduces a low-power variation-tolerant nonvolatile lookup table (nvLUT) circuit to overcome the reliability issue. Because of large R OFF / R ON , 1T1R RRAM cell provides sufficient sense margin as a configuration bit and a reference resistor. A single-stage sense amplifier with voltage clamp is employed to reduce the power and area without impairing the reliability. Matched reference path is proposed to reduce the parasitic RC mismatch for reliable sensing. Evaluation shows that 22% reduction in delay, 38% reduction in power, and the tolerance of variations of 2.5× typical R ON or R OFF in reliability are achieved for proposed nvLUT with six inputs. Index Terms—Logic-in-memory, low power, nonvolatile lookup table (nvLUT), RRAM, variation tolerant. I. I NTRODUCTION SRAM-based field-programmable gate arrays (FPGAs) have been widely used during the last decades. However, the volatility of SRAM has limited FPGAs in applications where high security and instant power-on are required [1]. The problem can be solved by introducing nonvolatile memory (NVM) as the configuration bit. However, the traditional NVM devices, such as antifuse, E 2 PROM, and flash, require high-voltage process and have poor logic compatibility [2], [3], thus limiting the logic density and increasing the integration cost of FPGAs. Emerging NVMs, such as MRAM, PRAM, and RRAM, have been verified with better scalability and logic compatibility [2], [4]. Based on the logic-in-memory concept, lookup table, which is the core building block in FPGAs, has been proposed with nonvolatility. First, various nonvolatile SRAM (nvSRAM) structures with MRAM and RRAM were proposed to directly replace SRAM in the traditional lookup table to acquire nonvolatility [5]–[7]. However, the size of nvSRAM cell is remarkably larger than that of SRAM, and the write disturbance is also difficult to avoid for half-select RRAM cells. For MRAM, Suzuki et al. [8] proposed a two-input nonvolatile lookup table (nvLUT) based on MRAM in the current-mode logic for low power. Suzuki et al. [9] also proposed a six-input nvLUT with serial/parallel magnetic junctions to acquire enough sensing margin. Zhao et al. [10] proposed an another Manuscript received January 25, 2015; revised March 19, 2015; accepted April 19, 2015. Date of publication May 12, 2015; date of current version February 23, 2016. This work was supported in part by the National 863 Project under Grant 2011AA010404 and Grant 2014AA032602 and in part by the Shanghai STC Project under Grant 12XD1400800. (Corresponding author: Yinyin Lin.) X. Xue, J. Yang, and Y. Lin are with ASIC and System State Key Laboratory, Department of Microelectronics, Fudan University, Shanghai 201203, China (e-mail: [email protected]; [email protected]; [email protected]). R. Huang, Q. Zou, and J. Wu are with the Technology Develop- ment and Research Center, Semiconductor Manufacturing International Corporation, Shanghai 201203, China (e-mail: [email protected]; [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TVLSI.2015.2426876 MRAM-based nvLUT for run-time reconfiguration. Ren [11] proposed a third type of MRAM-based nvLUT named hybrid-LUT2. However, the R OFF / R ON of MRAM is smaller compared with PRAM or RRAM, resulting in less sense margin or larger area due to serial/parallel magnetic junctions. Moreover, the first three MRAM nvLUTs have a mismatch in parasitic RC between the selected path in the multiplexer and the reference path, which may cause nvLUT to fail. For hybrid-LUT2, the configuration of MRAM cells shares the same decoding circuit with logic operation, whose inputs may be wired to other logic blocks and cannot be used as the address inputs during configuration. For RRAM, Sakamoto et al. [12] proposed an nvLUT based on nanobridge. However, the programming path of nanobridge shares the same multiplexer with the logic path for selection, making the size of transistors in the multiplexer considerably large to satisfy the reset voltage for R ON . Chen et al. [13] proposed another RRAM-based nvLUT using crossbar array. However, the sneaking paths inherent in crossbar array bring considerable leakage and poor sensing margin of only 10 mV. To sum up, none of the previous work has achieved high reliability against memory and logic variations, low power, high-area efficiency, and low leakage at the same time. This brief introduces a low-power variation-tolerant nvLUT circuit to overcome the issues in the previous work. Because of its large R OFF / R ON , 1T1R RRAM cell is used as a configuration bit and a reference resistor to provide sufficient sense margin against memory and logic variations. Thus, the area cost is decreased because no parallel or serial memory cell combinations are needed to guarantee the sense margin. To reduce the power and area, single-stage sense amplifier with voltage clamp (SSAVC) is employed without compro- mising the reliability. Moreover, matched reference path (MRP) is devised to minimize the parasitic RC mismatch between the selected path in the multiplexer and the reference path for reliable sensing against logic variations. Finally, detailed evaluation is carried out to verify the benefits. II. PROPOSED LOW-POWER VARIATION-TOLERANT nvLUT To illustrate the proposed design, a two-input nvLUT is presented, as shown in Fig. 1. The input count can also be easily extended to six, which is prevailing in current main-stream FPGA products [14]. The overall architecture of nvLUT consists of an SSAVC, a tree multiplexer (TMUX), an MRP, a RRAM slice, and a footer transistor. The RRAM slice constitutes of four 1T1R RRAM cells at the left for configuration and a dummy RRAM cell at the right-most as a reference resistor. The truth table is stored in the RRAM slice in the form of resistance state, R OFF or R ON , which is different from the logic voltage in SRAM. For example, in order to program the nvLUT as a NOR gate, R0 should be programmed as R ON denoting 1, while R1, R2, and R3 should be programmed as R OFF denoting 0. The inputs IN0 and IN1 select the corresponding RRAM cell through TMUX. To perform the operations of LUT, the sense amplifier is employed to convert the resistance state of RRAM cell into logic voltage. The function of footer transistor MF is to allow current to flow during sensing and it is closed during precharge to restrain leakage. 1063-8210 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Transcript of Low-Power Variation-Tolerant Nonvolatile Lookup...

Page 1: Low-Power Variation-Tolerant Nonvolatile Lookup …kresttechnology.com/krest-academic-projects/krest-mtech...1174 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS,

1174 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 3, MARCH 2016

Low-Power Variation-Tolerant Nonvolatile Lookup Table DesignXiaoyong Xue, Jianguo Yang, Yinyin Lin, Ryan Huang, Qingtian Zou, and Jingang Wu

Abstract— Emerging nonvolatile memories (NVMs), such as MRAM,PRAM, and RRAM, have been widely investigated to replace SRAM asthe configuration bits in field-programmable gate arrays (FPGAs) forhigh security and instant power ON. However, the variations inherentin NVMs and advanced logic process bring reliability issue to FPGAs.This brief introduces a low-power variation-tolerant nonvolatile lookuptable (nvLUT) circuit to overcome the reliability issue. Because oflarge ROFF/RON , 1T1R RRAM cell provides sufficient sense marginas a configuration bit and a reference resistor. A single-stage senseamplifier with voltage clamp is employed to reduce the power and areawithout impairing the reliability. Matched reference path is proposed toreduce the parasitic RC mismatch for reliable sensing. Evaluation showsthat 22% reduction in delay, 38% reduction in power, and the toleranceof variations of 2.5× typical RON or ROFF in reliability are achieved forproposed nvLUT with six inputs.

Index Terms— Logic-in-memory, low power, nonvolatile lookuptable (nvLUT), RRAM, variation tolerant.

I. INTRODUCTION

SRAM-based field-programmable gate arrays (FPGAs) have beenwidely used during the last decades. However, the volatility ofSRAM has limited FPGAs in applications where high security andinstant power-on are required [1]. The problem can be solved byintroducing nonvolatile memory (NVM) as the configuration bit.However, the traditional NVM devices, such as antifuse, E2PROM,and flash, require high-voltage process and have poor logiccompatibility [2], [3], thus limiting the logic density and increasingthe integration cost of FPGAs.

Emerging NVMs, such as MRAM, PRAM, and RRAM, havebeen verified with better scalability and logic compatibility [2], [4].Based on the logic-in-memory concept, lookup table, which is thecore building block in FPGAs, has been proposed with nonvolatility.First, various nonvolatile SRAM (nvSRAM) structures with MRAMand RRAM were proposed to directly replace SRAM in the traditionallookup table to acquire nonvolatility [5]–[7]. However, the size ofnvSRAM cell is remarkably larger than that of SRAM, and the writedisturbance is also difficult to avoid for half-select RRAM cells.For MRAM, Suzuki et al. [8] proposed a two-input nonvolatilelookup table (nvLUT) based on MRAM in the current-modelogic for low power. Suzuki et al. [9] also proposed a six-inputnvLUT with serial/parallel magnetic junctions to acquireenough sensing margin. Zhao et al. [10] proposed an another

Manuscript received January 25, 2015; revised March 19, 2015; acceptedApril 19, 2015. Date of publication May 12, 2015; date of current versionFebruary 23, 2016. This work was supported in part by the National863 Project under Grant 2011AA010404 and Grant 2014AA032602and in part by the Shanghai STC Project under Grant 12XD1400800.(Corresponding author: Yinyin Lin.)

X. Xue, J. Yang, and Y. Lin are with ASIC and System State Key Laboratory,Department of Microelectronics, Fudan University, Shanghai 201203,China (e-mail: [email protected]; [email protected];[email protected]).

R. Huang, Q. Zou, and J. Wu are with the Technology Develop-ment and Research Center, Semiconductor Manufacturing InternationalCorporation, Shanghai 201203, China (e-mail: [email protected];[email protected]; [email protected]).

Color versions of one or more of the figures in this paper are availableonline at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TVLSI.2015.2426876

MRAM-based nvLUT for run-time reconfiguration. Ren [11]proposed a third type of MRAM-based nvLUT named hybrid-LUT2.However, the ROFF/RON of MRAM is smaller compared withPRAM or RRAM, resulting in less sense margin or larger areadue to serial/parallel magnetic junctions. Moreover, the first threeMRAM nvLUTs have a mismatch in parasitic RC between theselected path in the multiplexer and the reference path, whichmay cause nvLUT to fail. For hybrid-LUT2, the configuration ofMRAM cells shares the same decoding circuit with logic operation,whose inputs may be wired to other logic blocks and cannotbe used as the address inputs during configuration. For RRAM,Sakamoto et al. [12] proposed an nvLUT based on nanobridge.However, the programming path of nanobridge shares the samemultiplexer with the logic path for selection, making the size oftransistors in the multiplexer considerably large to satisfy the resetvoltage for RON. Chen et al. [13] proposed another RRAM-basednvLUT using crossbar array. However, the sneaking paths inherent incrossbar array bring considerable leakage and poor sensing marginof only 10 mV. To sum up, none of the previous work has achievedhigh reliability against memory and logic variations, low power,high-area efficiency, and low leakage at the same time.

This brief introduces a low-power variation-tolerant nvLUT circuitto overcome the issues in the previous work. Because of itslarge ROFF/RON, 1T1R RRAM cell is used as a configuration bit and areference resistor to provide sufficient sense margin against memoryand logic variations. Thus, the area cost is decreased because noparallel or serial memory cell combinations are needed to guaranteethe sense margin. To reduce the power and area, single-stage senseamplifier with voltage clamp (SSAVC) is employed without compro-mising the reliability. Moreover, matched reference path (MRP) isdevised to minimize the parasitic RC mismatch between the selectedpath in the multiplexer and the reference path for reliable sensingagainst logic variations. Finally, detailed evaluation is carried out toverify the benefits.

II. PROPOSED LOW-POWER VARIATION-TOLERANT nvLUT

To illustrate the proposed design, a two-input nvLUT is presented,as shown in Fig. 1. The input count can also be easily extended to six,which is prevailing in current main-stream FPGA products [14].The overall architecture of nvLUT consists of an SSAVC, a treemultiplexer (TMUX), an MRP, a RRAM slice, and a footer transistor.The RRAM slice constitutes of four 1T1R RRAM cells at the leftfor configuration and a dummy RRAM cell at the right-most as areference resistor. The truth table is stored in the RRAM slice in theform of resistance state, ROFF or RON, which is different from thelogic voltage in SRAM. For example, in order to program the nvLUTas a NOR gate, R0 should be programmed as RON denoting 1, whileR1, R2, and R3 should be programmed as ROFF denoting 0. Theinputs IN0 and IN1 select the corresponding RRAM cell throughTMUX. To perform the operations of LUT, the sense amplifier isemployed to convert the resistance state of RRAM cell into logicvoltage. The function of footer transistor MF is to allow currentto flow during sensing and it is closed during precharge to restrainleakage.

1063-8210 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Page 2: Low-Power Variation-Tolerant Nonvolatile Lookup …kresttechnology.com/krest-academic-projects/krest-mtech...1174 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS,

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 3, MARCH 2016 1175

Fig. 1. Overall architecture of the proposed low-power variation-tolerant nvLUT based on RRAM.

Fig. 2. 1T1R RRAM cell integration process and structure.(a) Cross-sectional view. (b) Schematic.

A. RRAM as a Configuration Bit and a Reference Resistor

The 1T1R RRAM cell is employed as a configuration bit anda reference resistor to provide sufficient sense margin, as shownin Fig. 1. Different from crossbar array, a 1T1R RRAM cell caneliminate the sneaking current and the disturbances during writeand read, thus saving power and acquiring high yield. The typicalRON and ROFF of RRAM are of kilo-ohms and megaohms,respectively [15], and ROFF/RON is over 100, which is at least40× larger than that of MRAM [8]. Therefore, sufficient sense marginis guaranteed and the configuration resources are also saved byhalf compared with the parallel or serial combination scheme [9].Moreover, the RRAM storage layer, i.e., R0-3 and Rref , is stacked inthe back-end of line without occupying an additional area, as shownin Fig. 2 [4], [15].

Because the characteristics of RRAM are different from conven-tional resistor, the sense margins of RON and ROFF compared with a

conventional reference resistor may suffer asymmetric changes undermemory and logic process variation, which may result in read failure.To resolve this issue, dummy RRAM cell, which is programmedto a mid-state resistance, is adopted as the reference resistor. Thus,the configuration bits and reference resistor vary in the same wayacross different temperatures and process conditions, preserving thesense margins for both RON and ROFF [15]. Moreover, the dummycell occupies less area than conventional resistors. The peripheraldecoding and writing circuits for dummy cell can also be sharedwith configuration bits, bringing less area overhead.

In our proposed nvLUT, since WL, BL, and SL are all drawnout from the RRAM slice, both unipolar and bipolar RRAM canbe programmed by enforcing set or reset voltage on BL or SLwhile activating the corresponding WL. Because the dummy RRAMcell with a mid-state resistance is used as the reference, theadopted RRAMs should support the trimming of its storage resis-tance to a specific value. In this brief, in order to evaluateour proposed nvLUT, we use the bipolar CuxSiyO RRAM withthe forming/set/reset voltage of ∼2.5/2/−1 V [4], [15]. Using aself-adaptive-write-mode (SAWN) with timely feedback and verifyduring programming [15], the dummy cell can be trimmed to aspecific storage resistance. Although the endurance of RRAM isin the order of 106 [4], it still suffices as the configurationmemory in nvLUT because reprogramming only takes place when theconfiguration data is changed.

B. SSAVC

SSAVC converts the resistance state of RRAM into a rail-to-raillogic voltage. As shown in Fig. 1, transistors M3–M6 constitute of

Page 3: Low-Power Variation-Tolerant Nonvolatile Lookup …kresttechnology.com/krest-academic-projects/krest-mtech...1174 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS,

1176 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 3, MARCH 2016

a latch amplifier. Transistors M1 and M2 are used to prechargethe output nodes OUT and OUTB to VDD when CLK is lowand transistor MF is used to initiate the conversion when CLK ishigh. Compared with the previous two-stage sense amplifier [8], thesingle-stage realization occupies less die area.

The internal voltages in TMUX and MRP are clamped to lowervoltages by the clamp transistors to save power. In previous work, theinner nodes of the selected path in multiplexer and reference path areboth precharged to VDD or (VDD-V th) when CLK is low [8], [10].Then, the charges are discharged to a capacitor or ground whenCLK is high, resulting in considerable power waste. To alleviatethis issue, transistors M7 and M8 are inserted between the senseamplifier and the TMUX/MRP. By applying a proper clamp voltageVbias, which is lower than VDD, on the gates of M7 and M8, the innernodes of the selected path in TMUX and MRP can only be prechargedto (Vbias-V th). In an FPGA chip, Vbias for different nvLUTs canbe generated by a single voltage regulator with negligible overheadand it can also be tuned for different PVT conditions. Because ofthe quadratic relationship between energy and voltage, considerableaverage power saving can be achieved by the reduction of prechargevoltage. Although the voltage clamp may incur reduced currents intothe sense amplifier, large ROFF/RON of RRAM still helps to preservethe sense margin without impairing the reliability.

C. MRP

Although trimming Rref by SAWM can help to disabuse theparasitic resistance mismatch between the selected path in TMUXand the reference path, their parasitic capacitance mismatch cannot beeasily estimated and compensated. The MRP is devised to minimizethe parasitic RC mismatch between the above-mentioned two paths.To illustrate this point, IN0 and IN1 are assumed to take the logicvalues of 0 and 1, respectively. As shown in Fig. 1, the path markedby the green dash line in TMUX, P01, is selected to be comparedwith the reference path, Pref . For reliable sensing, the parasitic RCsof P01 and Pref should be equivalent. Therefore, the transistorsMP8 and MP10 with their gate grounded are, respectively, addedat the nodes B and D in MRP to imitate the parasitic effects ofOFF-state transistors MP2 and MP3 at the nodes A and C in TMUX.Moreover, the transistors in MRP take the same size with the passtransistors in TMUX.

Fig. 3(a) and (b) compares the parasitic RC of MRP with P01. It isassumed that the ON-state resistance of the pass transistor is Rp, thedrain or source parasitic capacitance of the pass transistor is Cp, andthe drain parasitic capacitance of the access transistor in the1T1R RRAM cell is Ca. Fig. 3(c) and (d) also gives the previousreference resistor tree (RRT) and its corresponding RC equivalentcircuit [8]. Because the conventional resistor is used as the referencein RRT, the area cost is prominent and the parasitic capacitancecannot be ignored like RRAM resistor. Here, the parasitic capacitanceof the reference resistor in RRT is assumed to be Cres, which is largerthan Ca or Cp due to its larger size. According to the Elmore delayformula, the RC time constant from node A/B/B ′ to node E/F/F ′can be calculated as follows:

tAE = 5 Rp ∗ Cp + 2 Rp ∗ Ca (1)

tB F = 5 Rp ∗ Cp + 2 Rp ∗ Ca (2)

tB ′F ′ = 6 Rp ∗ Cp + 2 Rp ∗ Cres. (3)

By comparison, the proposed MRP has the same parasitic RC withthe selected path in TMUX, while RRT has more parasitic RC . Theexcessive parasitic RC in RRT may slow down the discharging of thereference path, making the sense amplifier prone to output 1 when

Fig. 3. Parasitic RC equivalent circuits of (a) P01 and (b) Pref in Fig. 1.(c) Schematic and (d) parasitic RC equivalent circuit of RRT [8].

TABLE ISIZING OF DEVICES IN THE PROPOSED nvLUT

the resistance margin between the configuration bit and the referenceresistor is subtle due to memory variation.

III. EVALUATION AND ANALYSIS

Due to large ROFF/RON of RRAM, it is easy to ensure thereliability of the proposed nvLUT. Therefore, our priority is todetermine Rref and Vbias by the power-delay product. For evaluation,the input count is extended to six to comply with the mainstreamFPGA products. The evaluation is carried out in 90-nm logicprocess and the clock frequency is 1 GHz for comparison with theprevious work.

A. Functional Characteristics

The typical values of RON and ROFF for RRAM are chosen to be5 K� and 1 M�, respectively [15]. Because OUT and OUTB areprecharged to VDD when CLK is low, the low-to-high transition timetpLH is 0 and the high-to-low transition time tpHL, which is not 0,is the key delay parameter. Therefore, the delay is half of tpHL. Thevalue of tpHL is dependent on Rref and Vbias as well as the sizing ofdevices in the circuit. The sizing of devices in the proposed nvLUT isshown in Table I.

The values of Rref and Vbias are determined by sweeping to findthe optimal power-delay product. Fig. 4 shows the delay, averagepower at 1 GHz, and power-delay product of the proposed nvLUTwith respect to Vbias for different Rref values. Fig. 4(a) showsthat smaller Rref brings faster discharging in the reference path,thus decreasing tpHL. Moreover, larger Vbias also helps to speedup sense speed because larger currents are injected into the senseamplifier from TMUX and MRP during sensing. Fig. 4(b) shows

Page 4: Low-Power Variation-Tolerant Nonvolatile Lookup …kresttechnology.com/krest-academic-projects/krest-mtech...1174 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS,

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 3, MARCH 2016 1177

Fig. 4. (a) Delay, (b) average power at 1 GHz, and (c) power-delay productof proposed nvLUT with Vbias for different Rref values.

that Vbias is predominant in affecting the average power and thatRref almost has no effect. Fig. 4(c) shows the power-delay productbecomes smaller with smaller Rref and there exists a minimum valueas Vbias changes. However, smaller Rref leads to decreased resistancemargin between Rref and RON, leading to weaker reliability.Therefore, Rref is programmed to 20 K� to acquire sufficientresistance margin while keeping the delay small. The correspondingoptimum Vbias is ∼0.9 V, as shown in Fig. 4(c). The correspondingdelay and average power are 117 ps and 3.55 μW at 1 GHz, whichare reduced by 22% and 38%, respectively, compared with six-inputMRAM nvLUT [9].

B. Reliability Improvement

The reliability improvement of the proposed nvLUT is studied bythe Monte Carlo statistical analysis with 104 iterations. RRAM suffersfrom severe variations and the practical resistance distribution canbe about 2× its typical value with 3σ variation [4], [15]. AlthoughROFF/RON is high, Rref /RON is only 4 in this brief, and the variationin RON may affect the correctness of nvLUT output under the logic

Fig. 5. Reliability comparison of the proposed nvLUT in [8] and [9].

process variation and mismatch of transistors. The simulation isperformed by varying RON and ROFF under logic process variationand mismatch from foundry silicon data. Among the circuit elements,the mismatches between M3 and M4, M5 and M6, and M7 and M8are also considered. The calculation of error rate is as follows.

1) Vary RON or ROFF by X times or 1/X.2) Carry out Monte Carlo statistical analysis for 104 iterations

with process variation and mismatch considered.3) Calculate the error rate.4) Repeat the above three steps to get more points.Fig. 5 shows the reliability comparison of the proposed

nvLUT with previous works when considering memory resistance(RON or ROFF) variations under the same process variation andmismatch. Only with logic process variation and mismatch, theproposed RRAM nvLUT can reduce the error rate by 53% comparedwith the six-input MRAM nvLUT using the techniques in [8]. Theproposed RRAM nvLUT can tolerate RON and/or ROFF variationsof 2.5× typical values, which is 40% larger compared with six-inputMRAM nvLUT in [9].

IV. CONCLUSION

The design techniques of low-power variation-tolerant nvLUT aredescribed in this brief. RRAM is adopted as the configuration bit andthe reference resistor to provide large sense margin, thus alleviatingthe effects of memory and logic process variations. Because of thehigh ROFF/RON of RRAM, SSAVC helps to reduce the power andarea without impairing the reliability. The MRP is also devisedto reduce the parasitic RC mismatch between the selected pathin the multiplexer and the reference path for reliable operation.By evaluation, remarkable improvements in power, delay, area, andreliability are achieved.

REFERENCES

[1] S. D. Brown, R. J. Francis, J. Rose, and Z. G. Vranesic, Field-Programmable Gate Arrays. Boston, MA, USA: Kluwer, 1992.

[2] S. Seo et al., “Reproducible resistance switching in polycrystalline NiOfilms,” Appl. Phys. Lett., vol. 85, no. 23, pp. 5655–5657, 2004.

[3] L. Torres, R. M. Brum, L. V. Cargnini, and G. Sassatelli, “Trendson the application of emerging nonvolatile memory to processors andprogrammable devices,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS),May 2013, pp. 101–104.

[4] M. Wang et al., “A novel Cux SiyO resistive memory in logic technologywith excellent data retention and resistance distribution for embed-ded applications,” in Proc. Symp. VLSI Technol. (VLSIT), Jun. 2010,pp. 89–90.

[5] X. Xue et al., “Nonvolatile SRAM cell based on Cux O,” in Proc.9th Int. Conf. Solid-State Integr.-Circuit Technol. (ICSICT), Oct. 2008,pp. 869–871.

[6] N. Bruchon, L. Torres, G. Sassatelli, and G. Cambon, “Technologicalhybridization for efficient runtime reconfigurable FPGAs,” in Proc. IEEEComput. Soc. Annu. Symp. VLSI (ISVLSI), Mar. 2007, pp. 29–34.

Page 5: Low-Power Variation-Tolerant Nonvolatile Lookup …kresttechnology.com/krest-academic-projects/krest-mtech...1174 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS,

1178 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 3, MARCH 2016

[7] W. Zhao, E. Belhaire, B. Dieny, G. Prenat, and C. Chappert,“TAS-MRAM based non-volatile FPGA logic circuit,” in Proc. Int. Conf.Field-Program. Technol. (ICFPT), Dec. 2007, pp. 153–160.

[8] D. Suzuki et al., “Fabrication of a nonvolatile lookup-table circuitchip using magneto/semiconductor-hybrid structure for an immediate-power-up field programmable gate array,” in Proc. Symp. VLSICircuits (VLSIC), Jun. 2009, pp. 80–81.

[9] D. Suzuki, M. Natsui, T. Endoh, H. Ohno, and T. Hanyu, “Six-inputlookup table circuit with 62% fewer transistors using nonvolatile logic-in-memory architecture with series/parallel-connected magnetic tunneljunctions,” J. Appl. Phys., vol. 111, no. 7, pp. 07E318-1–07E318-3,2012.

[10] W. Zhao, E. Belhaire, C. Chappert, and P. Mazoyer, “Power and areaoptimization for run-time reconfiguration system on programmable chipbased on magnetic random access memory,” IEEE Trans. Magn., vol. 45,no. 2, pp. 776–780, Feb. 2009.

[11] F. Ren, “Energy-performance characterization of CMOS/magnetic tunneljunction (MTJ) hybrid logic circuits,” M.S. thesis, Dept. Elect. Eng.,Univ. California, Los Angeles, CA, USA, 2011.

[12] T. Sakamoto et al., “A nonvolatile programmable solid elec-trolyte nanometer switch,” in Proc. IEEE Int. Solid-State CircuitsConf. (ISSCC), Feb. 2004, pp. 290–529.

[13] Y.-C. Chen, H. Li, and W. Zhang, “A novel peripheral circuit forRRAM-based LUT,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS),May 2012, pp. 1811–1814.

[14] S. Jung et al., “Thermally-assisted Ti/Pr0.7Ca0.3MnO3 ReRAM withexcellent switching speed and retention characteristics,” in Proc. IEEEInt. Electron Devices Meeting (IEDM), Dec. 2011, pp. 3.6.1–3.6.4.

[15] X. Xue et al., “A 0.13 μm 8 Mb logic-based Cux SiyO ReRAM withself-adaptive operation for yield enhancement and power reduction,”IEEE J. Solid-State Circuits, vol. 48, no. 5, pp. 1315–1322,May 2013.