Session 17 Overview: TX and RX Building Blocks

2017 IEEE International Solid-State Circuits Conference
978-1-5090-3758-2/17/$31.00 ©2017 IEEE
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / OVERVIEW
RF SUBCOMMITTEE

Subcommittee Chair: Piet Wambacq, imec, Belgium
Session Chair: Brian Ginsburg, Texas Instruments, Dallas, TX
Session Co-Chair: Payam Heydari, University of California, Irvine, Irvine, CA

RF transceivers enable everything from the plethora of connectivity on mobile devices to emerging applications in fixed point-to-point links, imaging, and sensing. Advances across all aspects of signal generation, modulation, power amplification, and radiation are required to reduce power dissipation while increasing performance. The papers in this session highlight several advances in the state of the art in RF, mm-wave, and THz domains. Antenna interface improvements include a circulator for TX/RX isolation, a digitally assisted CMOS front-end module, a polar PA with intrinsic nonlinearity compensation, electrical-balance-duplexer impedance detection, and radiator-embedded power combining. Wideband systems enable integrated spectroscopy and parallel multi-tone generation. A 318-to-370GHz array embeds beam steering with independent frequency tuning. High data-rate communication is demonstrated with a 5m 130GHz 12.5Gb/s link and a 105Gb/s 300GHz transmitter.

1:30 PM 17.1 A Digitally Assisted CMOS WiFi 802.11ac/11ax Front-End Module Achieving 12% PA Efficiency at 20dBm Output Power with 160MHz 256-QAM OFDM Signal
Y. H. Chee, MediaTek, San Jose, CA
In Paper 17.1, MediaTek describes a digitally assisted CMOS WiFi 802.11ac/11ax front-end module. The TX path delivers 20dBm output power with -35dB EVM and 12% PAE. The RX path achieves a 2.6dB NF and a 12.5dB gain while consuming 9mA from a 2.5V supply.
2:00 PM 17.2 A 28GHz Magnetic-Free Non-Reciprocal Passive CMOS Circulator Based on Spatio-Temporal Conductance Modulation
T. Dinc, Columbia University, New York, NY
In Paper 17.2, Columbia University describes a magnetic-free non-reciprocal passive CMOS circulator. Operating at 28GHz, the circulator achieves 3.2/3.3dB insertion losses, 1dB-insertion-loss bandwidth of 4.6GHz, >21dBm P1dB and ~4dB NF using sub-harmonic spatio-temporal conductance modulation.
2:30 PM 17.3 A 60GHz On-Chip Linear Radiator with Single-Element 27.9dBm Psat and 33.1dBm Peak EIRP Using Multifeed Antenna for Direct On-Antenna Power Combining
T. Chi, Georgia Institute of Technology, Atlanta, GA
In Paper 17.3, Georgia Institute of Technology presents a 60GHz on-chip linear radiator. Using a multifeed antenna with direct on-antenna power combining, the transmitter generates 27.9dBm Psat and 33.1dBm peak EIRP with 23.4% PAE at 59GHz. Without pre-distortion, it achieves -21.9dB EVM with 20.2dBm Pavg for a 4Gb/s 16-QAM signal.

ISSCC 2017 / February 7, 2017 / 1:30 PM
2:45 PM 17.4 A Sub-mW Antenna-Impedance Detection Using Electrical Balance for Single-Step On-Chip Tunable Matching in Wearable/Implantable Applications
C. Lu, Holst Centre / imec, Eindhoven, The Netherlands
In Paper 17.4, imec describes antenna impedance detection for single-step on-chip tunable matching. Consuming only 0.83mW, the Cartesian detection embedded within an electrical balance duplexer demonstrates an accuracy of 18° in phase and 0.1 in magnitude for the antenna reflection coefficient.
3:15 PM 17.5 An Intrinsically Linear Wideband Digital Polar PA Featuring AM-AM and AM-PM Corrections Through Nonlinear Sizing, Overdrive-Voltage Control, and Multiphase RF Clocking
M. Hashemi, Delft University of Technology, Delft, The Netherlands
In Paper 17.5, Delft University of Technology and Ampleon show a wideband, intrinsically linear digital PA. Nonlinear sizing, overdrive-voltage control, and multiphase RF clocking correct AM-AM and AM-PM nonlinearities to achieve -40dBc ACPR and -31dB EVM for a 40MHz 64-QAM signal with a peak PAE of 28.8%.
3:45 PM 17.6 Rapid and Energy-Efficient Molecular Sensing Using Dual mm-Wave Combs in 65nm CMOS: A 220-to-320GHz Spectrometer with 5.2mW Radiated Power and 14.6-to-19.5dB Noise Figure
C. Wang, Massachusetts Institute of Technology, Cambridge, MA
In Paper 17.6, the Massachusetts Institute of Technology describes a rapid and energy-efficient molecular sensing 220-to-320GHz spectrometer in 65nm CMOS. Dual THz combs produce 5.2mW radiated power and have a noise figure of 14.6 to 19.5dB.
4:15 PM 17.7 A Packaged 90-to-300GHz Transmitter and 115-to-325GHz Coherent Receiver in CMOS for Full-Band Continuous-Wave mm-Wave Hyperspectral Imaging
T. Chi, Georgia Institute of Technology, Atlanta, GA
In Paper 17.7, Georgia Institute of Technology presents a transmitter and a coherent receiver for full-band mm-wave hyperspectral imaging, flip-chip integrated with wideband Vivaldi antennas. A distributed quadrupler has ±2dB Pout variation over 90 to 300GHz. The sub-harmonic receiver reaches -115dBm sensitivity and operates over 115 to 325GHz.
4:30 PM 17.8 A Compact 130GHz Fully Packaged Point-to-Point Wireless System with 3D-Printed 26dBi Lens Antenna Achieving 12.5Gb/s at 1.55pJ/b/m
M. Sawaby, Stanford University, Stanford, CA
In Paper 17.8, Stanford University, University of Nice, STMicroelectronics, Instituto de Telecomunicações, ISCTE-IUL, and University of Lisbon demonstrate a 130GHz point-to-point wireless system. Using fully packaged silicon ICs integrated with a 3D-printed antenna, the system establishes a 5m wireless link with 32dBm EIRP, 12.5Gb/s at BER <10^-6, and energy efficiency of <8pJ/b.
4:45 PM 17.9 A 105Gb/s 300GHz CMOS Transmitter
K. Takano, Hiroshima University, Higashihiroshima, Japan
In Paper 17.9, Hiroshima University, NICT, and Panasonic Corporation describe a transmitter in 40nm CMOS operating at 300GHz. Introducing an image/LO suppression technique, it demonstrates a 105Gb/s data rate over a single 32-QAM channel.
5:00 PM 17.10 A 318-to-370GHz Standing-Wave 2D Phased Array in 0.13μm BiCMOS
H. Jalili, University of California, Davis, Davis, CA
In Paper 17.10, the University of California, Davis, presents a 318-to-370GHz 2D phased array in 0.13μm BiCMOS. Adding a travelling wave to a standing-wave radiator allows beam steering over 128°/53° in the E/H planes, independent of the 15.1% frequency tuning.
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.1
17.1 A Digitally Assisted CMOS WiFi 802.11ac/11ax Front-End Module Achieving 12% PA Efficiency at 20dBm Output Power with 160MHz 256-QAM OFDM Signal
Yuen Hui Chee¹, Fatih Golcuk¹, Toru Matsuura¹, Christopher Beale², James F. Wang¹, Osama Shanaa¹
¹MediaTek, San Jose, CA, ²MediaTek, Kent, United Kingdom
Front-end modules (FEM) typically employ expensive III-V or SiGe technologies to provide relatively higher PA output power and lower LNA noise figure (NF) for larger distance coverage compared to what can be achieved in a CMOS transceiver SoC [1]. The WiFi FEM is typically designed as a standalone entity using linear and inefficient PA topologies, such as Class-A/AB, resulting in an FEM not taking advantage of the full capability of the transceiver SoC. Furthermore, due to the stringent EVM requirement, almost 10dB back-off from Psat is required, resulting in a poor PAE of <7% at +20dBm Pout for the conventional Class-A/AB topologies regardless of device technology [1-3]. The CMOS FEM in Fig. 17.1.1 addresses the above issues and achieves performance comparable to that of GaAs/SiGe FEM but offers higher efficiency while using the full capability of the transceiver to enhance its performance. The proposed FEM integrates a PA, an LNA, a T/R switch, a transmit signal-strength indicator (TSSI) and an RF digital pre-distortion (DPD) calibration loopback path. It has two ICs integrated inside the same package. The PA, the LNA, and the DPD-loopback path are implemented on a 55nm bulk CMOS IC, while the T/R switch, PA output balun, and TSSI are integrated on a 0.18μm CMOS SOI IC.
The proposed FEM employs a Doherty PA topology to improve its efficiency. The Doherty PA combines a main-path amplifier and an auxiliary-path amplifier via a wideband impedance-inverter network as shown in Fig. 17.1.1. The impedance-inverter network modulates the load of the main amplifier such that it operates at a constant output swing from the 9dB-back-off point to the Psat of the Doherty PA for best efficiency. On the other hand, the auxiliary amplifier is designed to mainly deliver the signal level from the 9dB-back-off point to the Psat so that, when combined with the main-amplifier output, the OFDM signal is constructed with best overall efficiency. The impedance-inverter network causes a 90° phase shift between the main and auxiliary paths, and this is compensated by an input 0°/90° λ/4 coupler, which is implemented using lumped coupled inductors as shown in Fig. 17.1.2. The coupler is terminated with a complex termination impedance, Ziso, to achieve good gain and phase balance without any additional matching network between the coupler and the PA driver. The non-linear input capacitance of the PA can perturb the 0°/90° phase relationship at the PA input as a function of input power, resulting in a power loss. This is mitigated by adding a PMOS compensation capacitor at the PA input to offset the non-linear capacitance of the PA itself as shown in Fig. 17.1.2.
The WiFi 11ac/11ax band extends from 4.9 to 5.9GHz, which poses a challenge in sustaining good PA performance over such a wide band. This problem has traditionally been combated by using relatively low-Q networks for the resonance and matching circuits to cover the entire band, which reduces both the PA gain and its efficiency. In this design, however, programmable higher-Q resonant tank circuits with a relatively narrow bandwidth are used for the PA. The resonance center frequency is digitally programmable by the transceiver via a high-speed 3-wire interface to the FEM during channel switching.
The main and auxiliary PAs and PA drivers are differential common-source amplifiers with 3.3V I/O cascode devices. On-chip input and output baluns are used to convert the differential signals to single-ended signals and provide matching to reduce pin count and external components.
In order to take full advantage of the transceiver capability, DPD is applied to the PA, which requires a sensing RF-loopback path to feed the PA characteristics back to the transceiver and its DPD engine. A polynomial-based memory DPD is used to correct for the PA nonlinearity. An OFDM signal is used during PA DPD training and proper DPD polynomial coefficients are calculated in the digital domain accordingly. The intrinsic EVM of the loopback path itself, due to noise and distortion of its own circuitry, should be reasonably better than that of the PA targeting the WiFi 11ac/11ax EVM of -35dB/-38dB for accurate DPD training. To relax the linearity requirements of its circuits, the DPD-loopback path employs both a fixed 20dB capacitor attenuator to reduce the high PA voltage swing and a variable 20dB attenuator to accommodate the dynamic range of the receiver circuits in the transceiver SoC as shown in Fig. 17.1.1. The PA has a power gain of 26dB and with the targeted average output power of 20dBm, the average power levels at the PA input and at the DPD-loopback-path output during DPD PA training are -6dBm and -20dBm, respectively.
Because of the finite isolation between the PA input and the RF loopback-path output, due to coupling inside the FEM package, the transceiver SoC package, and the PCB routing between them, the integrity of the DPD-loopback path can be severely limited as the signal it carries back to the transceiver may be corrupted by the coupled signals from the forward transmit path. With such an arrangement, system-level simulation with MCS9/MCS11 signals shows that an isolation between the forward transmit path and the DPD-loopback path of at least 40dB is required, which is quite difficult to reliably achieve at 5.9GHz with a small QFN package and different PCB layout routing to the transceiver SoC. To solve this problem, the loopback path uses a phase-alternator block, as shown in Fig. 17.1.1. The phase alternator consists of a pair of differential switches that can either pass through the input signal (Ph=0) or invert the input signal (Ph=1), as shown in Fig. 17.1.3. Any unwanted coupled signal to the RF-loopback path after the phase alternator can be cancelled by performing two sensing operations with different phase alternations and then subtracting them during digital post-processing.
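The two-capture subtraction described above can be illustrated with a minimal Python sketch. It assumes, beyond what the text states, that the same training waveform is replayed for both sensing operations and that the unwanted coupling enters after the phase alternator and is identical in both captures. The function names and sample values are hypothetical and are not taken from the FEM or the transceiver.

```python
import cmath


def loopback_capture(pa_samples, coupled_samples, ph):
    """Model one sensing operation of the DPD-loopback path.

    ph=0 passes the PA sample, ph=1 inverts it. Coupling that enters
    after the phase alternator is added unchanged (an assumption).
    """
    sign = -1.0 if ph else 1.0
    return [sign * s + c for s, c in zip(pa_samples, coupled_samples)]


def cancel_coupling(capture_ph0, capture_ph1):
    """Subtract the two captures: (s + c) - (-s + c) = 2s, then halve."""
    return [(a - b) / 2.0 for a, b in zip(capture_ph0, capture_ph1)]


# Hypothetical complex-baseband samples of the attenuated PA output.
pa = [0.10 * cmath.exp(1j * 0.3 * n) for n in range(8)]
# Hypothetical leakage from the forward transmit path.
leak = [0.05 * cmath.exp(1j * (0.3 * n + 1.0)) for n in range(8)]

recovered = cancel_coupling(loopback_capture(pa, leak, 0),
                            loopback_capture(pa, leak, 1))
worst_error = max(abs(r - s) for r, s in zip(recovered, pa))
print(f"worst-case residual after subtraction: {worst_error:.2e}")
```

In this idealized model the leakage cancels to within floating-point rounding. Any coupling that arrives before the phase alternator would be inverted together with the wanted signal and would survive the subtraction.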
The single-ended dual-gain LNA is realized using a cascode common-source topology with inductive degeneration and a tuned load with a bypass mode. The LNA input and PA output are connected to the antenna via an asymmetric CMOS SOI T/R switch shown in Fig. 17.1.3. The switch device sizes are selected to favor the TX insertion loss. The postlayout-simulated insertion loss of this switch is 0.4dB/0.6dB for transmitter/receiver path, respectively. A stacked switch structure is used to divide the large output swing among the stack devices for reliability. The FEM also features a TSSI with similar architecture as in [4] but with an improved sensing method for better accuracy.
The proposed FEM is fabricated and measured for specification compliance. The measurement in Fig. 17.1.4 shows the AM-AM of the PA measured at the PA RF output as well as at the RF DPD-loopback-path output. The intrinsic AM-AM obtained via the DPD-loopback path does not resemble the PA characteristics due to unwanted coupling as mentioned earlier, resulting in an incorrect DPD (AM-PM shows similar behavior). Using the phase alternator, two separate measurements are taken and the coupling is removed by subtracting the two measurements in the digital domain. There is still a small residual AM-AM (and AM-PM) peak error of 0.12dB (and 0.85deg) between the reconstructed behavior measured at the DPD-loopback output and the measurement at the antenna port, which results in a ~-45dB EVM floor and is sufficient for the targeted PA EVM with DPD. This finite error is caused by small unwanted signal coupling inside the FEM to the input of the phase-alternator circuit, which cannot be canceled out with phase alternation. Figure 17.1.5 shows the measured PA EVM and PAE vs. Pout at 5.8GHz with DPD. The PA+T/R switch achieves 20dBm/18dBm output power with a 12%/10% PAE and an EVM of -35dB/-38dB while passing the IEEE spectral mask with at least 2dB/4dB margin for 160MHz 802.11ac-MCS9/802.11ax-MCS11 signals, respectively. The measured efficiency includes all port-to-port PA-path circuits such as PA drivers, T/R switch loss, biasing-circuit current, integrated matching loss, etc. The measured Psat of the PA is 29dBm and it varies by <0.8dB over the entire 4.9-to-5.9GHz band. At 5.5GHz, the LNA+T/R switch achieves 2.6dB NF, -6dBm IP1dB, and 12.5dB power gain with 9mA supply current from a 2.5V supply. Under the bypass mode, the LNA+T/R switch provides -12dB power gain, 13dB NF and +7dBm IP1dB while consuming 0.5mA. The FEM is powered from a single 3.3V supply and requires no external matching components.
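To put the reported efficiency in perspective, the following Python sketch estimates the DC power implied by the quoted PAE values. It assumes the common definition PAE = (Pout - Pin)/Pdc, that the 26dB power gain quoted earlier holds at both output levels, and that all current is drawn from the 3.3V supply. The derived DC figures are rough estimates and are not numbers reported in the paper.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def dc_power_mw(pout_dbm, gain_db, pae):
    """DC power implied by PAE = (Pout - Pin) / Pdc (assumed definition)."""
    pout = dbm_to_mw(pout_dbm)
    pin = dbm_to_mw(pout_dbm - gain_db)
    return (pout - pin) / pae


# Operating points quoted in the text: 20dBm at 12% PAE, 18dBm at 10% PAE.
for pout_dbm, pae in ((20.0, 0.12), (18.0, 0.10)):
    pdc = dc_power_mw(pout_dbm, 26.0, pae)
    print(f"{pout_dbm:.0f}dBm at {pae:.0%} PAE -> about {pdc:.0f}mW DC, "
          f"about {pdc / 3.3:.0f}mA from 3.3V")
```

Because the input power is 26dB below the output, it contributes well under 1% to the result, so the estimate is dominated by Pout divided by PAE.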
The die micrograph of the two ICs is shown in Fig. 17.1.7. The 0.18μm CMOS SOI IC die area is 2000μm×575μm, while the 55nm bulk CMOS IC die area is 1872μm×759μm. The FEM is housed in a 20-pin 5×5 QFN package. It passes 2000V/200V/500V HBM/MM/CDM ESD testing and reliability/stability tests. Figure 17.1.6 shows the comparison table with previously published work. The proposed CMOS FEM delivers similar output power as the state-of-the-art III-V/SiGe-based FEM while providing higher efficiency and supporting 160MHz modulation, thanks to the proposed topologies, design techniques, and digital assistance, which enhanced performance.
References:
[1] RF Micro Devices Inc., "RFFM8505 FEM Datasheet," Rev. DS131106, 2013, www.rfmd.com/store/downloads/dl/file/id/28884/rffm8505_data_sheet.pdf. Accessed Sept. 2016.
[2] C.-W. Huang, et al., "A Highly Integrated Single Chip 5-6 GHz Front-end IC Based on SiGe BiCMOS that Enhances 802.11ac WLAN Radio Front-End Designs," IEEE RFIC, pp. 227-230, May 2015.
[3] J. Park, et al., "A Highly Linear Dual-Band Mixed-Mode Polar Power Amplifier in CMOS with an Ultra-Compact Output Network," IEEE JSSC, vol. 51, no. 8, pp. 1756-1770, Aug. 2016.
[4] Y.-H. Chung, et al., "Dual-band Integrated Wi-Fi PAs with Load-Line Adjustment and Phase Compensated Power Detector," IEEE RFIC, pp. 223-226, May 2015.
Figure 17.1.1: Proposed CMOS FEM block diagram.
Figure 17.1.2: 0°/90° coupler schematic and main/aux-PA path nonlinear input capacitance compensation for efficiency enhancement.
Figure 17.1.3: DPD-loopback-path phase alternator and asymmetric SOI T/R-switch schematics.
Figure 17.1.4: Measured AM-AM characteristics at antenna (ANT) port and DPD-loopback port, and reconstructed PA+DPD loopback AM-AM.
Figure 17.1.5: Measured PA+T/R Switch PAE, dynamic EVM vs Pout, and spectral masks for WiFi 11ac VHT160 MCS9/11ax HE160 MCS11 signals with DPD.
Figure 17.1.6: Comparison table with prior art.
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.2
17.2 A 28GHz Magnetic-Free Non-Reciprocal Passive CMOS Circulator Based on Spatio-Temporal Conductance Modulation
Tolga Dinc, Harish Krishnaswamy
Columbia University, New York, NY
A significant challenge for silicon-based mm-wave systems is a low-loss shared-antenna (ANT) interface with high linearity, isolation (ISO) and bandwidth (BW). Shared ANT interfaces with simultaneous transmit and receive capability are critical for mm-wave 5G base stations that need to simultaneously communicate with multiple users, FMCW-based radars, and emerging full-duplex systems [1].
Lorentz reciprocity is a fundamental property of any linear and time-invariant medium characterized by symmetric permittivity and permeability tensors. A three-port passive network cannot be reciprocal, lossless, and matched at all ports at the same time. As a result, a reciprocal, passive, matched shared-ANT interface, such as an electrical-balance duplexer (EBD) [2], has at least a 3dB theoretical loss (typically around 4dB at RF/mm-wave). This theoretical loss can be avoided by breaking Lorentz reciprocity. Conventional non-reciprocal circulators rely on ferromagnetic materials, but they are bulky, expensive, and not compatible with CMOS. Active devices are inherently non-reciprocal, but active circulators are severely limited in their linearity and noise performance [3]. In recent years, there has been progress on achieving non-reciprocity and building circulators without magnetic materials by exploiting time-variance, specifically spatio-temporal permittivity modulation at low-RF frequencies [4,5]. In a circuit implementation, the permittivity modulation is achieved using varactors, and in general has limited modulation index on semiconductor substrates (Cmax/Cmin is usually around 2-4), either resulting in large form-factors [4] or high losses [5]. In addition, these techniques are yet to be demonstrated on silicon and are also not suitable for mm-wave due to the poor varactor quality factor. On the other hand, conductivity can be easily modulated on a semiconductor substrate using transistors as passive switches. Conductivity modulation has a very high modulation index (CMOS transistor ROFF/RON can be as high as 10^3 to 10^5), resulting in small form factor and very low loss as demonstrated in the RF CMOS circulator in [6,7] using an N-path filter-based implementation. However, N-path filters are not amenable to mm-wave due to stringent clocking requirements and transistor parasitics.
We present a near-28GHz fully integrated circulator in 45nm SOI CMOS, demonstrating magnetic-free passive non-reciprocity on silicon at mm-wave. Millimeter-wave operation is enabled by the concept of spatio-temporal conductance modulation, which breaks phase reciprocity. The spatio-temporal conductance modulation features the following advantages over the phase-shifted N-path filter concept of [6,7]: (i) modulation or switching is performed at a frequency much lower than the operation frequency (one third in this case), enabling operation at mm-wave, (ii) four 50% duty-cycle I/Q phases are required, as opposed to a large number of low-duty-cycle clock phases typically used in N-path filters, again easing mm-wave operation, and (iii) switching is performed across transmission-line delays, as opposed to capacitors as in N-path filters, enhancing BW. The 25GHz circulator achieves 3.3/3.2dB TX-to-ANT/ANT-to-RX insertion losses (IL), respectively, 18.3 to 21.2dB of TX-to-RX isolation (ISO) over the 4.6GHz 1dB IL BW, 3.3 to 4.4dB ANT-to-RX NF, and >21dBm TX-to-ANT/ANT-to-RX input P1dB.
The spatio-temporal conductance modulation concept consists of two sets of I/Q switches implemented as Gilbert quads on either end of I/Q transmission-line delays (Fig. 17.2.1). The operation of this structure can be explained with a simplified circuit-theoretic mixing analysis. In the forward direction, the first set of I/Q switches commutates the signal at a frequency (ωm) lower than the operating frequency (ωin), thus creating two mixing products at ωin+ωm and ωin-ωm. The mixing products in the I/Q paths experience phase shifts of -φ1 and -φ2 at ωin-ωm and ωin+ωm, respectively, as they flow through the transmission lines. On the other end, the second set of I/Q switches commutates the mixing products at ωm but with a staggered phase shift of φ, generating mixing products at ωin, ωin+2ωm, and ωin-2ωm. The signals at ωin-2ωm and ωin+2ωm are 180° out of phase and cancel out. On the other hand, if 2φ = φ1-φ2 = -π (or equivalently, 2ωmTd = π, where Td is the delay of the transmission line), the mixing products at ωin add up constructively into a single signal with perfect lossless transmission and a phase shift of φ-φ1, or -90°-φ1. A similar analysis in the reverse direction shows lossless transmission but a non-reciprocal phase shift of -φ-φ1, or +90°-φ1. In other words, the spatio-temporal modulation network provides a non-reciprocal phase shift with a 180° difference between forward and reverse directions (and lossless reciprocal magnitude response).
We choose φ1 = (ωin-ωm)Td = 180°, giving insertion phases of +90° and -90° in the forward and reverse directions, respectively. Combining this equation with 2ωmTd = π, we obtain ωm = ωin/3 (i.e., the third subharmonic). A circulator can now be constructed by exploiting constructive and destructive interference between the non-reciprocal phase element and a reciprocal phase element in forward and reverse directions (Fig. 17.2.1). Similar to [6,7], a 3λ/4 transmission-line loop is wrapped around the non-reciprocal phase component so that signals can circulate in only one direction (-270°-90° results in constructive interference in one direction and -270°+90° results in destructive interference in the other). As mentioned, the use of a lower modulation frequency as well as 50% duty-cycle I/Q clocks enables operation at mm-wave. Further, the use of transmission lines in both the reciprocal and non-reciprocal paths leads to better frequency-response matching between them, resulting in superior ISO and IL BWs than in [6,7].
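The design conditions above can be checked numerically. The following Python sketch takes only the two stated conditions (2ωmTd = π and φ1 = (ωin-ωm)Td = 180°) and evaluates the resulting modulation frequency, line delay, and insertion phases. The function and key names are hypothetical, and the line delay Td is a derived quantity that the text does not quote.

```python
def wrap_deg(angle_deg):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def staggered_commutation_design(f_in_hz):
    """Evaluate the phase conditions for a given operating frequency."""
    f_m = f_in_hz / 3.0            # third subharmonic, w_m = w_in / 3
    t_d = 1.0 / (4.0 * f_m)        # from 2 * w_m * Td = pi
    phi1 = 360.0 * (f_in_hz - f_m) * t_d   # phase at w_in - w_m, degrees
    phi2 = 360.0 * (f_in_hz + f_m) * t_d   # phase at w_in + w_m, degrees
    phi = (phi1 - phi2) / 2.0              # staggered clock phase, 2*phi = phi1 - phi2
    return {
        "f_m_hz": f_m,
        "t_d_s": t_d,
        "phi1_deg": phi1,
        "phi_deg": phi,
        "forward_deg": wrap_deg(phi - phi1),
        "reverse_deg": wrap_deg(-phi - phi1),
    }


design = staggered_commutation_design(25e9)
for name, value in design.items():
    print(f"{name}: {value:.6g}")
```

For a 25GHz operating frequency this reproduces the 8.33GHz clock, a staggered phase of -90°, and insertion phases that differ by 180° between the two directions.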
Here, for 25GHz operation, ωm= ωin/3=8.33GHz. A differential architecture (Fig. 17.2.2) reduces the LO feedthrough and improves power handling. The fully balanced Gilbert quads are designed using 2×16μm/40nm floating-body transistors, improving the power handling further. Artificial transmission lines with inductor Q of 20 are used in the non-reciprocal phase element, which is symmetrically placed between the TX and RX ports so that device parasitics could be absorbed into the λ/8 artificial transmission lines on either side. The LO path consists of fine-phase-tuning varactors, a 2-stage poly-phase filter generating the differential I/Q signals, and self-biased differential inverter chains to generate the 8.33GHz square clocks.
Figures 17.2.3 to 17.2.5 describe the measurements, in which the on-chip baluns are de-embedded through test structures, and a probe, terminated with a broadband 50Ω load, is landed on the third port in each measurement. The small-signal TX-to-ANT and ANT-to-RX ILs are 3.3dB and 3.2dB, respectively, and a broadband TX-to-RX ISO of 18.3 to 21.2dB over 4.6GHz (the 1dB BW of the ILs) is seen, limited by the impedance of the probe and load at the third port (a challenge for all circulators and exacerbated at mm-wave). The TX-to-ANT/ANT-to-RX input P1dBs are >+21.5/+21dBm, respectively (setup limited). The TX-to-ANT and ANT-to-RX IIP3s are ~+20dBm. The noteworthy P1dBs are high relative to the IIP3s due to a graceful large-signal linearity degradation mechanism inherent to the concept that is beyond the scope of this paper. TX-to-RX ISO compresses by 1dB and 3dB at +11.4dBm and +21.45dBm, respectively. This can be partially compensated by an external ANT impedance tuner, which enables a higher initial small-signal ISO of 25dB (still setup limited) and 1dB compression at +12.3dBm. ANT-RX NF is 3.3 to 4.4dB, consistent with the IL and showing negligible degradation due to LO phase noise. When compared with prior art (Fig. 17.2.6), this work is superior to active mm-wave circulators [3] in all metrics (loss, linearity, NF and BW). When compared with a passive EBD [2], this work achieves >1dB overall advantage in the sum of TX-to-ANT and ANT-to-RX ILs while operating at >10× higher frequency, although techniques to improve power handling and integrated tuners/balance networks for ISO are desirable. When compared with the N-path filter-based circulator of [6,7], this work scales to mm-wave and significantly enhances BW. Losses and LO path power can be lowered further with a process with thick upper metals and by using tuned LO buffers, respectively.
Acknowledgments: We acknowledge the DARPA ACT program and NSF EFMA 1641100 for financial support, and GlobalFoundries for fabrication donation.
References:
[1] T. Dinc, et al., "A 60GHz CMOS Full-Duplex Transceiver and Link with Polarization-Based Antenna and RF Cancellation," IEEE JSSC, vol. 51, no. 5, pp. 1125-1140, May 2016.
[2] B. van Liempd, et al., "A +70dBm IIP3 Single-Ended Electrical-Balance Duplexer in 0.18μm SOI CMOS," ISSCC, pp. 1-3, Feb. 2015.
[3] J.-F. Chang, et al., "Design and Analysis of 24-GHz Active Isolator and Quasi-Circulator," IEEE TMTT, vol. 63, no. 8, pp. 2638-2649, Aug. 2015.
[4] S. Qin, et al., "Nonreciprocal Components with Distributedly Modulated Capacitors," IEEE TMTT, vol. 62, no. 10, pp. 2260-2272, Oct. 2014.
[5] N. Estep, et al., "Magnetless Microwave Circulators Based on Spatiotemporally Modulated Rings of Coupled Resonators," IEEE TMTT, vol. 64, no. 2, pp. 502-518, Feb. 2016.
[6] J. Zhou, et al., "Receiver with Integrated Magnetic-free N-Path-Filter-Based Non-Reciprocal Circulator and Baseband Self-Interference Cancellation for Full-Duplex Wireless," ISSCC, pp. 178-180, Feb. 2016.
[7] N. Reiskarimian and H. Krishnaswamy, "Magnetic-Free Non-Reciprocity Based on Staggered Commutation," Nat. Commun., vol. 7, no. 4, Apr. 2016.
Figure 17.2.1: Magnetic-free non-reciprocal passive mm-wave circulator based on spatio-temporal conductance modulation.
Figure 17.2.2: Block and circuit diagram of the 25GHz magnetic-free non-reciprocal passive 45nm SOI CMOS circulator.
Figure 17.2.3: Measured circulator TX-ANT, ANT-RX and TX-RX S-parameters.
Figure 17.2.4: Circulator TX-ANT and ANT-RX large-signal measurements.
Figure 17.2.5: Measured TX-RX large-signal isolation and circulator ANT-RX noise figure.
Figure 17.2.6: Performance summary and comparison.
Figure 17.2.7: Die micrograph of the test chip in 45nm SOI CMOS.
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.3
17.3 A 60GHz On-Chip Linear Radiator with Single-Element 27.9dBm Psat and 33.1dBm Peak EIRP Using Multifeed Antenna for Direct On-Antenna Power Combining
Taiyun Chi, Fei Wang, Sensen Li, Min-Yu Huang, Jong Seok Park, Hua Wang
Georgia Institute of Technology, Atlanta, GA
A major challenge for low-cost silicon-based mm-wave wireless links, e.g., for 5G communication, is to provide large transmitter (Tx) output power (Pout) with high energy efficiency and linearity from a limited supply voltage, so that the high path loss and limited link budget at mm-wave can be compensated. Power combining is often required for high-power mm-wave Tx. The existing power-combining techniques fall mainly into two categories. Passive on-chip/on-package networks can combine Pout from multiple power amplifiers (PAs) and feed a single antenna port [1-4]. However, lossy power combiners and large impedance transformation ratios degrade the total Pout delivered to the antenna and lower the Tx efficiency. Alternatively, spatial power combining using an antenna array increases the total EIRP but at the expense of a large array-panel size. Moreover, a large antenna array often presents an exceedingly narrow (or even pencil-sharp) beamwidth; this complicates the Tx/Rx alignment and is challenging for dynamic and mobile mm-wave applications, such as 5G links. In addition, adding a silicon lens enhances EIRP but increases cost and packaging complexity.
We propose an on-chip linear radiator, implemented as a multifeed antenna (MFA) driven by multiple linear PAs, which achieves very low-loss direct on-antenna power combining and boosts the total Tx Pout with high efficiency, demonstrating the unique advantages of circuit-antenna co-design. Its single-element implementation in a 45nm CMOS SOI process generates +27.9dBm saturated power (Psat) with 23.4% PAE and +33.1dBm peak EIRP at 59GHz. It supports a 4Gb/s 16-QAM signal with -21.9dB EVM and a 4.8Gb/s 64-QAM signal with -25.4dB EVM. The radiator element can be implemented as an array to further boost the Tx EIRP, and its frequency can be scaled to address various mm-wave applications.
The proposed radiator consists of cascaded lumped Wilkinson dividers for input-power distribution, 16 unit PAs, 4-to-1 parallel power combiners at the PA outputs, and a 4-feed on-chip slot antenna for direct on-antenna power combining and radiation (Fig. 17.3.1). The unit PA has 2 stages, comprising a Class-B cascode PA stage with 2V VDD and a Class-AB common-source driver with 1V VDD. Neutralization capacitors improve the differential-mode stability and power gain. The input balun, inter-stage matching, and PA-output-matching networks are designed using transformers for compact layout.
Compared with conventional single-feed antennas, the MFA comprises multiple antenna feeds whose feeding signals collectively synthesize the desired on-antenna voltage/current profiles and thus realize the same far-field pattern as its single-feed counterpart [5-7]. The MFA combines the power from multiple feeds directly on the antenna with high efficiency. Moreover, the MFA realizes on-antenna power combining that down-scales the radiation impedance at each feed without an extra passive network, which greatly eases Tx design. A 4-feed MFA design and 4-to-1 parallel combiners are shown in Fig. 17.3.2 with a zoom-in view of feed1 and feed2. The 4 feeds are driven from two sides of the slot with equal amplitude and 180° phase difference to excite the desired differential E-field for slot radiation. The 3D EM-simulated input impedance at each feed is 13Ω at 60GHz, verifying radiation impedance down-scaling by the MFA. The 4-to-1 parallel combiner scales the feed impedance to 52Ω at each unit-PA output to match the optimum load-pull impedance. The 4-feed slot MFA uses the top aluminum layer as the ground plane and radiates from the chip backside. The high-resistivity substrate of the SOI process results in 5.6dBi simulated peak antenna gain at 60GHz, 74.5% MFA radiation efficiency, 84% (0.7dB) combiner passive efficiency, and thus 62.5% total radiation efficiency including the MFA and the combiners (Fig. 17.3.2). The MFA and the parallel combiners combine Pout from 16 unit PAs on the antenna, achieving an 11.3dB enhancement of the total Pout. Note that the MFA boosts the Tx Pout in a single-antenna footprint (slot size=2.5mm×20μm) without any silicon lens. Unlike array-based spatial combining, the MFA does not enlarge the panel size or reduce the antenna beamwidth.
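The impedance and efficiency figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid, not the authors' design flow; it assumes ideal in-phase parallel combining (each of N paralleled PAs sees N times the shared load), and all variable names are illustrative.

```python
import math


def db(power_ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_ratio)


# Values quoted in the text.
feed_impedance_ohm = 13.0     # EM-simulated impedance at each MFA feed
pas_per_feed = 4              # 4-to-1 parallel combiner per feed
num_unit_pas = 16
mfa_efficiency = 0.745        # MFA radiation efficiency
combiner_efficiency = 0.84    # parallel-combiner passive efficiency
combiner_loss_db = 0.7        # same combiner loss, as quoted in dB

# Assumption: ideal in-phase parallel combining, so each PA sees N x the feed impedance.
unit_pa_load_ohm = feed_impedance_ohm * pas_per_feed

# Cascaded efficiencies multiply.
total_efficiency = mfa_efficiency * combiner_efficiency

# 16-way combining gain minus the combiner loss.
net_combining_gain_db = db(num_unit_pas) - combiner_loss_db

print(f"unit-PA load: {unit_pa_load_ohm:.0f} ohm")
print(f"total radiation efficiency: {100 * total_efficiency:.1f} %")
print(f"net Pout enhancement: {net_combining_gain_db:.1f} dB")
```

Under these assumptions the numbers reproduce the 52Ω unit-PA load, roughly 62.5% total radiation efficiency, and the 11.3dB Pout enhancement stated in the text.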
To evaluate the unit PA, a PA test structure (TS) is implemented in the same process (Fig. 17.3.3). The measured TS gain is 24dB and 22.4dB at 55GHz and 60GHz, respectively. The measurement shows 16.7dBm Psat and 14.6dBm P1dB at 60GHz with 28.3% PAEmax and 22.8% PAE1dB, matching well with the simulations. The PA TS Psat variation is below 0.8dB from 50 to 67GHz.
Next, the radiator IC is flip-chip packaged to a Rogers CLTE-AT PCB, and the mm-wave input is fed by an end-launch connector. A horn antenna and power sensor measure the radiator CW output at far-field (65cm). At the 59GHz center frequency, the radiator achieves 33.1dBm peak EIRP (Fig. 17.3.4). The measured E-/H-plane patterns closely match the EM simulations. The E-/H-plane ripples are due to the finite PCB ground. A full 3D scan is required to directly measure the total radiated power, which is not supported by our measurement setup. Instead, the 3D EM-simulated antenna gain is used to calculate Pout from measured EIRP, yielding 27.9dBm Psat at 59GHz with 23.4% PAEmax. Alternatively, the total Pout is estimated using TS measurements. Considering the 12dB enhancement due to 16-PA power combining and the 0.7dB simulated parallel-combiner loss, the TS-based estimated Psat is 27.8dBm, closely matching the EIRP-based Psat value of 27.9dBm. The 1dB Psat bandwidth is from 56.5 to 61GHz and is limited by the slot antenna. At 59GHz, the P1dB is 25dBm, and the AM-PM nonlinearity measured using a sampling oscilloscope is 5.6°. The radiator performance with complex modulations at different symbol rates is shown in Fig. 17.3.5. The input baseband data is generated by an AWG and is up-converted to 59GHz by an external mixer and an image-rejection filter. Without any pre-distortion, the radiator achieves -21.9dB EVM with 20.2dBm Pavg for a 4Gb/s (1GSym/s) 16-QAM signal, and -25.4dB EVM with 19.3dBm Pavg for a 4.8Gb/s (0.8GSym/s) 64-QAM signal, verifying its linearity in dynamic operation. Compared with the reported 60GHz silicon PAs in Fig. 17.3.6, this work achieves the highest Psat and P1dB with a very competitive PAE and data rate. The linear radiator can be scaled to an array and/or to different frequencies for various mm-wave applications.
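Two of the relationships used in this paragraph are simple enough to state as code: the data rate follows from the symbol rate and the constellation size, and the antenna gain implied at 59GHz is the difference between the measured EIRP and the derived Pout. The sketch below only restates that arithmetic; the function names are illustrative.

```python
import math


def bit_rate_gbps(symbol_rate_gsym, constellation_points):
    """Raw bit rate of an M-QAM signal: symbol rate times log2(M) bits per symbol."""
    return symbol_rate_gsym * math.log2(constellation_points)


def implied_antenna_gain_dbi(eirp_dbm, pout_dbm):
    """Antenna gain implied by EIRP = Pout + G (all quantities in dB units)."""
    return eirp_dbm - pout_dbm


rate_16qam = bit_rate_gbps(1.0, 16)   # 1GSym/s 16-QAM
rate_64qam = bit_rate_gbps(0.8, 64)   # 0.8GSym/s 64-QAM
gain_59ghz = implied_antenna_gain_dbi(33.1, 27.9)

print(f"16-QAM: {rate_16qam:.1f} Gb/s, 64-QAM: {rate_64qam:.1f} Gb/s")
print(f"implied antenna gain at 59GHz: {gain_59ghz:.1f} dBi")
```

The two bit rates come out at 4Gb/s and 4.8Gb/s, matching the quoted figures.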
Acknowledgements: The authors would like to acknowledge GlobalFoundries and the Army Research Office (ARO). The authors also thank members of the Georgia Tech GEMS Group for helpful technical discussions.
References:
[1] C. Chappidi and K. Sengupta, “A Frequency-Reconfigurable Mm-Wave Power Amplifier with Active-Impedance Synthesis in an Asymmetrical Non-Isolated Combiner,” ISSCC, pp. 344–345, Feb. 2016.
[2] A. Larie, et al., “A 1.2V 20 dBm 60 GHz Power Amplifier with 32.4 dB Gain and 20% Peak PAE in 65nm CMOS,” ESSCIRC, pp. 175–178, Sept. 2014.
[3] R. Bhat, et al., “Large-Scale Power-Combining and Mixed-Signal Linearization Architectures for Watt-Class mmWave CMOS Power Amplifiers,” IEEE TMTT, vol. 63, no. 2, pp. 703–718, Feb. 2015.
[4] D. Zhao and P. Reynaert, “An E-Band Power Amplifier with Broadband Parallel-Series Power Combiner in 40-nm CMOS,” IEEE TMTT, vol. 63, no. 2, pp. 683–690, Feb. 2015.
[5] R. King and T. Wu, “The Cylindrical Antenna with Arbitrary Driving Point,” IEEE Trans. Antennas Propag., vol. 13, no. 5, pp. 710–718, Sept. 1965.
[6] S. Bowers and A. Hajimiri, “Multi-Port Driven Radiators,” IEEE TMTT, vol. 61, no. 12, pp. 4428–4441, Dec. 2013.
[7] S. Li, et al., “A Multi-Feed Antenna for Antenna-Level Power Combining,” IEEE APS/URSI, June 2016.
[8] K. Khalaf, et al., “Digitally Modulated CMOS Polar Transmitters for Highly-Efficient mm-Wave Wireless Communication,” IEEE JSSC, vol. 51, no. 7, pp. 1579–1592, July 2016.
ISSCC 2017 / February 7, 2017 / 2:30 PM
Figure 17.3.1: Circuit schematic of the 60GHz on-chip linear radiator element with a multifeed antenna (MFA) implemented as a 4-feed on-chip slot antenna for direct on-antenna power combining.
Figure 17.3.2: EM-simulated synthesized E-field distribution by the 4 feeds on the slot antenna, load impedance for the unit PA (ZL), peak antenna gain and radiation efficiency versus frequency.
Figure 17.3.3: Die micrograph of the unit-PA test structure (TS), simulated/measured small-signal S-parameters and large-signal performance versus frequency of the TS.
Figure 17.3.5: Measured EVM, constellations, and spectra with complex modulations at 59GHz.
Figure 17.3.6: Comparison with state-of-the-art silicon mm-wave PAs.
Figure 17.3.4: Measured large-signal performance and radiation patterns of the linear radiator IC.
Figure 17.3.7: Pictures of the flip-chip packaged PCB, die micrograph, and an X-ray image to verify the alignment between the flip-chip bumps and the PCB traces.
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.4
17.4 A Sub-mW Antenna-Impedance Detection Using Electrical Balance for Single-Step On-Chip Tunable Matching in Wearable/Implantable Applications
Chuang Lu, Ao Ba, Yao-Hong Liu, Xiaoyan Wang, Christian Bachmann, Kathleen Philips
Holst Centre / imec, Eindhoven, The Netherlands
Wearable/implantable devices, e.g., heart-rate-monitor straps and implanted wireless sensors, need to be ultra-low-power (ULP), compact, and also robust against the proximity effect, which can significantly degrade the antenna and front-end performance and hence battery lifetime. A fully integrated adaptive front-end with a tunable matching network (TMN) using low-power and fast impedance detection is highly desirable for robust and efficient operation.
Impedance-mismatch detection is crucial in such tuning systems in terms of tuning speed and power consumption. Several impedance detection techniques have been presented in recent prior art [1-3]. All of them have limited detection precision, and only detect the mismatched impedance in a certain direction [1,2] or range [3]. Time-consuming optimization methods (i.e., exhaustive search or successive approximation) are therefore required. So far, no exact on-chip impedance detection has been demonstrated. In addition, the detection methods in [1,3] consume approximately 30mW due to the complexity of the detection circuits, which is not suitable for ULP applications with only a sub-mW power budget for the detection circuit. Moreover, an external bulky directional coupler [1] or off-chip tuner [3] for the detection or tuning is not allowed in the targeted applications because of the form-factor limitation. In this work, a fully integrated low-power impedance-detection technique is presented, featuring detection precision improved by an impedance calibration, which enables fast single-step matching-network tuning.
The concept of the proposed detection technique using a hybrid transformer (XFM) is shown in Fig. 17.4.1. A hybrid-XFM-based duplexer was proposed in [4] to achieve a high TX-RX isolation by satisfying an electrical balance. In this work, the hybrid XFM is utilized in a different configuration for the impedance detection. The actual impedance seen from the XFM (i.e., Z’Ant or Γ’Ant) is compared to the desired reference impedance (ZOpt), and the amplitude and phase of the detected “leakage” signal (PDet) are used to determine the antenna impedance. As described by the left equation in Fig. 17.4.1, when ZOpt of 50Ω is loaded, the phase difference between PDet and PPA reveals the phase of Γ’Ant, and the ratio of the amplitude of PDet over PPA is the amplitude of Γ’Ant multiplied by a constant depending on r, which is the aspect ratio of the XFM.
The impedance detection is done by an IQ down-conversion, as shown in the right part of Fig. 17.4.1, which preserves both amplitude and phase information. The IQ mixers are switched by the same input signal as the PA. Note that the phase information of the IQ-mixer outputs (i.e., atan(VDet,Q/VDet,I)) includes not only the phase of Γ’Ant, but also phase offsets (Φoffset) introduced by the PA, the XFM, IQ generation, and the buffer. To improve the precision of ΓAnt detection, an impedance calibration is necessary to de-embed: (1) the phase offset Φoffset; (2) the absolute gain of the detection path; and (3) the impedance transformation of the 50Ω TMN (i.e., from ΓAnt to Γ’Ant). The impedance calibration is done by switching to a reference impedance with a known ΓCal (e.g., ΓCal of 0.5, or ZCal of 150Ω, is used in this design) at the antenna port. As shown in the second equation in Fig. 17.4.1, the ratio between the known ΓCal and the corresponding detected IQ levels (VDet,I|ΓCal + j·VDet,Q|ΓCal) is used to derive the actual ΓDet.
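The calibration step described above amounts to learning one complex scale factor from a known load and applying it to later readings. The following is a minimal sketch under a simplifying assumption that is not taken from the paper: the detected IQ voltage is modeled as strictly proportional to the reflection coefficient, with the path gain, phase offset, and TMN transformation lumped into a single complex constant. The `toy_detector` function and its gain and offset values are hypothetical.

```python
import cmath
import math


def reflection_coefficient(z_ohm, z0_ohm=50.0):
    """Reflection coefficient of an impedance relative to z0."""
    return (z_ohm - z0_ohm) / (z_ohm + z0_ohm)


def calibrated_gamma(v_det, v_det_cal, gamma_cal):
    """Scale a raw IQ reading (I + jQ) by the ratio learned from the reference load."""
    return gamma_cal * v_det / v_det_cal


def toy_detector(gamma, path_gain=0.3, phase_offset_deg=190.0):
    """Hypothetical detection path: unknown complex gain applied to gamma."""
    return gamma * cmath.rect(path_gain, math.radians(phase_offset_deg))


# Calibration with the 150 ohm reference load mentioned in the text (gamma = 0.5).
gamma_cal = reflection_coefficient(150.0)
v_cal = toy_detector(gamma_cal)

# Detect an arbitrary (illustrative) antenna reflection coefficient.
gamma_true = cmath.rect(0.3, math.radians(60.0))
v_raw = toy_detector(gamma_true)
gamma_det = calibrated_gamma(v_raw, v_cal, gamma_cal)

print(f"raw phase: {math.degrees(cmath.phase(v_raw)):.1f} deg")
print(f"calibrated: |gamma|={abs(gamma_det):.3f}, "
      f"phase={math.degrees(cmath.phase(gamma_det)):.1f} deg")
```

In this idealized model a single calibration load removes the offset and gain exactly; a real detector would retain residual errors from nonlinearity and noise.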
Figure 17.4.2 shows the implemented top-level diagram. The balun for the differential PA is reused for the hybrid XFM, with an extra tap to the detection circuits. Compared to the detector using weak coupling in [3], the hybrid XFM is more reliable, because its behavior is better controlled and less vulnerable to undesired coupling (e.g., EM coupling from the LC-tank of an oscillator). The XFM has an aspect ratio of r = 0.16 in order to reduce power loss in the reference load, and a turns ratio of 1:1. A TMN with a two-stage PI network is implemented with a tuning range to cover |ΓAnt| of up to 0.5. The tuning capability is provided by the shunt switchable capacitor banks shown in Fig. 17.4.2, which have lower loss and better linearity than series tunable capacitors. The TMN is characterized to map a setting for the optimum PA and LNA performance to each of the ΓAnt values. Note that the passive components in the TMN are relatively insensitive to temperature and supply, so frequent impedance calibration is not necessary. The TMN is mainly sensitive to variations of the capacitors, which can be characterized in practice and applied to the default look-up table (LUT). Furthermore, switches SWDET, SWPA and SWRX select between a detection mode, a PA mode (hybrid XFM as a normal balun), and an RX mode.
To verify the detection technique, it is integrated with the TMN and a low-power front-end module, including a 2.4GHz differential Class-D PA and a sliding-IF RX. Note that the matching network is also necessary to suppress the harmonics from the switching PA. To share the TMN, the same optimum load/source impedance is designed for the PA/LNA, and a fixed matching network at the LNA input is used to match to a higher impedance. Since a conventional quadrature-LO generation at the carrier frequency is too power hungry (a few mW) for just impedance detection and is not readily available from the sliding-IF structure, in this work, a dedicated ultra-low-power digital-to-time-converter (DTC)-based quadrature-LO generation with a relaxed noise performance is proposed, as shown in Fig. 17.4.3. To ensure a 90° delay, a calibration scheme is further proposed that compares the outputs of the XOR and XNOR operations on the IQ signals. The IQ generation and calibration are verified by measurements with an IQ mismatch of only 1.4°. Note that this calibration is robust against PVT variations, since only a relative comparison between XOR and XNOR is necessary.
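The XOR/XNOR comparison works because, for two 50%-duty square waves, the XOR output is high for a fraction of the period proportional to their delay, and the XNOR output is its complement; the two averages are equal only at a quarter-period delay. The sketch below illustrates this with ideal sampled square waves. The 2° delay step and the exhaustive code search are illustrative assumptions, not the implemented DTC or its calibration loop.

```python
PERIOD = 360  # samples per RF period, i.e. one sample per degree


def square(n):
    """Ideal 50%-duty square wave sampled at integer index n."""
    return 1 if (n % PERIOD) < PERIOD // 2 else 0


def xor_xnor_counts(delay_deg):
    """Count samples per period where XOR and XNOR of I and delayed Q are high."""
    xor_count = sum(square(n) ^ square(n - delay_deg) for n in range(PERIOD))
    return xor_count, PERIOD - xor_count


def calibrate_delay(step_deg=2, max_code=90):
    """Pick the delay code whose XOR and XNOR averages are closest to each other."""
    best_code, best_error = 0, None
    for code in range(max_code + 1):
        xor_count, xnor_count = xor_xnor_counts(code * step_deg)
        error = abs(xor_count - xnor_count)
        if best_error is None or error < best_error:
            best_code, best_error = code, error
    return best_code, best_code * step_deg


code, delay_deg = calibrate_delay()
print(f"selected code {code}, delay {delay_deg} degrees")
```

Because the decision depends only on whether the XOR average is above or below the XNOR average, no absolute reference level is needed, which is consistent with the robustness argument in the text.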
The chip is fabricated in 40nm CMOS, occupying a core area of 1.1×1.1mm² (see Fig. 17.4.7). The power consumption of the detection is only 0.83mW (including 0.6mW for the LO quadrature generation and buffering). A source/load impedance tuner is used to verify the impedance detection and characterize the RX and PA performance in case of impedance variation. Figure 17.4.4 shows the measured I and Q levels for different ΓAnt (upper) and the calibrated phase and amplitude of ΓDet (lower). The raw IQ levels result in a phase error of about 190° and the amplitudes are not directly linked to |ΓAnt|. After calibration, the phase of ΓDet corresponds to the phase of the actual ΓAnt, and is not dependent on |ΓAnt|. The detection phase is significantly improved after calibration with a worst-case error of about 18°, which is sufficient for the impedance tuning. The calibrated |ΓDet| shows good accuracy with an error of about 0.1 up to |ΓAnt| of 0.5. Based on the detected impedance, an optimum TMN setting is applied for the RX and PA. The resulting NF of the RX, Pout, and the efficiency, η, of the PA are shown in Fig. 17.4.5, in comparison with the non-tuned case (with TMN setting for 50Ω matching). The NF is improved by up to 0.9dB and 1.3dB for |ΓAnt| of 0.3 and 0.5, respectively, and the tuning range of the input matching is shown in the Smith chart. Only marginal improvement of the NF is observed for ΓAnt phases larger than 180°, which can be further improved by increasing the tuning range of the TMN. The Pout is improved by up to 0.5dB and 1.2dB for |ΓAnt| of 0.3 and 0.5, respectively, while there is an overall improvement in η. Performance summary and comparison are shown in Fig. 17.4.6. The proposed impedance-detection technique is demonstrated with low power and improved accuracy, which enables single-step TMN tuning for better PA and RX performance in an adaptive front-end module in the presence of antenna impedance variations.
References:
[1] H. Song, et al., "A CMOS Adaptive Antenna-Impedance-Tuning IC Operating in the 850MHz-to-2GHz Band," ISSCC, pp. 384-385, Feb. 2009.
[2] Y. Yoon, et al., "A 2.4-GHz CMOS Power Amplifier with an Integrated Antenna Impedance Mismatch Correction System," IEEE JSSC, vol. 49, no. 3, pp. 608-621, Mar. 2014.
[3] S. Kousai, et al., "Polar Antenna Impedance Detection and Tuning for Efficiency Improvement in a 3G/4G CMOS Power Amplifier," ISSCC, pp. 58-59, Feb. 2014.
[4] M. Mikhemar, et al., "A Tunable Integrated Duplexer with 50dB Isolation in 40nm CMOS," ISSCC, pp. 386-387, Feb. 2009.
ISSCC 2017 / February 7, 2017 / 2:45 PM
Figure 17.4.1: Concept diagram (left) and calibration scheme (right) of the antenna-impedance detection.
Figure 17.4.2: The top-level diagram of the front-end module for impedance-mismatch detection and tuning.
Figure 17.4.3: The DTC-based quadrature generation for the detection and the calibration using XOR and XNOR.
Figure 17.4.5: Measured RX and PA performance before and after tuning.
Figure 17.4.6: Performance summary and comparison of the mismatch-detection and tuning systems.
Figure 17.4.4: Measured and calibrated detection outputs.
Figure 17.4.7: Die micrograph.
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.5
17.5 An Intrinsically Linear Wideband Digital Polar PA Featuring AM-AM and AM-PM Corrections Through Nonlinear Sizing, Overdrive-Voltage Control, and Multiphase RF Clocking
Mohsen Hashemi1, Yiyu Shen1, Mohammadreza Mehrpoo1, Mustafa Acar2, René van Leuken1, Morteza S. Alavi1, Leonardus de Vreede1
1Delft University of Technology, Delft, The Netherlands 2Ampleon, Nijmegen, The Netherlands
To fully benefit from the progress of CMOS technologies, it is desirable to completely digitize the TX, replacing its final stage with a digitally controlled PA (DPA). The DPA consists of arrays of small sub-PAs that are digitally controlled to modulate the output amplitude, thus operating as an RF-DAC [1-6]. DPAs are normally designed in a switched mode (Classes E/D/D-1, etc.) to achieve high efficiency while using a high sampling rate to attenuate and push the spectral images to higher frequencies. However, they suffer from high nonlinearity in their AM-code-word (ACW)-to-AM and ACW-to-PM conversion. To correct for such nonlinearities, digital pre-distortion (DPD) of the input signal is often used [1-3], typically implemented by look-up tables (LUT). Unfortunately, DPD approaches suffer from large signal-BW expansion due to their inherently nonlinear characteristics. This, combined with the already present BW regrowth in a polar TX in the AM and PM paths, yields significant hardware-speed/power constraints when the signal BW becomes large. For a Cartesian TX, the use of LUT-DPD is even more complicated since a full 2D LUT is typically required [2]. To relax the overall system complexity, it is highly desirable to have a PA with maximum inherent linearity without compromising its power or efficiency. In this work, an ACW-AM correction based on nonlinear sizing along with controlling the peak voltage of RF clocks (overdrive-voltage tuning) and an ACW-PM correction based on multiphase RF clocking are introduced to linearize the characteristic curves of a Class-E polar DPA with the intent of avoiding any kind of pre-distortion.
Figure 17.5.1 depicts this concept featuring 9b amplitude modulation. The DPA consists of 8 differential sub-PA segments. These segments are nonlinearly sized to linearize the ACW-AM. The PVT/load impedance variations are compensated by digitally tuning the supply voltage of PA buffers with an on-chip programmable LDO. The input PM RF clock is amplified and fed to the multiphase RF-clocking circuit, which generates 5 differential RF clocks with different phase offsets. The output clocks are applied simultaneously to different sub-PA segments to flatten the ACW-PM curve. Explicit capacitive tuning between the DPA differential outputs is used to enhance the efficiency at full power against clock skewing and duty-cycle variation. A passive mesh structure equalizes the delays for all RF clock lines from the multiphase RF-clocking circuit to the DPA. POUT-aware clock gating is used to reduce driver power in back-off. Coarse time alignment between the amplitude and phase information is performed in the digital domain; fine adjustments are handled by a 4b delay line in the path of the ACW sampling clock.
Figure 17.5.2 shows the ACW-AM linearization concept (top) and its simplified circuit implementation (bottom). In a Class-E (or D-1) DPA, which uses transistor on-resistance (RON) for AM control [1], the ACW-AM curve is a strongly nonlinear function of the RON-to-RLoad ratio. Conventionally, the total size of the switched-on devices is a linear function of ACW [1-6]. By nonlinearly sizing the sub-PA segments based on the inverse of the ACW-AM curve, the overall ACW-AM curve of the DPA is piecewise linearized (Fig. 17.5.2 top). Since the transistor RON is also a function of the overdrive voltage (VOD), VOD is used for compensating the PVT/load variations with insignificant impact on the peak drain efficiency (DE) and PSAT. A 6b programmable on-chip LDO controls the VOD for the whole DPA by controlling the VDD of the RF-clock buffers. It has a resolution of 9 to ~10mV with an output range of 0.6 to 1.2V. Once programmed for optimum linearity, it is fixed during normal DPA operation.
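The idea of sizing segments from the inverse of the ACW-AM curve can be illustrated with a toy model. The compressive law used below, amplitude = size / (size + k), is a hypothetical stand-in for the real Class-E characteristic (which the paper does not give in closed form); `k`, `a_max`, and the function names are assumptions chosen only to show the mechanics.

```python
def amplitude(total_size, k=1.0):
    """Toy compressive ACW-AM law: normalized amplitude versus total switch size."""
    return total_size / (total_size + k)


def inverse_amplitude(target, k=1.0):
    """Total switch size needed to reach a target normalized amplitude (< 1)."""
    return k * target / (1.0 - target)


def segment_sizes(num_segments=8, a_max=0.8, k=1.0):
    """Size each segment so that segment boundaries land on a straight AM line."""
    boundaries = [inverse_amplitude(a_max * j / num_segments, k)
                  for j in range(num_segments + 1)]
    return [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]


sizes = segment_sizes()
running_total = 0.0
boundary_amplitudes = []
for size in sizes:
    running_total += size
    boundary_amplitudes.append(amplitude(running_total))

for j, (size, amp) in enumerate(zip(sizes, boundary_amplitudes), start=1):
    print(f"segment {j}: relative size {size:.3f}, amplitude at boundary {amp:.3f}")
```

In this model the segments grow toward full scale to offset the compression, and the amplitude at each segment boundary falls on a straight line, which is the sense in which the curve is piecewise linearized.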
Figure 17.5.3 shows the ACW-PM correction based on multiphase RF clocking. It consists of a bank of 5 static phase shifters implemented by programmable delay lines. For a nonlinearly sized Class-E DPA with the conventional single-phase RF clocking, the output phase decreases with increasing ACW (Fig. 17.5.3 top). In the time domain, this translates to an increase in the delay at the output. In order to reduce the phase error, the RF clocks of smaller sub-PA segments are applied with larger fixed delays (larger phase offsets) and the larger segments are fed with smaller fixed delays (smaller phase offsets). In this technique, in contrast to a conventional DPA with a DPD approach, no dynamic phase correction is needed for each ACW code. For example, at 6dB back-off power, the sub-PA segments 1-2 and 3-4 are fed simultaneously with phase offsets Δφ1 > Δφ2, respectively, while the other segments are turned off; at full power, the sub-PA segments 1-2, 3-4, 5-6, 7, and 8 are fed simultaneously with phase offsets Δφ1 > Δφ2 > Δφ3 > Δφ4 > Δφ5, respectively. The output currents of these individual sub-PA segments are summed, and the overall output phase is inherently averaged, allowing a controllable and considerable reduction of phase error in the ACW-PM curve. The phase offset of each clock is digitally controllable over a range of 80° with a resolution of 5°, which is more than enough to cover errors caused by PVT/load changes.
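The averaging effect can be illustrated with a phasor sum. The model below is a deliberately simplified sketch: it assumes five clock groups of equal weight, an intrinsic output lag that grows by a hypothetical 2.5° per enabled group, and fixed delays chosen on the 5° grid mentioned in the text. None of these numbers are measured values from the paper.

```python
import cmath
import math


def output_phase_deg(num_on, delays_deg, intrinsic_lag_per_group_deg=2.5):
    """Phase of the summed output current with the first num_on clock groups enabled."""
    # Each enabled group contributes a unit phasor lagging by its fixed clock delay.
    summed = sum(cmath.exp(-1j * math.radians(d)) for d in delays_deg[:num_on])
    # Toy intrinsic AM-PM: the output lags more as more groups are switched on.
    intrinsic_lag = intrinsic_lag_per_group_deg * num_on
    return math.degrees(cmath.phase(summed)) - intrinsic_lag


single_phase = [0.0, 0.0, 0.0, 0.0, 0.0]    # conventional clocking
multiphase = [20.0, 15.0, 10.0, 5.0, 0.0]   # larger delay for groups enabled first

single_curve = [output_phase_deg(n, single_phase) for n in range(1, 6)]
multi_curve = [output_phase_deg(n, multiphase) for n in range(1, 6)]

single_spread = max(single_curve) - min(single_curve)
multi_spread = max(multi_curve) - min(multi_curve)

print(f"single-phase ACW-PM spread: {single_spread:.2f} deg")
print(f"multiphase ACW-PM spread:   {multi_spread:.2f} deg")
```

With these illustrative values the averaged fixed delay shrinks at the same rate that the intrinsic lag grows, so the output phase stays flat across codes without any per-code correction.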
The circuit is prototyped in 40nm bulk CMOS. The core area is 1mm×0.45mm (Fig. 17.5.7). The raw ACW is applied to the DPA using an on-chip 4K SRAM running at 625MHz. The input RF signal is phase modulated off-chip. All measurements are done without applying any kind of DPD. Measured ACW-AM and ACW-PM for a digital input ramp are shown in Fig. 17.5.4 (top). The effectiveness of PVT/load compensation for ACW-AM is measured and plotted for different control-bit settings. The ACW-PM curves before averaging (for one ramp period) and after averaging (over 256 ramp periods) are shown. The two-tone measurements (Fig. 17.5.4 bottom) with a 1.2MHz bandwidth show an IM3 and IM5 of -60dBc and -50dBc, respectively, at 2 to 2.1GHz. Peak POUT, DE and PAE are measured and plotted vs. frequency for different PA VDD (Fig. 17.5.4 bottom). It can be seen that by increasing the drain voltage, the peak POUT and PAE increase, while the peak DE does not change significantly. However, for reliability concerns, the drain voltage is kept less than 0.6V during normal operation. With VDD=0.6V, the peak POUT, DE and PAE are 16.1dBm, 43.7% and 32%, respectively.
Figure 17.5.5 shows the dynamic performance for QAM modulated signals measured at fc=2GHz. ACPR1 is as low as -49 to -40dBc for modulation bandwidths of 1.2 to 40MHz. The measured EVM is -33dB for 16-QAM and -31dB for 64-QAM signals with a 40MHz BW.
Figure 17.5.6 summarizes and compares this work with the state of the art. The nonlinear sizing with VOD tuning and multiphase RF clocking provide a very high DPD-less RF-DAC linearity (EVM/ACPR) for wideband signals (≥20MHz) without sacrificing output power or efficiency. In fact, its linearity competes with the state-of-the-art DPD-less Cartesian DPA [4] while the output power and efficiency performance are superior by 5 to ~6dB and 10 to 12%, respectively.
Acknowledgment: The authors acknowledge Atef Akhnoukh from TU Delft and the imec/Europractice IC service team for their unlimited and high-quality support, the people of Ampleon and NXP for their encouragement and advice, the projects SEEDCOM (STW) and EAST (Catrene) for the financial support, and Masoud Babaie and Earl McCune for their useful suggestions.
References:
[1] D. Chowdhury, et al., “An Efficient Mixed-Signal 2.4-GHz Polar Power Amplifier in 65-nm CMOS Technology”, IEEE JSSC, vol. 46, no. 8, pp. 1796-1809, Aug. 2011.
[2] W. Yuan, et al., “A Quadrature Switched Capacitor Power Amplifier”, IEEE JSSC, vol. 51, no. 5, pp. 1200-1209, May 2016.
[3] J. Park, et al., “A Highly Linear Dual-Band Mixed-Mode Polar Power Amplifier in CMOS with An Ultra-Compact Output Network”, IEEE JSSC, vol. 51, no. 8, pp. 1756-1770, Aug. 2016.
[4] P. Paro Filho, et al., “A Transmitter with 10b 128MS/S Incremental-Charge-Based DAC Achieving −155dBc/Hz Out-of-Band Noise”, ISSCC, pp. 164–165, Feb. 2015.
[5] S. Zheng, et al., “A WCDMA/WLAN Digital Polar Transmitter with Low-Noise ADPLL, Wideband PM/AM Modulator, and Linearized PA”, IEEE JSSC, vol. 50, no. 7, pp. 1645-1656, July 2015.
[6] A. Ba, et al., “A 1.3nJ/b IEEE 802.11ah Fully Digital Polar Transmitter for IoE Applications”, ISSCC, pp. 440-441, Feb. 2016.
ISSCC 2017 / February 7, 2017 / 3:15 PM
Figure 17.5.1: Overall block diagram of the linear DPA.
Figure 17.5.2: ACW-AM linearization concept and circuit.
Figure 17.5.3: ACW-PM linearization concept and circuit.
Figure 17.5.5: Measured 40MHz 64-QAM EVM and ACPR vs BW.
Figure 17.5.6: Performance summary and comparison table.
Figure 17.5.4: Measured ACW-AM/PM and Pout/DE/IMD vs fc.
Figure 17.5.7: Die micrograph.
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.6
17.6 Rapid and Energy-Efficient Molecular Sensing Using Dual mm-Wave Combs in 65nm CMOS: A 220-to-320GHz Spectrometer with 5.2mW Radiated Power and 14.6-to-19.5dB Noise Figure
Cheng Wang, Ruonan Han
Massachusetts Institute of Technology, Cambridge, MA
Millimeter-wave/terahertz rotational spectroscopy offers an ultra-wide detection range of gas molecules for chemical and biomedical sensing. Therefore, wideband, energy-efficient, and fast-scanning CMOS spectrometers are in demand. Spectrometers using narrow-pulse sources and electromagnetic scattering [1] are broadband, but their resolutions do not meet the requirement (<10kHz) for absolute specificity. Alternatively, a scheme using a single tunable tone exhibits a significant trade-off between bandwidth and performance. The 245GHz spectrometer in [2] presents 4mW radiated power, but only has a 14GHz bandwidth. In [3] and [4], broader bandwidths are achieved at the expense of degraded radiated power (0.1mW) and noise figure (NF=18.4 to ~23.5dB). In addition, given a typical 10kHz resolution and 1ms integration time, scanning a 100GHz bandwidth with a single tone takes as long as 3 hours. This paper reports a rapid, energy-efficient spectrometer architecture based on dual-frequency-comb scanning. A 220-to-320GHz CMOS spectrometer prototype based on this architecture is demonstrated with a total radiated power of 5.2mW and an NF of 14.6 to ~19.5dB.
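The 3-hour figure follows directly from the number of frequency points and the integration time per point. The sketch below restates that arithmetic and, as an idealization, shows how the time scales if the band is split evenly across parallel channels; the function name and the even-split assumption are illustrative.

```python
def scan_time_s(bandwidth_hz, resolution_hz, integration_time_s, parallel_channels=1):
    """Total scan time when the band is split evenly across parallel channels."""
    num_points = bandwidth_hz / resolution_hz
    return num_points * integration_time_s / parallel_channels


single_tone_hours = scan_time_s(100e9, 10e3, 1e-3) / 3600.0
ten_channel_hours = scan_time_s(100e9, 10e3, 1e-3, parallel_channels=10) / 3600.0

print(f"single tone: {single_tone_hours:.2f} h")
print(f"ten parallel channels: {ten_channel_hours:.2f} h")
```

A single tone needs 10 million points at 1ms each, or about 2.8 hours, consistent with the "as long as 3 hours" quoted above.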
The spectrometer shown in Fig. 17.6.1 consists of two identical comb chips with a fixed frequency offset fIF. From each comb, 10 frequency lines are transmitted through the gas sample and are simultaneously used as a local-oscillator (LO) signal for the heterodyne mixing of the wave from the other comb. The maximum scanning speed of a single-tone spectrometer with a certain sensitivity is determined by the probing signal power, which is fundamentally limited by the population saturation of molecular states. In comparison, in a comb, each probing channel reaches such maximum speed, leading to a much shorter total scanning time through parallel operation. Each comb chip is driven by a tunable reference signal, fref (45 to 46.67GHz). This signal is tripled to f0 (135 to 140GHz) and power-divided into up- and down-conversion chains. The chains produce tones evenly spaced every 5GHz using an external clock signal fD of 10GHz, which is frequency-divided by 2 inside every up/down mixer. Each tone is then doubled and radiated by an active-molecular-probe (AMP) block. The final 10 comb lines located at 6fref+i·10GHz (i=-5…+4) are simultaneously radiated from the chip backside and seamlessly cover the 220-to-320GHz band. The proposed comb architecture enables scalability to higher bandwidth with extended cascading of narrowband channels. The narrowband operation also eliminates the aforementioned bandwidth-performance trade-offs in circuits.
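The frequency plan can be checked numerically: each comb line sits at 6fref + i·10GHz, and sweeping fref over its range moves every line across its own 10GHz sub-band. The short sketch below enumerates those sub-bands from the values in the text and verifies that adjacent ones meet without gaps; the function name is illustrative.

```python
def comb_bands_ghz(fref_min=45.0, fref_max=46.67, spacing=10.0, indices=range(-5, 5)):
    """Sub-band (low, high) in GHz swept by each comb line 6*fref + i*spacing."""
    return [(6 * fref_min + i * spacing, 6 * fref_max + i * spacing) for i in indices]


bands = comb_bands_ghz()

# Coverage is seamless if each sub-band starts no higher than the previous one ends.
seamless = all(next_low <= high
               for (_, high), (next_low, _) in zip(bands, bands[1:]))

for i, (low, high) in zip(range(-5, 5), bands):
    print(f"i={i:+d}: {low:.2f} to {high:.2f} GHz")
print(f"overall: {bands[0][0]:.0f} to {bands[-1][1]:.0f} GHz, seamless={seamless}")
```

The ten sub-bands run from 220GHz up to about 320GHz, with each one ending just where the next begins.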
In the AMP, the energy efficiency is further improved with multifunctional structures and optimum device feedback. As shown in Fig. 17.6.2, the AMP core serves as a radiating frequency doubler and a subharmonic mixer simultaneously. In the doubler mode, an NMOS pair is driven differentially at f0. The drain voltage swing is boosted via two λ/4 resonators, Slot1. At 2f0, in-phase standing waves are formed inside Slot1, which acts as a folded slot antenna with a simulated radiation efficiency of 45%. Next, Slot2, which supports the quasi-TEM wave associated with the differential mode at f0, partially recycles the amplified signal back to the input. When the harmonic signal at 2f0 is generated at the NMOS drains, the TM wave associated with its common mode is rejected in Slot2. That prevents the leakage of the 2f0 signal into the lossy gates through the feedback path. The amplitude and phase of the recycled signal at f0 are controlled by the characteristic impedance and phase of TL1 and Slot2. As shown in Fig. 17.6.2, their optimum values (ZTL1=30Ω, θTL1=55°, ZSlot2=80Ω, θSlot2=88°) enable in-phase addition between input and recycled waves without causing instability. Compared to conventional designs without feedback, the simulated doubler conversion efficiency at 275GHz increases from 18% to 43%. In the mixer mode, the input wave at 2f0+fIF is coupled into the heavily driven transistors via the folded slot antenna (Slot1). Both the input signal and the LO signal at 2f0 are in common mode, which enables the extraction of the combined, down-converted signal fIF through an integrated RF choke. The simulated single-sideband (SSB) NF is 20.2dB at 275GHz and is improved to 16.3dB when the transistor drain-bias current is zero due to lower thermal and flicker noise.
In these AMPs, phase and amplitude imbalance of the baluns (Fig. 17.6.1) deteriorates the efficiency and LO-leakage rejection. Figure 17.6.3 shows an on-chip balun using orthogonal-mode filtering similar to that in Slot2 (Fig. 17.6.2). Only the fully differential quasi-TEM wave is allowed to propagate from the input port to the perfectly symmetric output ports of the balun. This mechanism leads to near-zero imbalance between the two output ports, which is verified by the simulation results in Fig. 17.6.3. Figure 17.6.3 also shows the up/down mixer for comb-spectral generation. Up and down frequency conversions are achieved by selected combinations of quadrature signals at f0 and 5GHz. In simulation, the LO and image rejections are better than 30dB, and the conversion loss is 2.3dB. The 5GHz I/Q signals are generated by a digital frequency divider inside each mixer.
The chip is implemented using a bulk 65nm CMOS process. A hemispheric silicon lens, rather than a hyper-hemispheric one, is used for its small sensitivity to position offset. The chip DC power is 1.7W. Each AMP is characterized independently with the bias of the other AMPs turned off to entirely eliminate irrelevant radiation. The antenna radiation pattern is measured by a VDI WR-3 even-harmonic mixer (EHM) with a horn antenna at 10cm distance. Figure 17.6.4 shows the antenna pattern for the AMP at 265GHz. The average directivity of the 10 AMPs is 10.1dBi. The equivalent isotropically radiated power (EIRP) of each comb line is measured using a PM4 power meter. The total radiated power of the 10 comb lines is 5.2mW (2f0=275GHz, fD=10GHz). The average phase noise of the 10 comb lines is -102dBc/Hz at 1MHz offset. By measuring the conversion gain using an OML WR-3 frequency extender and the noise floor in the receiver mode, an SSB NF of 14.6 to 19.5dB is obtained under zero AMP bias current. The calculation of NF uses the power received by the AMP antenna aperture, hence includes the antenna loss but de-embeds the performance improvement due to the beam collimation. Lastly, Fig. 17.6.5 presents a spectroscopy setup for the sensing of acetonitrile (CH3CN) at a pressure of 3Pa. One measured spectral section from 275.5 to 276GHz is shown, which agrees with the JPL molecular spectroscopy catalog [6]. An absorption line at 275.86781GHz is obtained using direct transmission as shown in Fig. 17.6.5. To further eliminate the standing wave formed inside the gas chamber, wavelength modulation with a modulation frequency fm of 50kHz and a frequency deviation Δf of 240kHz is applied in one comb. The second-order derivative of the same spectral line is then obtained by measuring the output signal at 2fm from another comb, which shows a line width of 380kHz, demonstrating the absolute detection specificity.
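The wavelength-modulation step can be illustrated numerically. The sketch below is not the authors' processing chain: it assumes a Lorentzian line shape and a hypothetical 190kHz half-width, and only reuses the 240kHz frequency deviation quoted above. It evaluates the Fourier components of the detected signal at fm and 2fm while the carrier is swept sinusoidally across the line. Frequencies are expressed as offsets from the line center in kHz.

```python
import math

def lorentzian(offset_khz, hwhm_khz):
    """Normalized Lorentzian absorption (peak 1) vs. offset from line center.
    The Lorentzian shape is an assumption made for illustration."""
    return 1.0 / (1.0 + (offset_khz / hwhm_khz) ** 2)

def harmonic_component(carrier_offset_khz, harmonic,
                       deviation_khz=240.0, hwhm_khz=190.0, steps=2000):
    """In-phase Fourier coefficient of the detected signal at harmonic*fm.

    deviation_khz is the frequency deviation quoted in the text;
    hwhm_khz is a hypothetical half-width chosen only for this sketch.
    """
    total = 0.0
    for i in range(steps):
        theta = 2.0 * math.pi * i / steps  # modulation phase, 2*pi*fm*t
        instantaneous = carrier_offset_khz + deviation_khz * math.cos(theta)
        total += lorentzian(instantaneous, hwhm_khz) * math.cos(harmonic * theta)
    return 2.0 * total / steps

h1_center = harmonic_component(0.0, 1)   # fm component at line center
h2_center = harmonic_component(0.0, 2)   # 2fm component at line center
h2_far = harmonic_component(2000.0, 2)   # 2fm component far from the line

print(f"fm  component at line center: {h1_center:.2e}")
print(f"2fm component at line center: {h2_center:.3f}")
print(f"2fm component 2MHz away:      {h2_far:.2e}")
```

Under these assumptions the fm component vanishes at line center by symmetry, while the 2fm component is largest there and decays quickly away from the line. This mirrors the behavior of a second derivative and explains why slowly varying baseline features such as standing waves are suppressed.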
Figure 17.6.6 shows the comparison table with other state-of-the-art systems implemented in silicon and operating above 200GHz. Through rapid combing of the spectrum, a high energy efficiency of 0.17mJ/point (1ms integration time) is achieved, demonstrating a new path for broadband sensing via parallelism.
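The quoted energy per point is consistent with the chip DC power, the integration time, and the number of comb lines reported above. The following sketch assumes that all 10 comb lines acquire one spectral point each during the same 1ms integration window; that reading is an interpretation, not a formula given in the text.

```python
def energy_per_point_mj(dc_power_w, integration_time_s, parallel_points):
    """Energy spent per spectral point, in mJ, when several points
    are acquired in parallel during one integration window."""
    return dc_power_w * integration_time_s / parallel_points * 1e3

# 1.7W chip DC power, 1ms integration time, 10 comb lines (assumed parallel).
energy_mj = energy_per_point_mj(dc_power_w=1.7,
                                integration_time_s=1e-3,
                                parallel_points=10)
print(f"{energy_mj:.2f} mJ/point")
```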
Acknowledgements: This work is supported by MIT Center for Integrated Circuits & Systems and TSMC University Shuttle Program. The authors acknowledge Dr. Richard Temkin at MIT and Prof. Ehsan Afshari at University of Michigan for the support of testing instruments, and Dr. Stephen Coy, Prof. Keith Nelson, Prof. Robert Field and Tingting Shi at MIT for technical discussions and assistance.
References:
[1] W. Xue, et al., "A 40-to-330GHz Synthesizer-Free THz Spectroscope-on-Chip Exploiting Electromagnetic Scattering," ISSCC, pp. 428-429, Feb. 2016.
[2] K. Schmalz, et al., "245-GHz Transmitter Array in SiGe BiCMOS for Gas Spectroscopy," IEEE Trans. THz Sci. Technol., vol. 6, no. 2, pp. 318-327, Mar. 2016.
[3] N. Sharma, et al., "200-280GHz CMOS RF Front-End of Transmitter for Rotational Spectroscopy," IEEE Symp. VLSI Tech., pp. 329-331, June 2016.
[4] Q. Zhong, et al., "A 210-to-305GHz CMOS Receiver for Rotational Spectroscopy," ISSCC, pp. 426-427, Feb. 2016.
[5] R. Han, et al., "A SiGe Terahertz Heterodyne Imaging Transmitter with 3.3 mW Radiated Power and Fully-Integrated Phase-Locked Loop," IEEE JSSC, vol. 50, no. 12, pp. 2935-2947, Dec. 2015.
[6] JPL Molecular Spectroscopy, "JPL Catalog Search Form." Accessed on Sept. 2, 2016, spec.jpl.nasa.gov/ftp/pub/catalog/catform.html
[7] Z. Wang, et al., "A CMOS 210-GHz Fundamental Transceiver with OOK Modulation," IEEE JSSC, vol. 49, no. 3, pp. 564-580, Mar. 2014.
Figure 17.6.1: Dual-frequency-comb spectrometer in CMOS.
Figure 17.6.2: Active-molecular-probe (AMP) block.
Figure 17.6.3: Slot balun and up/down-conversion mixer.
Figure 17.6.4: Measurement results of the chip.
Figure 17.6.5: Spectroscopy for molecular sensing.
Figure 17.6.6: Die micrograph and performance comparison table.
Figure 17.6.7: Die micrograph.
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.7
17.7 A Packaged 90-to-300GHz Transmitter and 115-to-325GHz Coherent Receiver in CMOS for Full-Band Continuous-Wave mm-Wave Hyperspectral Imaging
Taiyun Chi, Min-Yu Huang, Sensen Li, Hua Wang
Georgia Institute of Technology, Atlanta, GA
Millimeter-wave/THz hyperspectral imaging has numerous applications in security, non-destructive evaluation, material characterization, and medical diagnostics [1]. Unlike single-frequency imaging, hyperspectral imaging operates over a wide frequency range and offers spectroscopic information on each imaging pixel. This combines mm-wave/THz high-resolution imaging with spectroscopy and improves detection sensitivity and specificity. In practice, pulse-based imaging supports fast data acquisition, but requires a receiver (RX) with real-time wideband sampling (>50GHz). Such an instantaneous broadband imaging modality inevitably exhibits severely degraded sensitivity (due to integrated noise) and requires high-end signal sampling, both of which make it very challenging to achieve a low-cost SoC solution. On the other hand, continuous-wave (CW) imaging supports better sensitivity, especially when using a coherent detection method with a low IF bandwidth [2–5]. Its operation allows for the use of a simplified heterodyne receiver, enabling silicon-based implementations of the entire imaging system. However, few mm-wave/THz integrated electronic systems support CW hyperspectral imaging with a large bandwidth (BW), sufficient output power (Pout), and high sensitivity. Some existing CW transmitters (TX) use harmonics for wideband coverage and therefore cannot support scanning at arbitrary frequencies within the band [2]. In this paper, a full-band CW TX/RX chipset is proposed to realize a generic hyperspectral imaging system that does not require prior knowledge of the band of interest. We therefore optimize its performance to achieve flat TX Pout and RX conversion gain (CG) over a broad BW. Our mm-wave/THz hyperspectral imaging system comprises a 90-to-300GHz TX with a ±2dB Pout variation using a distributed quadrupler architecture and a 115-to-325GHz 4th-subharmonic coherent RX with -115dBm sensitivity (1kHz RBW) using high-order filter-based matching networks (MNs). The TX and RX chips are flip-chip integrated with wideband Vivaldi antennas on low-cost organic LCP (liquid crystal polymer) substrates. This packaged wideband system offers a promising solution for low-cost field-deployable hyperspectral imaging.
To achieve wideband THz signal generation, a distributed-quadrupler (DQ) scheme is proposed as the imaging TX (Fig. 17.7.1). Distributed topologies can absorb the device parasitic capacitors into the input/output synthetic T-lines and allow for broadband power combining from each stage [6,7]. Each DQ stage comprises 4 transistors driven by differential quadrature signals (±I and ±Q) at the fundamental frequency f0. The input differential-I/Q signals propagate through gate T-lines, drive the DQ stages, and generate even-order harmonics. The drain T-line achieves broadband matching (>400GHz), and the simulated 3dB-insertion-loss BW of the loaded 10-stage drain T-line is 327GHz (Fig. 17.7.1). Moreover, to ensure constructive combining of the forward 4f0 signals among the DQ stages, the group delay of the 4f0 signal between the adjacent DQ stages on the drain T-line is designed to be equal to that of the f0 signal on the gate T-line over the TX BW. Two 10-stage DQs driven by differential I or Q signals are in-phase combined at the antenna input port to boost the 4f0 output signal and cancel the 2f0 harmonic.
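The harmonic selection of the quadrature drive can be checked with a simple phasor sum. The sketch below is an idealized model, not the circuit itself: it assumes identical devices and assumes that the n-th harmonic current of a device driven at phase φ carries phase n·φ. The names `I_PAIR` and `Q_PAIR` are illustrative labels for the differential-I and differential-Q driven devices.

```python
import cmath
import math

I_PAIR = (0.0, 180.0)   # drive phases of the differential-I devices, degrees
Q_PAIR = (90.0, 270.0)  # drive phases of the differential-Q devices, degrees

def harmonic_sum(n, drive_phases_deg):
    """Sum of unit phasors of the n-th harmonic over the given drive phases
    (idealized: harmonic phase is n times the drive phase)."""
    return sum(cmath.exp(1j * n * math.radians(p)) for p in drive_phases_deg)

def combined_magnitude(n):
    """Magnitude of the n-th harmonic after in-phase combining of the
    I-driven and Q-driven outputs."""
    return abs(harmonic_sum(n, I_PAIR) + harmonic_sum(n, Q_PAIR))

for n in range(1, 5):
    print(f"harmonic {n}: |I|={abs(harmonic_sum(n, I_PAIR)):.3f}, "
          f"|Q|={abs(harmonic_sum(n, Q_PAIR)):.3f}, "
          f"|I+Q|={combined_magnitude(n):.3f}")
```

In this model each differential pair suppresses the odd harmonics on its own, the 2f0 terms of the I and Q paths arrive in anti-phase and cancel, and the 4f0 terms add coherently.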
In conventional distributed amplifiers (DAs), the DC supply is fed through on-chip chokes or an off-chip bias-tee via the output pad. However, for this 90-to-300GHz DQ TX, it is difficult to realize any on-chip choke, while off-chip bias-tees complicate the TX-antenna packaging. Thus, an on-chip 0.5dB-ripple 3rd-order Chebyshev bandpass filter (BPF) is inserted between the on-chip termination resistor and the drain output of the 1st-stage DQ. This BPF provides a broadband termination of the backward waves and a DC supply feed point through its shunt T-line (Fig. 17.7.1). The broadband differential-I/Q inputs for the DQ are generated by an on-chip Marchand balun and a transformer-based quadrature-generation network (Fig. 17.7.2). At f0=60GHz, the EM-simulated total average passive loss of the input network is 2.5dB, on top of the 6dB inherent loss of the 1:4 power division. The ±1dB amplitude-mismatch BW is 38 to 84GHz (1:2.2), and the 5° phase-mismatch BW is 35 to 82GHz (1:2.3), achieving matched and broadband I/Q driving signals.
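The 6dB division loss and the bandwidth ratios quoted above follow from simple arithmetic, reproduced in the short check below. It assumes an ideal, lossless equal-split divider; the function names are illustrative.

```python
import math

def ideal_split_loss_db(ways):
    """Per-output power reduction of an ideal lossless equal-split divider."""
    return 10.0 * math.log10(ways)

def bandwidth_ratio(f_low_ghz, f_high_ghz):
    """Ratio of upper to lower band edge (the '1:x' figure)."""
    return f_high_ghz / f_low_ghz

split_loss = ideal_split_loss_db(4)        # 1:4 division for the I/Q outputs
amp_ratio = bandwidth_ratio(38.0, 84.0)    # ±1dB amplitude-mismatch BW
phase_ratio = bandwidth_ratio(35.0, 82.0)  # 5° phase-mismatch BW

print(f"1:4 split loss: {split_loss:.2f} dB")
print(f"amplitude-mismatch BW ratio: 1:{amp_ratio:.1f}")
print(f"phase-mismatch BW ratio: 1:{phase_ratio:.1f}")
```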
The wideband RX employs a 4th-subharmonic mixing (SHM) topology using an anti-parallel diode pair (APDP) [5]. Compared with power detectors, this coherent heterodyne detection achieves a much higher sensitivity and SNR using low-IF operation, obviating the need for high-power illumination sources in imaging [3]. In this 4th-SHM RX, the received signal is mixed with the 4th LO harmonic, which reduces the required LO power and frequency to 7dBm and <82GHz. By absorbing the APDP parasitic capacitors into two 4th-order Chebyshev BPF MNs in the LO and RF paths, a large RX BW, low conversion loss, and high out-of-band rejection are achieved (Fig. 17.7.3). The BW of the 4th-order Chebyshev BPF is proportional to g5/Q, where Q is the APDP loaded quality factor, and g5 is the load coefficient in the normalized filter prototype. A larger g5 provides a larger BW but introduces in-band ripple. In this RX, a 0.5dB ripple is chosen with g5=1.98 to ensure RX CG flatness. This also inherently applies a 1:2 impedance transformation to match the RX 50Ω input with the APDP parallel resistance. The IF output is amplified by an on-chip LNA and an open-drain buffer.
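The g5 value quoted above matches the standard Chebyshev low-pass prototype for a 4th-order, 0.5dB-ripple response. The sketch below evaluates the textbook closed-form expressions for the prototype element values; it is a generic filter-theory calculation and does not represent the authors' design flow or the actual on-chip component values.

```python
import math

def chebyshev_g_values(order, ripple_db):
    """Element values g1..g(n+1) of the normalized Chebyshev low-pass
    prototype (g0 = 1, cutoff 1 rad/s), from the standard closed-form
    expressions. The last entry is the load coefficient."""
    # beta = ln(coth(ripple_db / 17.37)), with 17.37 = 40 / ln(10)
    beta = math.log(1.0 / math.tanh(ripple_db * math.log(10.0) / 40.0))
    gamma = math.sinh(beta / (2.0 * order))
    a = [math.sin((2 * k - 1) * math.pi / (2.0 * order))
         for k in range(1, order + 1)]
    b = [gamma ** 2 + math.sin(k * math.pi / order) ** 2
         for k in range(1, order + 1)]
    g = [2.0 * a[0] / gamma]
    for k in range(1, order):
        g.append(4.0 * a[k - 1] * a[k] / (b[k - 1] * g[k - 1]))
    # The load is 1 for odd orders and coth^2(beta/4) for even orders.
    load = 1.0 if order % 2 else (1.0 / math.tanh(beta / 4.0)) ** 2
    return g + [load]

g_rx = chebyshev_g_values(order=4, ripple_db=0.5)  # RX matching networks
g_tx = chebyshev_g_values(order=3, ripple_db=0.5)  # TX termination filter
print("4th order:", [round(v, 4) for v in g_rx])
print("3rd order:", [round(v, 4) for v in g_tx])
```

For an even-order prototype the load coefficient differs from unity, which is why the 4th-order network provides an impedance transformation of roughly 1:2, while the 3rd-order prototype used for the TX termination has a unity load coefficient.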
A proof-of-concept 10-stage DQ TX and a 4th-SHM RX are implemented in a 45nm CMOS SOI process. For RX characterization, the IF frequency is set to 250MHz with a 1kHz RBW (corresponding to a 1ms time constant) on the spectrum analyzer. The measured RX CG is between -3dB and 1dB in the 115-to-325GHz band with an LO power of 7dBm±1dB (Fig. 17.7.4). The RX CG can be further boosted by integrating an extra high-gain IF amplifier on-chip. The single-sideband (SSB) noise figure (NF) is calculated from the measured noise floor at the IF output and the CG [5]. The measured SSB NF is 29 to 37dB across the RX band. The sensitivity at the RF input is defined as KB·T·BW·NF, with BW=1kHz, where KB is Boltzmann's constant and T is the absolute temperature. A sensitivity of -107 to -115dBm is achieved in the 115-to-325GHz band, demonstrating the broadband high sensitivity of the coherent RX. For the TX characterization, Fig. 17.7.4 shows the simulated and measured TX Pout. At the band center of 200GHz, TX Pout is -11dBm with 18dBm Pin at 50GHz. TX Pout flatness is within ±2dB over the 90-to-300GHz band. The 300GHz high-end frequency is limited by the drain T-line loss, while the 90GHz low-end frequency is limited by the imbalance of the I/Q driving signals. An on-chip 22.5-to-75GHz DA (with 18dBm Pout) can be integrated as the TX source for full system integration [6]. The measured TX Pout vs. Pin is shown in Fig. 17.7.4 with the maximum TX Pout of -6dBm at 20dBm Pin.
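The sensitivity range follows directly from the definition above and the measured NF range. A minimal sketch of the calculation is given here; the 290K reference temperature is an assumption, since the text does not state the value of T used.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23

def sensitivity_dbm(nf_db, bandwidth_hz, temperature_k=290.0):
    """Input-referred sensitivity KB*T*BW*NF expressed in dBm.
    temperature_k = 290K is an assumed reference temperature."""
    thermal_noise_w = BOLTZMANN_J_PER_K * temperature_k * bandwidth_hz
    thermal_noise_dbm = 10.0 * math.log10(thermal_noise_w / 1e-3)
    return thermal_noise_dbm + nf_db

best = sensitivity_dbm(nf_db=29.0, bandwidth_hz=1e3)   # lowest measured NF
worst = sensitivity_dbm(nf_db=37.0, bandwidth_hz=1e3)  # highest measured NF
print(f"sensitivity range: {worst:.1f} to {best:.1f} dBm")
```

With a 1kHz bandwidth the thermal noise floor is about -144dBm, so the 29-to-37dB NF range maps onto the reported sensitivity range.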
The wideband TX and RX are flip-chip packaged on a 2mil LCP substrate with on-package 100-to-280GHz Vivaldi antennas. Hyperspectral images of a cookie and a screw inside a translucent package at multiple frequencies are shown in Fig. 17.7.5. The mm-wave/THz images clearly delineate the cookie and the screw inside the package, which is useful in non-destructive and non-contact food safety screening.
Acknowledgements: The authors would like to thank GlobalFoundries for chip fabrication. This work was supported in part by the US Army Research Office under Grant No. W911NF-15-P-0021.
References:
[1] C. Chang, Hyperspectral Imaging: Techniques for Spectral Detection and Classification. New York, NY, USA: Springer Science & Business Media, 2003.
[2] K. Statnikov, et al., "160-GHz to 1-THz Multi-Color Active Imaging with a Lens-Coupled SiGe HBT Chip-Set," IEEE TMTT, vol. 63, no. 2, pp. 520-532, Feb. 2015.
[3] C. Jiang, et al., "A 320GHz Subharmonic-Mixing Coherent Imager in 0.13μm SiGe BiCMOS," ISSCC, pp. 432-433, Feb. 2016.
[4] N. Sharma, et al., "160-310 GHz Frequency Doubler in 65-nm CMOS with 3-dBm Peak Output Power for Rotational Spectroscopy," IEEE RFIC, pp. 186-189, May 2016.
[5] Q. Zhong, et al., "A 210-to-305GHz CMOS Receiver for Rotational Spectroscopy," ISSCC, pp. 426-427, Feb. 2016.
[6] J. Chen and A. Niknejad, "Design and Analysis of a Stage-Scaled Distributed Power Amplifier," IEEE TMTT, vol. 59, no. 5, pp. 1274-1283, May 2011.
[7] K. Lin, et al., "A Broadband Balanced Distributed Frequency Doubler with a Sharing Collector Line," IEEE Microw. Wireless Compon. Lett., vol. 19, no. 2, pp. 110-112, Feb. 2009.
ISSCC 2017 / February 7, 2017 / 4:15 PM
Figure 17.7.1: Schematic of the 10-stage DQ, and simulated S11 and S21 for the BPF-based termination and the loaded output T-line.
Figure 17.7.2: The HFSS model of the input Marchand balun and the differential-I/Q-generation network, and the EM-simulated amplitude/phase responses of the four quadrature outputs at f0.
Figure 17.7.3: Schematic of the 4th SHM as the 115-to-325GHz RX, and the simulated frequency responses for the LO and RF paths.
Figure 17.7.4: Measured RX CG, SSB NF, and sensitivity vs. frequency. Measured TX Pout vs. frequency, and measured TX Pout vs. Pin.
Figure 17.7.5: Hyperspectral images of a cookie and a screw inside a translucent package for food safety screening at multiple frequencies.
Figure 17.7.6: Comparison of the state-of-the-art coherent mm-wave/THz imagers.
Figure 17.7.7: Die micrographs and pictures of the packaged TX/RX chips and modules.
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.8
17.8 A Compact 130GHz Fully Packaged Point-to-Point Wireless System with 3D-Printed 26dBi Lens Antenna Achieving 12.5Gb/s at 1.55pJ/b/m
Nemat Dolatsha1, Baptiste Grave1,2, Mahmoud Sawaby1, Cheng Chen1, Afshin Babveyh1, Siavash Kananian1, Aimeric Bisognin3,4, Cyril Luxey3, Frederic Gianesello4, Jorge Costa5,6, Carlos Fernandes7, Amin Arbabian1
1Stanford University, Stanford, CA, 2CEA-LETI-MINATEC, Grenoble, France, 3University of Nice, Nice, France, 4STMicroelectronics, Crolles, France, 5Instituto de Telecomunicações, Lisbon, Portugal, 6ISCTE-IUL, Lisbon, Portugal, 7University of Lisbon, Lisbon, Portugal
Low-cost, energy-efficient, high-capacity, scalable, and easy-to-deploy point-to-point wireless links at mm-wave frequencies find a variety of applications, including data-intensive systems (e.g., data centers), interactive kiosks, and many emerging applications requiring data pipelines. Operating above 100GHz enables compact low-footprint system solutions that can multiplex Tb/s aggregate rates for dense deployments, thereby competing with wired solutions in many aspects, including rate and efficiency, while being much more flexible to deploy. The focus is on small-footprint fully integrated solutions, which overcome traditional packaging challenges imposed at >100GHz with commercial and low-cost solutions.
Traditional wireless systems use higher-order modulations to achieve spectral efficiency, often at the cost of energy efficiency. For point-to-point links, we exploit the available spatial degrees of freedom (in the form of multiplexing with narrow mm-wave beams and alternate polarizations) to relax spectral efficiency requirements and enable improved energy efficiency using an OOK system and duty-cycling the TX. The proposed compact and wideband system is fully packaged with a low-cost flip-chip IC on an organic substrate that includes an antenna-in-package (AiP) feeding an integrated 3D-printed high-gain lens.
Figure 17.8.1 shows the TRX IC block diagram and packaging solution. A wideband single-carrier OOK transmission combined with a non-coherent reception is chosen to drastically improve energy efficiency by eliminating power-hungry blocks including synthesizers, quadrature mixers, and other modulator circuits.
Major circuit blocks are detailed in Fig. 17.8.2. The PA, as one of the main power bottlenecks, is cycled ON/OFF for OOK operation. Cycling the 130GHz oscillator for OOK modulation at >10Gb/s would lead to larger total power consumption due to fast settling constraints and is avoided. A single-stage differential standard cascode PA (diff-PA) is chosen to provide adequate mm-wave gain, bandwidth (BW), and output power while maintaining reliability margins for breakdown constraints. A Pout of ~10dBm is achieved with 21% efficiency under modulation. The 2:1 output balun provides both impedance transformation, which enhances the gain, and ESD protection. To create the OOK modulation, the PA tail current is switched by M4-M5, which bias or short out the current source (M2). Rise/fall times are maintained below 15ps. M3 sustains a minimum trickle current path. The diff-PA consumes 30mA in ON and 1mA in OFF mode, which is optimized for transition speed and stability of the PA as well as the VCO loaded with the PA input.
The VCO is a common-collector Colpitts oscillator (Fig. 17.8.2) with a tank based on L1 and NMOS varactors. λ/4 T-lines connect the collector to a 1.5V supply and the emitter to ground. Rb, in the common-mode (CM) bias path, suppresses CM gain and oscillation. A 12mA current bias ensures startup and sufficient drive amplitude under PVT, verified with measurement across multiple samples and packages.
The 130GHz LNA (Fig. 17.8.2) is designed for low NF, low power, and BW>15GHz. The first stage is biased at NFmin. A small gain-boosting inductor is placed in the base of Q1-B; its size is determined as a trade-off between gain and stability considering PVT margins. The next stages are biased for maximum gain to drive the envelope detector (ED). Bandwidth extension is achieved with series cascode inductors between QX-A and QX-B. The simulated gain and NF of the LNA are 26dB and 9.5 to 10dB, respectively. The nominal input P1dB is -38dBm. The LNA consumes 24mW in nominal mode and 16.5mW in low-power (LP) mode, which has a gain of 15dB and a minimal effect on other metrics, as depicted in Fig. 17.8.2.
To further reduce the power consumption of the RX, while maintaining >15GHz BW, we opted for direct detection of 130GHz OOK signals using an ED, which is optimized for gain and NF. This was compared to heterodyne schemes, or direct detection using mixers, which are power-hungry. With the choice of an optimum current density for maximum gain (enhanced nonlinearity regime), and device size to fulfill the NF requirement based on input power, the ED consumes only 750μW, with NF<15dB at input sensitivity levels.
To address the stringent link requirements, and also to utilize spatial degrees of freedom and to suppress interference, very-high-gain compact antennas with large BW, narrow beamwidth, and low leakage to the cross-polarization are needed. We target low-cost solutions with >30dBi gain and >20GHz BW at 130GHz.
Our design relies on 3D printing and low-loss organic BGA-packaging techniques (Fig. 17.8.3). A 2×2 aperture-coupled patch-antenna array with a large BW illuminates the elliptical lens for maximum directivity. The additional substrate-integrated grounded cavity minimizes unwanted TM0 surface waves. The array is optimized to achieve a directivity of 10 to 11dBi necessary for appropriate illumination of the lens. The input match covers 96 to 142GHz and 10.9dBi (±0.4) directivity is measured at 116 to 140GHz. An extended hemispherical lens, made of basic low-cost ABS-M30 plastic material (used extensively in classic 3D printers), is chosen for achieving a high gain in a small form factor. In order to reduce the dielectric losses and minimize the sensitivity to manufacturing imperfections, the lens is designed as a hollow shell with optimized removal of internal volume (Fig. 17.8.3). Measurements with the lens show S11<-10dB, antenna gain >26dBi and polarization purity >25dB above 114GHz. The small discrepancies between measured and simulated gain are mainly due to a 1° residual misalignment in the measurement setup, in addition to the measurement not being entirely in the far field (at 80cm) due to setup limitations.
Figure 17.8.4 shows the setup for measurement of the TX spectrum and EIRP. The VCO tuning range is 123 to 131GHz (Fig. 17.8.4). The measured EIRP is 16.3dBm without the lens (9.3dBm Pout, 8dBi measured BGA gain, 1dB flip-chip loss) and 32.5dBm with the lens (2.8dB lower than expected, likely due to residual misalignment). In order to perform measurements using a single BERT and also observe the performance in complex mm-wave channels, a metal reflector is used to close the TX-RX loop when transceivers are placed side by side with an absorber in between to avoid direct feedthrough paths (Fig. 17.8.5). Even in the complex/non-ideal extended channel formed by the reflector, 12.5Gb/s data transmission with 10⁻⁶ BER is measured at 5m. At short range (50cm LoS) in LP mode, 11.5Gb/s with 10⁻⁶ BER is achieved with 25% lower power consumption (Fig. 17.8.7).
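The EIRP figures above can be cross-checked with a simple dB budget. The sketch below reproduces the no-lens EIRP from the quoted contributions and back-calculates the net gain implied by the "expected" with-lens EIRP. The implied gain is a derived illustration only and is not a number reported in the text.

```python
def eirp_dbm(pout_dbm, antenna_gain_dbi, interface_loss_db=0.0):
    """EIRP as transmitter output power plus antenna gain minus interface loss."""
    return pout_dbm + antenna_gain_dbi - interface_loss_db

# Without the lens: 9.3dBm Pout, 8dBi BGA antenna gain, 1dB flip-chip loss.
eirp_no_lens = eirp_dbm(9.3, 8.0, 1.0)

# With the lens: the measured value is reported as 2.8dB below expectation.
eirp_lens_measured = 32.5
eirp_lens_expected = eirp_lens_measured + 2.8

# Net gain (antenna gain minus interface loss) implied by the expected EIRP.
implied_net_gain_db = eirp_lens_expected - 9.3

print(f"EIRP without lens: {eirp_no_lens:.1f} dBm")
print(f"expected EIRP with lens: {eirp_lens_expected:.1f} dBm")
print(f"implied net gain with lens: {implied_net_gain_db:.1f} dB")
```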
Figure 17.8.6 shows the comparison table with state-of-the-art high-speed TRXs. FOM1 (energy/bit/range) is improved by >40× compared to [1-6], owing to the high-gain yet compact antenna, which together with the efficient TRX places this system in competition with wired solutions. FOM2 represents the maximum achieved EIRP for a given TX power consumption and shows a 131× improvement. Figure 17.8.7 shows the pad-limited 1.62×1.98mm² die micrograph in 55nm BiCMOS (active area of 0.4mm²).
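As an illustration of FOM1, the sketch below assumes the definition FOM1 = P_DC / (data rate × range), which is one reading of "energy/bit/range" and is not spelled out in the text. Using the 1.55pJ/b/m from the paper title with the 12.5Gb/s, 5m link, it back-calculates the total power that this definition would imply; that power is a derived value, not a reported measurement.

```python
def fom1_pj_per_bit_per_m(dc_power_w, data_rate_bps, range_m):
    """Energy per bit per meter in pJ/b/m (assumed FOM1 definition)."""
    return dc_power_w / (data_rate_bps * range_m) * 1e12

def implied_power_mw(fom1_pj_b_m, data_rate_bps, range_m):
    """Total DC power in mW implied by a given FOM1, data rate, and range."""
    return fom1_pj_b_m * 1e-12 * data_rate_bps * range_m * 1e3

power_mw = implied_power_mw(1.55, 12.5e9, 5.0)
fom_check = fom1_pj_per_bit_per_m(power_mw * 1e-3, 12.5e9, 5.0)

print(f"implied total power: {power_mw:.1f} mW")
print(f"round-trip FOM1: {fom_check:.2f} pJ/b/m")
```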
Acknowledgement: This work was supported in part by Systems on Nanoscale Information fabriCs (SONIC), one of the six SRC STARnet Centers, sponsored by MARCO and DARPA, Google, EMX, and by the National Science Foundation under grant CNS-1518632. N. Dolatsha, B. Grave, and M. Sawaby contributed equally to this work.
References:
[1] C. Byeon, et al., "A 67mW 10.7Gb/s 60GHz OOK CMOS Transceiver for Short-Range Wireless Communications," IEEE TMTT, vol. 61, no. 9, pp. 648-650, Sept. 2013.
[2] Z. Wang, et al., "A CMOS 210-GHz Fundamental Transceiver with OOK Modulation," IEEE JSSC, vol. 49, no. 3, pp. 564-580, Mar. 2014.
[3] S. Shahramian, et al., "A 16-Element W-Band Phased Array Transceiver Chipset with Flip-Chip PCB Integrated Antennas for Multi-Gigabit Data Links," IEEE RFIC, pp. 27-30, May 2015.
[4] S. Thyagarajan, et al., "A 240GHz Fully Integrated Wideband QPSK Receiver in 65nm CMOS," IEEE JSSC, vol. 50, no. 10, pp. 2268-2280, Oct. 2015.
[5] R. Wu, et al., "A 42Gb/s 60GHz CMOS Transceiver for IEEE 802.11ay," ISSCC, pp. 248-249, Feb. 2016.
[6] K. Tokgoz, et al., "A 56Gb/s W-Band CMOS Wireless Transceiver," ISSCC, pp. 242-243, Feb. 2016.
[7] G. Mangraviti, et al., "A 4-Antenna-Path Beamforming Transceiver for 60GHz Multi-Gb/s Communication in 28nm CMOS," ISSCC, pp. 246-247, Feb. 2016.
Figure 17.8.1: Low-cost, energy-efficient, high-capacity, scalable and easy-to-deploy point-to-point wireless links: applications (top), block diagram of the TRX (bottom).
Figure 17.8.2: Detailed schematics of the OOK TRX with design trade-offs.
Figure 17.8.3: Antenna-in-package, BGA stack-up, fabricated BGA, cross-sectional view of the lens, E/H-plane radiation patterns at 130GHz, gain vs frequency, and input match.
Figure 17.8.4: Transmitter characterization (top), measurements of CW output spectrum and VCO tunability (bottom). This measurement setup is also used to measure output power and EIRP with and without the lens.
Figure 17.8.5: Transceiver power consumption breakdown, TRX end-to-end measurement setup with the reflector, eye diagram and bathtub curves for data transmission at 2m and 5m.
Figure 17.8.6: Performance comparison to state-of-the-art mm-wave transceivers.
Figure 17.8.7: Fully packaged transceiver and die micrograph (top), short-range (50cm) LoS measurement set-up for low-power mode operation and its bathtub curve (bottom).
ISSCC 2017 / SESSION 17 / TX AND RX BUILDING BLOCKS / 17.9
17.9 A 105Gb/s 300GHz CMOS Transmitter
Kyoya Takano1, Shuhei Amakawa1, Kosuke Katayama1, Shinsuke Hara2, Ruibing Dong2, Akifumi Kasamatsu2, Iwao Hosako2, Koichi Mizuno3, Kazuaki Takahashi3, Takeshi Yoshida1, Minoru Fujishima1
1Hiroshima University, Higashihiroshima, Japan 2National Institute of Information and Communications Technology, Koganei, Japan 3Panasonic, Yokohama, Japan
“High speed” in communications often means “high data-rate” and fiber-optic technologies have long been ahead of wireless technologies in that regard. However, an often overlooked definite advantage of wireless links over fiber