Time-based All-Digital Technique for Analog
Built-in Self Test
A thesis submitted for the degree of
Doctor of Philosophy in the Faculty of Engineering
Submitted by
Rajath Vasudevamurthy
Department of Electrical Communication Engineering
Indian Institute of Science
JULY 2013
To
The Lotus feet
of the
Goddess of Learning
Acknowledgements
I am grateful for the opportunity of spending the last six years at the wonderful
campus of the Indian Institute of Science, the stay replete with ample opportu-
nities of acquiring knowledge and to all the friends who made the stay enjoyable.
Firstly, I would like to thank my advisor Dr. Bharadwaj Amrutur for being a
constant source of inspiration and support. I would like to thank the chairman of
the ECE department, Prof. P. Vijay Kumar, and all the faculty members for offering
interesting courses and innovative teaching methods. I would like to thank all the
ECE office staff, especially Mr. Srinivas Murthy and Mr. C. T. Nagaraj; and lab
staff Mrs. Subhashini and Mrs. Radhika for handling all the paper work smoothly.
I would thank all the staff members of Systems Lab, CeNSE and Micro-Nano
Characterization Facility (MNCF), CeNSE, especially Manikant Singh, Dr. Vi-
jay Mishra, Ms. Ashwini and Dr. Girish Kunte for letting me borrow equipment
enabling me to complete my work on time.
I would like to thank Dr. Rubin Parekhji, Dr. Devanathan, Dr. Shrivaths Ravi
of Texas Instruments, India for giving me the opportunity of learning from prac-
ticing industry experts and also providing me valuable TA (Teaching Assistant)
experience. Special thanks again to Dr. Rubin Parekhji for arranging my pre-
sentation at TI, and to Lakshmanan and Chethan for detailed discussions and
feedback.
I would like to thank all my lab mates - Viveka, Kaushik, Manikandan,
Pushkar, Sagar, Nagaraju, Doney; and members of neighbouring lab - Javed,
Vishal, Immaneul, Zaira and Manas, for the excellent work environment and
timely help when needed. Special thanks to Vishal and Manikandan for or-
ganizing weekly lab talks; and to Vikram, Janaki and Siva for similar efforts
earlier; giving us an occasion to share knowledge and hone presentation skills. I
would like to specially thank BT for patiently resolving Linux issues. I would also
like to thank the alumni of lab - Vikram, Pratap, Janaki, Satyam, Siva, Balaji,
Raghavendra, Syam, Karthik, Nandish. Special thanks to Satyam for removing
my post-submission uncertainty. Special thanks to Syam for help with PCB de-
sign and Yasasvi for soldering help and die micrograph of chapter 7. I would like
to thank the students of our department - Neeraj, Abhinav, Rakshith, Mohan,
Ajay, Harshavardhan, Harsha, Harish, Govind, Haricharan, Aarthi, Prashanth,
Arun, Nischal for valuable and informal informative discussions. I would like to
thank all the masters students who joined with me - Suraj Sindia, Shashidhar,
Vinay N S, V T Arun, Rakesh Kumar, Virag Shah, Pramod, Girish and many oth-
ers for excellent company and stimulating discussions during coursework. Special
thanks to Suraj Sindia for helping me in writing out chapter 2 and to V T Arun
for the die micrograph of chapter 4.
I would like to thank H. L. Prasad for teaching me Sanskrit, and along with
Santhosh providing me valuable start-up experience. I would like to thank members
of the Samskṛta Sangha - G. P. R. Yasasvi, Rudra Murthy, Hari Pavan
Kumar, Siva Rama Krishna, Abhinav, Navin, Suraj K, Chennakeshava, Bhargava,
Ankur Raina, Vishvas Acharya, Abhiram B, Vivekanand Mannangi, Hari
Ganesh, Subramanian T R for inspiring Sanskrit learning and organizing vari-
ous programmes. I would like to thank members of Kannada Sangha - Vadi-
raj, Shivaprasad, Baburao Sherikar, Venkatesh, Pradeepa T K, Suryaprakash,
Maruthi, Vinay, Shantanu, Shivanand, Bhaskar, Prasanna, Nagaraj, Deepak
Paramashivan, Shivamogga Rakesh, B. S. Sheshachala, Smt. Nandini, Smt. Udaya-
kumari for organizing various programmes and giving me an opportunity to be
in the organizing committee. Special thanks to Vadiraj for letting me use his
cycle, greatly reducing my commuting time. I would like to thank members of
the Vivekananda Study Circle - Rajasekhar, Prasad, Sushant, Sonal, Abheek,
Goutham, Durga Datta and specially Sri Gokulmuthu, for the wonderful discussions
we had. Thanks are due to members of Telugu Samskṛtika Samiti -
G. P. R. Yasasvi, Hari Pavan Kumar, Sainath B., Sheshadri, Rakesh Kande for
organizing wonderful programmes and being patient enough to clear my doubts
of the language. I would like to thank all the S-block friends - Srinidhi, Bharath,
Shivananju, Naveen, Avinash Achar, Keshav, Sushrutha, Venu, Pradeepa, Prashanth,
Premkumar, Laxman for being with me during tough times. Special thanks to
members of Prasthuta and Praharshini, namely Abhiram Soori, Raghavendra,
Dharmesh, Varun, Krishna for organizing various programmes and stimulating
discussions. Thanks are due to Jaishankar, Souren Misra and Srinath for stimulating
tea-time discussions. I would like to thank the members of Papyrus - BT,
Janaki, Vijayanth, Chetana, Gokul, for giving me an opportunity to help them.
I would like to thank Rohit Vallam, PhD scholar, CSA dept., for coordinating
Prof. Vittal Rao’s lectures and giving me an opportunity to typeset some of the
lectures, honing my LaTeX skills. Thanks to all the Students’ Council chairmen -
Brijesh Bhatt, Sreevalsa, Pramod Kumar Verma and Ganesh for warm company
and kind co-operation.
I would also like to thank Anvesha, Jaidev, Jayanarayan, Naveen, Nandaku-
mar, Siva Rama Krishna all of IIT Bombay for the excellent company at the
26th VLSI Design Conference in Pune. I thank Shivaprasad and Viswanath
for accommodating me a night each and taking me out in Bombay, and to
Prof. Rushikesh Joshi for excellent hospitality and gift of books on our visit to
IIT Bombay for the DIT project review. Special thanks to Pramod for being an
excellent room-mate in Hyderabad while attending the 25th VLSI Design Confer-
ence. Special thanks to BT, Manodipan, Nandish, Vikram for excellent company
at Delhi and trip to National Brain Research Institute (NBRI), Manesar while
attending the 23rd VLSI Design Conference.
I would like to thank Mr. Sravan Kumar Gampa, lawyer at K&S Partners, who
interacted with me and drafted our patent application; and to the panel consist-
ing of Prof. Anurag Kumar, Prof. S. A. Shivashankar and Prof. Navakanta Bhat
for listening to our presentation and approving the patent application. I would
also like to thank Prof. P S Sastry, Prof. Rajesh Sundaresan, Prof. K. J. Vi-
noy and Prof. Navakanta Bhat for quizzing me at my comprehensive exam and
the panel of Prof. Anurag Kumar, Prof. Utpal Mukherji, Prof. P. Vijay Kumar,
Prof. A Chockalingam besides my advisor for quizzing me in the interview for
admission.
Last but not the least, I am deeply indebted to my family for letting me
embark on the uncertain journey of a PhD and fully supporting me throughout.
Abstract
A scheme for Built-in-Self-Test (BIST) of analog signals with minimal area over-
head, for measuring on-chip voltages in an all-digital manner is presented in this
thesis. With technology scaling, the inverter switching times are becoming shorter
thus leading to better resolution of edges in time. This time resolution is observed
to be superior to voltage resolution in the face of reducing supply voltage and
increasing variations as physical dimensions shrink. In this thesis, a new method
of observability of analog signals is proposed, which is digital-friendly and scal-
able to future deep sub-micron (DSM) processes. The low-bandwidth analog test
voltage is captured as the delay between a pair of clock signals. The delay thus
set up is measured digitally in accordance with the desired resolution.
Such an approach lends itself easily to a distributed implementation, where the routing
of analog signals over long paths is minimized. A small piece of circuitry,
called sampling head (SpH) placed near each test voltage, acts as a transducer
converting the test voltage to a delay between a pair of low-frequency clocks. A
probe clock and a sampling clock are routed serially to the sampling heads placed
at the nodes of analog test voltages. This sampling head, present at each test
node consists of a pair of delay cells and a pair of flip-flops, giving rise to as many
sub-sampled signal pairs as the number of nodes. To measure a certain analog
voltage, the corresponding sub-sampled signal pair is fed to a Delay Measurement
Unit (DMU) to measure the skew between this pair. The concept is validated by
designing a test chip in the UMC 130 nm CMOS process. Sub-mV accuracy for static
signals is demonstrated for a measurement time of a few milliseconds, and an ENOB of
5.29 is demonstrated for low bandwidth signals in the absence of sample-and-hold
circuitry.
The sampling clock is derived from the probe clock using a PLL and the design
equations are worked out for optimal performance. To validate the concept, the
duty-cycle of the probe clock, whose ON-time is modulated by a sine wave, is
measured by the same DMU. Measurement results from FPGA implementation
confirm 9 bits of resolution.
List of publications from this thesis
Patent
1. Rajath Vasudevamurthy and Bharadwaj Amrutur, “System and Method for
Built-in Self Test (BIST) in an Integrated Circuit,” filed on 28th September
2012, bearing application number 4068/CHE/2012.
Journals
1. Rajath Vasudevamurthy, Pratap Kumar Das and Bharadwaj Amrutur,
“Time-Based All-Digital Technique for Analog Built-in-Self-Test,” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, in early access.
2. Bharadwaj Amrutur, Pratap Kumar Das and Rajath Vasudevamurthy,
“0.84 ps Resolution Clock Skew Measurement via Subsampling,” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 19,
No. 12, Dec. 2011, pp. 2267 - 2275.
Conferences
1. Rajath Vasudevamurthy and Bharadwaj Amrutur, “Multiphase technique
to speed-up delay measurement via sub-sampling,” 26th International Con-
ference on VLSI Design, 5th-10th January 2013, Pune, India. (Nominated
for Best Student Paper award)
2. Rajath Vasudevamurthy, Pratap Kumar Das and Bharadwaj Amrutur, “A
Mostly-Digital Analog Scan-out Chain for Low Bandwidth Voltage Mea-
surement for Analog IP Test,” in proceedings of 44th IEEE International
Symposium on Circuits and Systems (ISCAS), May 2011, pp. 2035 - 2038.
Contents
Acknowledgements
Abstract
List of publications from this thesis
Contents
List of Figures
List of Tables
Acronyms
1 Introduction
1.1 Motivation
1.2 Testing Economics
1.3 Scope of the thesis
1.4 Organization of the thesis
2 State-of-the-Art Analog/RF BIST
2.1 Background
2.2 Increasing observability of analog circuits
2.2.1 Analog Routing
2.2.2 Analog Routing with a Digital Interface
2.2.3 Analog Waveform capturers
2.3 BIST methods for Analog Circuits
2.3.1 Vector based methods
2.3.2 Vectorless methods
2.3.3 BIST in the SoC context
2.3.4 Concurrent test techniques
2.4 Spectral analysis based tests
2.5 Mixed Signal Test
2.5.1 Testing of Data Converters
2.5.2 Clock signal testing
2.6 RF Test
2.6.1 RF Design Considerations
2.6.1.1 Linearity of LNA-mixer
2.6.1.2 Noise Figure of LNA
2.6.2 RF testing approaches
2.6.2.1 Loopback technique
2.6.2.2 Statistical Sampler
2.6.2.3 Noise figure measurement
2.7 Time-based ADC design
2.8 Distributed Architecture
2.9 Conclusions
3 State-of-the-Art Time-to-Digital Converters
3.1 Introduction
3.2 TDC with gate-delay resolution
3.3 TDC with sub-gate-delay resolution
3.4 Oversampling TDC Considerations
3.5 Oscillator-based TDC
3.6 Sub-sampling Approach
3.7 Conclusions
4 Proposed Architecture
4.1 Proposed Solution
4.1.1 Measurement Procedure
4.2 Voltage-to-Delay Conversion
4.2.1 Sample-and-Hold Action
4.2.2 Implementation
4.3 Design Considerations
4.4 Hardware Implementation Details
4.4.1 Sub-sampling Based Delay Measurement Unit (DMU)
4.4.2 Generation and Routing of Clocks
4.5 Measured Results
4.5.1 DC Measurements
4.5.2 AC Measurements
4.5.3 A note on stability of calibration data
4.6 Conclusions
5 Performance Limits and Sampling Clock Generation
5.1 Behavioral model
5.2 Derivation of Design Parameters
5.3 Use of PLL to Generate Sampling Clock
5.4 Experimental Validation
5.4.1 Implementation of duty-cycle measurement unit
5.4.2 Large divide ratios and dithered divide ratio
5.5 Measurement Results
5.6 Conclusions
6 Multiphase technique to speed-up delay measurement via sub-sampling
6.1 Proposed Solution
6.2 Simulation
6.3 Simulation Results
6.3.1 Case of Fixed Input Delay
6.3.2 Case of Slowly Varying Input Delay
6.4 Conclusions
7 Example Application
7.1 Power Scalable Receiver Implementation
7.2 BIST Implementation
7.3 Simulation Results
7.4 Conclusions
8 Conclusions
8.1 Contributions
8.2 Scope for future work
A Unbiased Delay Estimator
B Noise in Inverter Chain
References
List of Figures
2.1 Problem Definition
2.2 Possible BIST Architecture
2.3 Distributed solution with only digital signals routed over long paths
3.1 Concept of a TDC
3.2 Basic TDC with gate delay resolution
3.3 Basic TDC wrapped back as a ring
3.4 Vernier TDC with sub-gate delay resolution of D − D′ = ∆
3.5 Classical oscillator-based TDC
3.6 Gated Ring Oscillator TDC
3.7 Illustration of a typical Clock Distribution Network with various components contributing to clock skew. Courtesy: Pratap Kumar Das [1]
3.8 Illustration of Sub-sampling approach
3.9 Timing diagram
4.1 Proposed Architecture for Analog BIST
4.2 Schematic Circuit of current-starved Voltage to Delay cell (V2D)
4.3 Block Diagram of Implemented Set-up.
4.4 Illustration of timing diagram and the concept of Sub-sampling.
4.5 Die photo along with snapshot of layout.
4.6 Plots of ‘offset-canceled’ differential delay versus differential voltage for the settings mentioned. Refer to Table 4.2 for the settings of #1,2,3.
4.7 DNL and INL plots for the setting of entry 8 in Table 4.3
4.8 Plot of difference delay versus difference voltage after subtracting the delay at zero difference voltage from both curves at two time instants separated by 1.5 hours.
5.1 Behavioral model of voltage quantization employing the DMU
5.2 Plot of OSR versus n, showing the existence of optimal OSR for a given fin. Effective n = min(n1, n2) (5.14). The other parameters of the equations are taken from the settings described in Table 4.3. The dots indicate the results summarized in Table 4.3. The gap between the modeled and measured behavior is because the differential delay generated is a small fraction of the clock time period, and the resolution improves as the ratio of differential delay to time period increases. An explicit way of ensuring it to speed-up measurement and achieved SNR is described in Chapter 6.
5.3 A typical PWM signal - the modulating sine wave is also shown in dotted lines.
5.4 Block diagram of system implemented in Virtex 5 development board
5.5 State Machine of System Implemented in FPGA (Fig. 5.4)
5.6 The sources of probe and sampling clocks for different cases
5.7 Samples of duty-cycle measurement - Quantized values of the input sine wave
5.8 Spectrum of measured duty-cycle samples, showing a clear peak at 10 Hz, the input sine frequency
5.9 Linear curve fitting of SNR versus log(N)
5.10 Gap between theoretically predicted parameters and actual measurement settings. Note that the difference is least at large values of SNR.
5.11 Plot showing theoretical limits on SNR and results (mean SNR) obtained from measurement.
6.1 Block diagram of DMU (Delay Measurement Unit) based on sub-sampling
6.2 Illustration of the sampling clock precessing around the input clock. The circumference represents the time period of input clock while the sector represents the delay to be measured. The asterisk shaped points are the edges of sampling clock.
6.3 Counts corresponding to period and delay. Two-phase and four-phase clocks measure delay twice and four times in a beat period respectively, thus providing more accuracy in the same measurement time.
6.4 Plot of speed-up obtained corresponding to fraction of delay to time-period. Here N is the number of phases of clock available.
6.5 Flowchart for the proposed scheme
6.6 Block diagram implemented in MATLAB Simulink
7.1 Block diagram of power-scalable receiver. Courtesy: Kaushik Ghosal
7.2 Sampling head (SpH)
7.3 Architecture of voltage controlled delay cells
7.4 Block Diagram of the BIST Setup
7.5 Die micrograph of the power scalable receiver implementation with the layout snapshots of BIST blocks inserted
7.6 Simulation Results
List of Tables
3.1 Comparison of various time measurement architectures
4.1 Comparison of various voltage-to-time conversion techniques
4.2 Summary of Measured Results for DC input
4.3 Summary of Measured Results for Sine wave input
5.1 Example numbers for parameters discussed in (5.27) and (5.26)
5.2 Example numbers for design parameters fp and N for desired SNR at given frequency fin
5.3 Summary of Measured Results comparing asynchronous and synchronous cases of sampling clock generation
5.4 Summary of Measured Results comparing asynchronous and dithered (synchronous) cases of sampling clock generation
6.1 Summary of Measured Results for fixed input delay
6.2 Summary of Measured Results for slowly varying delay
Acronyms
AC Alternating Current
ADC Analog-to-Digital Converter
AGC Automatic Gain Control
ATE Automatic Test Equipment
BIST Built-In Self Test
CAD Computer Aided Design
CDN Clock Distribution Network
CDR Clock and Data Recovery
CMFB Common Mode Feed-Back
CMOS Complementary Metal Oxide Semiconductor
DAC Digital-to-Analog Converter
DC Direct Current
DfT Design for Testability
DLL Delay Locked Loop
DMU Delay Measurement Unit
DNL Differential Non-Linearity
DPPM Defective Parts Per Million
DSM Deep Sub-Micron
DSP Digital Signal Processing
DUT Device Under Test
ENOB Effective Number Of Bits
FPGA Field Programmable Gate Array
GPS Global Positioning System
GRO Gated Ring Oscillator
IEEE Institute of Electrical and Electronics Engineers
IF Intermediate Frequency
INL Integral Non-Linearity
IP Intellectual Property
LFSR Linear Feedback Shift Register
LNA Low Noise Amplifier
LO Local Oscillator
LSB Least Significant Bit
NF Noise Figure
OSR Over-Sampling Ratio
PET Positron Emission Tomography
PLL Phase-Locked Loop
PSD Power Spectral Density
PVT Process, Voltage and Temperature (variations)
RISI Received Interference Strength Indicator
RSSI Received Signal Strength Indicator
SAR Successive Approximation Register
SNR Signal to Noise Ratio
SoC System on-chip
SpH Sampling Head
SRAM Static Random Access Memory
TDC Time-to-Digital Converter
UMC United Microelectronics Corporation
V2D Voltage to Delay
VCO Voltage Controlled Oscillator
VGA Variable Gain Amplifier
Chapter 1
Introduction
1.1 Motivation
System on-chip (SoC) designs are becoming increasingly popular owing to the
tremendous integration capability enabled by CMOS technology scaling [2]. With
increasing integration, designers fabricate analog, digital and mixed-signal circuits
on the same chip to reduce packaging and assembly costs. With technology
scaling, the majority of the required signal processing (especially nonlinear) is implemented
digitally and minimal analog circuitry is used, mainly to interface with
the external world [3]. But with shrinking physical dimensions, increasing process
variability poses a design challenge. While CAD (Computer Aided Design)
tools are deployed increasingly for digital designs, analog and mixed-signal cir-
cuits are getting tougher to design due to reduction of available voltage headroom
and increasing variability.
With shrinking physical dimensions, automated production testing of chips
also becomes essential. A lot of work has been done to modify digital designs
so that they are easier to test, motivated by the need for observability
and controllability of critical internal nodes on-chip. Commonly used techniques
include
• insertion of observe and control flops at critical internal nodes and at input
and output ports of memory,
• stitching all the flops into a scan chain,
• testing of memory by writing in various patterns and reading out.
Such automated techniques are well developed for digital designs since the
fault classes are well defined, such as
• stuck-at faults,
• path delay or transition faults,
• coupling faults, etc.
whereas, defining such fault classes for analog circuits is not straight-forward [4].
As a result, one has to test the analog circuit to check if it meets the specifi-
cations desired, leaving little room for automation. Moreover, analog designers
themselves put margins in their designs to ensure robustness. For these reasons,
automated analog and mixed-signal testing has taken a back seat thus far,
but can no longer do so with the increasing popularity of SoC designs and deep
sub-micron (DSM) processes.
1.2 Testing Economics
While a high-quality test procedure is needed, it is also essential to reduce
the cost of testing. With technology scaling, the cost of testing a transistor is
already about a third of the manufacturing cost and is on the increase. The testers
interfacing to the devices are themselves very costly, and every millisecond spent
by the chip on the tester adds to the cost. The research opportunity is to ensure test quality
with minimal impact on test time. More details of test economics are described
in [3].
The total testing cost can be split up into two parts - time spent on the tester
and extra area in silicon to enable testing. Let $C_{Si}$ be the cost of silicon per unit
area and $C_T$ be the cost per unit time spent on the tester. The extra area taken
up on silicon to enable testing must be compensated by the reduction in testing
time and the cost incurred thereof.
Suppose $T_1$ is the time spent on the tester without extra on-chip built-in self-test
(BIST) and if that reduces to $T_2$ as a result of putting on-chip BIST taking up
an area $A_{Si}$, then [5]
$$T_1 C_T \geq A_{Si} C_{Si} + T_2 C_T \qquad (1.1)$$
In other words, the maximum area of the on-chip BIST feasible without increasing
the total testing cost is given by
$$A_{Si} \leq \frac{C_T}{C_{Si}} (T_1 - T_2) \qquad (1.2)$$
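The break-even condition in (1.1) and (1.2) can be checked numerically. The following is a minimal sketch; the function names and all the numbers in the usage note are hypothetical, chosen only to illustrate the inequality and not taken from the thesis.

```python
def max_bist_area(t1, t2, cost_per_time, cost_per_area):
    """Upper bound on BIST area from (1.2): A_Si <= (C_T / C_Si) * (T1 - T2).

    t1: tester time without BIST, t2: tester time with BIST,
    cost_per_time: C_T, cost_per_area: C_Si (any consistent units).
    """
    if t2 > t1:
        raise ValueError("BIST is expected to reduce tester time (t2 <= t1)")
    return (cost_per_time / cost_per_area) * (t1 - t2)


def bist_pays_off(area, t1, t2, cost_per_time, cost_per_area):
    """Direct check of (1.1): T1*C_T >= A_Si*C_Si + T2*C_T."""
    return t1 * cost_per_time >= area * cost_per_area + t2 * cost_per_time


if __name__ == "__main__":
    # Hypothetical figures: 2.0 s vs 0.5 s on the tester,
    # C_T = 0.04 per second, C_Si = 0.10 per mm^2.
    limit = max_bist_area(2.0, 0.5, 0.04, 0.10)
    print(f"maximum BIST area: {limit:.3f} mm^2")
    print(bist_pays_off(0.5, 2.0, 0.5, 0.04, 0.10))
    print(bist_pays_off(0.7, 2.0, 0.5, 0.04, 0.10))
```

With these assumed figures the bound works out to about 0.6 mm^2, so a 0.5 mm^2 BIST block satisfies (1.1) while a 0.7 mm^2 block does not.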
1.3 Scope of the thesis
The cost of designing analog and mixed-signal circuits is mainly limited by the
cost of testing [6]. Hence, design of low cost test strategies is critical to the
manufacturing of analog and mixed-signal circuits.
A technique of digitizing on-chip analog test voltages by way of time-based
processing is presented in this thesis. The technique is well suited to observing
analog voltages internal to the chip, either for production testing or in-use monitoring.
This technique allows a distributed architecture, wherein a small piece
of circuitry called sampling head (SpH) is placed at each test node while the
measurement unit common to all is located centrally. Such an approach avoids
analog routing over long paths along with shielding, thereby saving area and mak-
ing the design digital-friendly. Also, since the test voltage is always connected
to the sampling head, this approach avoids insertion of switches into the signal
path which can potentially degrade system performance. The said sampling head
(SpH) locally converts test voltages into time delay on a pair of low frequency
clock signals, which is then routed to the central measurement unit. Since it is
the digital signals which are routed over long paths, the test power estimates can
be calculated easily.
The mentioned approach needs a pair of clocks - a probe clock to carry the
delay information and a sampling clock to ease the measurement of the delay
thus set up. Measured results from a 130 nm test chip implemented in the UMC
CMOS process confirm the ability to resolve voltages of less than a millivolt
and yield an ENOB (effective number of bits) of 5.29 bits over a dynamic range
of 100 mV.
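To make the role of the two clocks concrete, the following is a rough, idealized sketch of how a delay can be estimated by sub-sampling. It assumes noise-free 50% duty-cycle clocks and a sampling clock that slips by a fixed fraction of the probe period on every sample; the function name, the counting rule and the parameter `samples_per_beat` are assumptions made for illustration, not the delay measurement unit described later in the thesis.

```python
def estimate_delay_fraction(delay_fraction, samples_per_beat):
    """Estimate delay/period of a delayed clock relative to a reference clock.

    Both clocks are ideal square waves with 50% duty cycle. The sampling
    instant is assumed to advance by 1/samples_per_beat of a period per
    sample, so samples_per_beat samples sweep the period exactly once.
    """
    if not 0.0 <= delay_fraction <= 0.5:
        raise ValueError("delay_fraction must lie in [0, 0.5]")
    count = 0
    for k in range(samples_per_beat):
        phase = k / samples_per_beat  # sampling position within one period
        reference_high = phase < 0.5
        delayed_high = (phase - delay_fraction) % 1.0 < 0.5
        # The reference has risen but the delayed copy has not yet:
        # this sample falls inside the delay window.
        if reference_high and not delayed_high:
            count += 1
    return count / samples_per_beat


if __name__ == "__main__":
    true_fraction = 0.1374  # hypothetical delay as a fraction of the period
    for n in (100, 1000, 100000):
        estimate = estimate_delay_fraction(true_fraction, n)
        print(n, estimate, abs(estimate - true_fraction))
```

In this idealized model the estimate is quantized to 1/samples_per_beat of the period, so a finer sweep gives better resolution at the cost of a longer measurement.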
Performance limits based on the frequencies of probe and sampling clocks
are derived. An FPGA implementation of a time-based ADC (Analog-to-Digital
Converter) using the said method is described, wherein the sampling clock
is derived from the probe clock, with the intention of minimizing the number of
pins needed to talk to the tester. Measured results from the FPGA implementation
confirm an achievable SNR of 55 dB. A technique for speeding up the measurement
using multiple phases of the probe clock is also described.
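The ENOB and SNR figures quoted above can be related through the standard converter rule of thumb ENOB = (SINAD - 1.76 dB) / 6.02 dB. The sketch below applies it to the two reported numbers; treating the reported 55 dB SNR as the signal-to-noise-and-distortion ratio is an assumption made here, and the function names are illustrative.

```python
def enob_from_sinad(sinad_db):
    """Effective number of bits for a full-scale sine input."""
    return (sinad_db - 1.76) / 6.02


def sinad_from_enob(enob_bits):
    """Inverse relation: SINAD in dB for a given ENOB."""
    return 6.02 * enob_bits + 1.76


if __name__ == "__main__":
    # Reported figures: ENOB of 5.29 bits (test chip), SNR of 55 dB (FPGA).
    print(f"5.29 bits corresponds to {sinad_from_enob(5.29):.2f} dB")
    print(f"55 dB corresponds to {enob_from_sinad(55.0):.2f} bits")
```

Under that assumption, 55 dB corresponds to a little under 9 bits, which is consistent with the 9 bits of resolution quoted in the abstract for the FPGA measurements.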
1.4 Organization of the thesis
A review of the state-of-the-art BIST techniques is presented in Chapter 2. A
review of the state-of-the-art time measurement techniques is presented in Chapter 3.
An overview of the proposed architecture is presented in Chapter 4. Analysis
of the system for performance limits and an FPGA implementation of the sampling
clock generation system are presented in Chapter 5. Chapter 6 describes a technique
to overcome the limitation of dynamic range, Chapter 7 presents an example
of the proposed technique used in a test chip manufactured in the UMC 130 nm
process and Chapter 8 concludes.
Appendix A provides a proof that the delay estimator used yields an unbiased
estimate and Appendix B shows that the SNR of the voltage information prop-
agating in the form of delay through an inverter chain decreases with increasing
chain length.
Chapter 2
State-of-the-Art Analog/RF
BIST
2.1 Background
When the design of analog circuits using discrete components was in vogue,
the problem of analog testing or diagnosis was one of fault localization, i.e., of
identifying the fault site and taking appropriate corrective measures. Such faults
are said to be either
• parametric, where the component values are different from nominal values, or
• catastrophic, such as short, open, stuck-at or coupling faults leading to failure.
But with the recent pervasiveness of integrated circuits and the popularity of
SoC designs enabled by the tremendous integration, analog testing is increasingly
becoming a necessity for manufacturing high quality devices and reducing time-to-market [6].
Such analog testing may be broadly categorized into
• structural test: fault model based testing
• functional test: specification based testing
• alternate test: signature/checksum based testing [7]
Historically, analog circuits have been tested functionally against their specifications,
owing to the smaller number of primary inputs and outputs. Although fault
models are well understood in the context of digital designs, fault modeling is still in its
nascent stage for analog circuits. Robust test criteria for analog testing
are derived in [8] employing tools from machine learning and using feature extraction.
A number of fault models for different analog components are presented
in [9] for structural testing, the most important models being the sensitivity (of output
parameters to circuit elements) based test [10] and the transfer function based
test [11].
Both structural and functional tests are mainly used for production testing
and are concerned predominantly with the design of automatic test equipment
(ATE). But the electronics in an ATE test head lags behind the performance
capacity of the device under test (DUT), thus compromising the information
available at the ATE. In order to circumvent this limitation, test engineers are
increasingly opting for design for testability (DfT) and BIST methodologies [6].
BIST techniques can be used both for production testing, making the required
testers (ATE) simpler and therefore cheaper; and also to monitor circuits in-field
during normal operation. This two-fold application is increasingly making BIST
techniques the most preferred choice for high volume and low test cost strategies.
A review of a representative set of these techniques is presented next.
2.2 Increasing observability of analog circuits
Analog designers are increasingly operating transistor circuits in the saturation
region, since the prevalent channel length modulation provides a linear variation
of drain current versus the drain-to-source voltage, simply modeled as a current
source in parallel with a resistor [12]. But this mode of designing brings
the additional requirement of testing for DC biasing faults to ensure that
the transistor operates in the intended mode [3, 5]. With the popularity of
IP-based designs, where certain IP cores are embedded into SoCs, the
accessibility of individual IPs, especially analog circuits, is reduced. As a
result, an architecture which can observe a few test nodes, possibly distributed
all over the chip as shown in Fig. 2.1, is sought.
Figure 2.1: Problem Definition
2.2.1 Analog Routing
A technique of analog routing, wherein voltages and/or currents to be measured
in some internal circuitry are literally “scanned” out to test pins, has been
proposed [13], but it is suitable only for low frequency signals. An approach of
using an op-amp in one of two modes is presented in [14], where the op-amp
operates as a voltage follower in ‘test’ mode and as an amplifier in
‘functional’ mode. It is argued that such an approach does not degrade
performance in the way that the insertion of many switches does. The IEEE
1149.4 standard, which is an extended
version of the boundary scan standard, defines an analog test bus architecture
where several test points inside the system are addressable. Two extra pins - a
primary input and a primary output - are needed for excitation of analog test sig-
nals and measurement of outputs respectively. These signals are routed through
the system by an analog bus of two wires, and all other interconnection points
are connected by analog switches.
But in all these techniques, analog circuits are used to route analog
voltages/currents, and these circuits can themselves distort the signal as it
propagates along the signal path, possibly through coupling and parasitic loads.
It is hence desirable to have testing circuitry which is simpler than the
circuitry being tested.
2.2.2 Analog Routing with a Digital Interface
Analog routing with digital interface has also been proposed [15], where an ana-
log voltage is digitized and the bits are scanned out serially through a single
pin. Similarly, one can also scan in digital bits and excite circuits with
analog voltages using a digital-to-analog converter (DAC). But with reducing
power supply voltages in the deep sub-micron (DSM) technology nodes, leading to
reduction of available voltage headroom, designing conventional ADC
architectures for such applications is becoming increasingly difficult.
A technique of using comparators in either static mode for DC signals or
clocked mode for dynamic signals is presented in [16], which enables the
‘verification’ of voltage levels in analog circuits. But the variable reference
needed for the comparator is said to come from bias voltages of other IPs, which
may not scale well to DSM processes due to increased variability of bias voltages.
2.2.3 Analog Waveform capturers
Authors in [17, 18] suggest a technique of displaying analog signal waveforms
by sub-sampling. Such a technique works well for periodic signals; otherwise,
periodicity has to be introduced artificially, as done in [18].
Although the method is well-suited for viewing waveforms in a laboratory, it
cannot be used directly for automated testing.
2.3 BIST methods for Analog Circuits
The methods proposed to address the need for built-in self-test of circuits can be
classified based on the need for application of test vectors.
2.3.1 Vector based methods
A technique of signature-based (check-sum) fault checking is introduced in [19]
which uncovers faults that affect a circuit’s DC transfer function. While use of
sinusoidal signals with frequency scan enables the characterization of a linear
system, application of a large number of sinusoidal components is a slow process,
especially if the system dynamics are slow [20]. An optimal choice of sinusoidal
stimuli can be made based on the sensitivity analysis presented in [21]. A
technique of applying pseudo-random noise, in which a sequence generated by
an LFSR (Linear Feedback Shift Register) is converted by the existing DAC and
an ADC captures the response, is presented in [22]. One can also make use of
signature analysis of the data captured by the ADC, as reported in [15].
Techniques of on-chip ramp generation with precisely controlled slopes are
presented in [23] with the intention of testing ADCs.
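As an illustration of the pseudo-random stimulus idea, the following is a minimal sketch of an LFSR bit generator. It is not the generator of [22]; the register length, the tap positions, the seed and the mapping of bits to two analog levels are all assumptions chosen only to keep the example small.

```python
def lfsr_bits(seed, taps, nbits, count):
    """Return `count` output bits of a right-shifting Fibonacci LFSR.

    taps: 1-indexed register positions (1 = LSB) XORed to form the feedback bit.
    """
    if seed == 0:
        raise ValueError("an all-zero seed locks the LFSR")
    state = seed
    bits = []
    for _ in range(count):
        bits.append(state & 1)                 # the output is the LSB
        feedback = 0
        for t in taps:
            feedback ^= (state >> (t - 1)) & 1
        # shift right and insert the feedback bit at the MSB
        state = (state >> 1) | (feedback << (nbits - 1))
    return bits


# Assumed 4-bit register with feedback from positions 1 and 2, i.e. the
# recurrence a[n+4] = a[n] ^ a[n+1], which repeats every 2**4 - 1 = 15 bits.
sequence = lfsr_bits(seed=0b0001, taps=(1, 2), nbits=4, count=30)

# Hypothetical two-level stimulus that a DAC could apply to the circuit.
levels = [1.0 if b else -1.0 for b in sequence]
```

A longer register would be used in practice so that the sequence looks noise-like over the bandwidth of interest; the 4-bit case is shown only because its full period can be checked by hand.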
2.3.2 Vectorless methods
A multi-tone testing technique is presented in [24], where the DC outputs of
multiple IPs modulate different tones which are added together and analyzed
by a single digitizer. The oscillation-based technique described in [25] allows
the testing of amplifiers and filters without needing an external stimulus, as the
circuit under test is converted to an oscillator in the test mode. The frequency
and amplitude characteristics of the oscillator deviate in the presence of faults.
But this method necessitates the insertion of switches for reconfigurability, which
could lead to performance degradation.
A method of analyzing supply current by use of current sensors, exploiting
the cross-correlation between supply current and output dynamics, is described
in [26]; but calibration is needed for each specific circuit as the current
consumption is technology dependent.
Authors in [5] have proposed an architecture where the DC voltages (of BIST
sensors) of test nodes are all tied together to a common bus and digitized
centrally through a 12-bit ADC, as shown in Fig. 2.2. Even AC signals are
converted to DC through an envelope detector circuit, but calibration is
required in this case to map the digitized values to analog amplitudes. In such
an architecture, one has to know the number of sensor nodes in advance or design
for a worst-case scenario to ensure that the value on the bus settles within a
specified time. Also, for the AC case, calibration puts a lower bound on the
testing time required. It would be beneficial to have an approach where the
design of the driver for the bus can be made independent of the number of test
nodes, and which also eliminates the need for calibration in testing AC signals,
which would reduce the time required for testing.
Figure 2.2: Possible BIST Architecture
2.3.3 BIST in the SoC context
The additional hardware overhead for waveform generators and analyzers for
BIST can be minimized if the test circuitry is used by other components in the
system, as is likely in an SoC context. A scheme that generates arbitrary
waveforms for stimulus by applying a high-speed bit stream filtered by an analog
low-pass filter, and that digitizes using a voltage comparator with a variable
reference, is reported in [27]. Although it is an elegant solution for BIST, the
filters needed for the signal generator and digitizer, and the requirement of
high precision synchronization, are the overheads of such a scheme.
2.3.4 Concurrent test techniques
In some of the techniques discussed earlier, the circuit topology is modified
during test [25] or the input signal is controlled by the test scheme [27]. But
for deep sub-micron processes, which are very sensitive to noise and radiation
effects, development of test strategies that evaluate the circuit during normal
operation, referred to as on-line or concurrent test, is of interest [20]. A
strategy to enable
such a concurrent testing by use of duplicates of the circuit under test is pre-
sented in [28], where a comparison mechanism verifies the similarity between the
programmable reference block and the block under test. But the programmable
reference block may be difficult to obtain for a variety of analog circuits.
A technique of digital replication is proposed in [29], where a filter learns a
model of the fault-free circuit and compares it with the actual circuit under test.
The area overhead of such a scheme is large, since at least two ADCs are needed
to sample the inputs and outputs of the circuit being modeled, and the
computational effort to learn the model is also large. A method of concurrent
error detection for
linear analog circuits using continuous checksums is proposed in [30], where the
specifying parameters of the circuit under test change due to presence of faults.
Clearly, such a solution is dependent on circuit topology and needs digitizers to
evaluate checksums.
2.4 Spectral analysis based tests
A number of DSP-based techniques of testing Analog and Mixed Signal circuits
are described in [31], especially the Fourier Voltmeter (FVM) where the magni-
tude and phase of any arbitrary spectral component of a periodic waveform can
be measured by a pair of quadrature correlators.
A method where the power spectral density (PSD) of the output of a system
excited with white noise is used as a signature of the test is described in [32],
and a distance measure between two PSDs is used to decide whether the circuit
under test is faulty or not. A similar technique is used in [20] to measure the
deviation of the quality factor of a biquad filter when it is excited with a low
amplitude stimulus and its output is captured by a low resolution ADC.
2.5 Mixed Signal Test
2.5.1 Testing of Data Converters
A technique of full-speed testing of ADCs based on the histogram of the output in
response to sinusoidal excitation is presented in [33], where gain and measures of
non-linearity can be derived from the said histogram. A technique of precision
DAC testing is described in [34] using low resolution ADCs with dithering. Static
linearity test of better than 1 LSB for a 14-bit DAC is demonstrated using a 6-bit
ADC.
2.5.2 Clock signal testing
A technique of characterizing jitter of clocks is described in [6], where the test
clock is mixed with a reference clock and the jitter or phase noise of the clock is
determined from the statistics of the error signal at the output of the mixer. But
generation of an accurate reference clock is the bottleneck in this scheme. How-
ever, such a method will be useful in characterizing data dependent deterministic
jitter, periodic jitter, bounded and uncorrelated random jitter [35].
2.6 RF Test
A typical RF link is composed of a receiver and a transmitter section. A block
diagram of a power scalable receiver implemented in a 130 nm CMOS UMC
process is shown in Fig. 7.1, the important blocks of which are LNA (Low Noise
Amplifier), mixer, ADC, VGA (Variable Gain Amplifier) and filter along with
a PLL (phase-locked loop) for local oscillator (LO) as needed [36]. Similarly, a
transmitter consists of DAC, mixer, filter, power amplifier and a PLL for LO.
2.6.1 RF Design Considerations
The parameters a designer needs to keep in mind while designing circuits to work
at radio frequencies are explained as a design hexagon in [37]. A few of those
parameters which are critical from a testing perspective are presented here.
2.6.1.1 Linearity of LNA-mixer
Linearity is an important issue in the receiver front-end because strong
interferers which may be present at the antenna can potentially ‘drown’ the
signal of interest. Third order non-linear distortion is particularly important,
because intermodulation products may fall in the desired signal band. A
performance parameter called “third order intercept point” (IP3) is defined to
characterize this specific behaviour [37].
2.6.1.2 Noise Figure of LNA
The SNR of the signal degrades continuously as it passes through the receiver
chain. A parameter called “noise figure” is defined to characterize this degrada-
tion in SNR. According to Friis’ formula [38], the noise figure of the first stage
in the receiver contributes the most to the overall noise figure, and hence the
first stage of a receiver chain is typically a ‘low-noise amplifier’, with the critical
specification being its noise figure (NF).
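The dominance of the first stage can be seen numerically with Friis’ formula, F = F1 + (F2 − 1)/G1 + (F3 − 1)/(G1 G2) + ..., where the noise factors and gains are linear quantities. The sketch below is only illustrative; the stage gains and noise figures are made-up numbers, not values from the receiver discussed in this thesis.

```python
import math


def cascade_noise_figure_db(stages):
    """Overall noise figure (dB) of cascaded stages using Friis' formula.

    stages: list of (gain_db, nf_db) tuples, ordered from input to output.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    total_factor = 1.0      # accumulates 1 + (F1 - 1) + (F2 - 1)/G1 + ...
    gain_so_far = 1.0       # linear gain of all preceding stages
    for gain_db, nf_db in stages:
        factor = 10 ** (nf_db / 10)
        total_factor += (factor - 1.0) / gain_so_far
        gain_so_far *= 10 ** (gain_db / 10)
    return 10 * math.log10(total_factor)


# Hypothetical stages: an LNA (20 dB gain, 2 dB NF) and a mixer (10 dB gain, 10 dB NF).
lna_first = cascade_noise_figure_db([(20.0, 2.0), (10.0, 10.0)])
mixer_first = cascade_noise_figure_db([(10.0, 10.0), (20.0, 2.0)])
```

With the low-noise stage first, the overall figure stays close to that of the LNA, while reversing the order makes the noisy stage dominate.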
2.6.2 RF testing approaches
2.6.2.1 Loopback technique
One of the most important strategies used traditionally is the loopback
technique, which routes the signal from the transmitter back to the receiver
without using a wireless link. As mentioned in Section 1.1, designers try to
minimize analog circuitry in the system and do the signal processing digitally,
invariably using ADCs and DACs to interface to the external world. Such ADCs and
DACs, which will be present on-chip, can be made use of for controlling and
observing analog signals [3, 40].
While it incurs less test overhead since the blocks are not tested individually,
an issue is the possibility of specific faults getting masked, since the entire RF
path is tested at once. A loopback strategy suitable for transceivers is proposed
in [41] where a certain spectral signature is fed to the transmitter and the
output of the receiver chain is captured and analyzed for faults. In [42],
optimized periodic bit streams modulated at baseband are used in a loopback
configuration and functional parameters such as gain and IIP3 of the transmitter
and receiver are estimated from the captured receiver response.
2.6.2.2 Statistical Sampler
A technique of evaluating the spectrum at specific test points in the signal path
is described in [43], where the signal at the test node is compared with a noise
signal using a single-bit comparator and the single-bit output is processed
digitally. This technique is also demonstrated to compute the IP3 of a mixer
using a two-tone stimulus. Although the sampler itself has low area overhead and
can be replicated at multiple test nodes, implementation of a well controlled
noise generator can be an issue.
2.6.2.3 Noise figure measurement
As mentioned in Section 2.6.1.2, noise figure is the degradation in SNR between
the input and output of a block. Hence, it can be measured as the difference
between the input and output SNR of the DUT, expressed in dB. An alternate
definition of noise figure is given by [37], as
\[ \text{Noise figure (dB)} = 10 \log_{10} \left( \frac{\text{Total Output Noise Power}}{\text{Output Noise Power due to Input Signal Only}} \right) \]
With a view to measure noise figure, [44] presents a technique of measuring noise
power by comparing it with a low amplitude periodic reference signal using a
single bit comparator, exploiting the phenomenon described in [45]. The range
of amplitudes of the reference signal for acceptable error is also described, in
accordance with [45].
2.7 Time-based ADC design
Quite a few approaches of testing that re-use ADCs which may be present in the
system were described in this chapter. The design of ADCs in deep sub-micron
processes is getting increasingly difficult due to reducing voltage headroom and
increasing process variations. However, in the case of time based architectures,
time resolution has improved since the transition time of digital signals has
reduced with technology scaling [46]. The all-digital nature of time-based
approaches lends itself to scaling and suits stringent area and power
specifications. A lot of recent research activity has focused on designing ADCs
based on this methodology of time based architectures. Although use of such
time-based ADCs for testing applications is not explicitly described in
literature, such ADCs can potentially be made use of for testing applications
too. Hence, a brief survey of such time-based ADCs is presented next.
The two main parts of such solutions are (a) a ‘transducer’ to convert voltages
into time pulses or delays, and (b) a circuit to measure time/delays. A 9.4 ENOB SAR
ADC is demonstrated in [47] where the input and reference voltages are trans-
formed into time pulses and their duration is compared. Authors in [48] have
extended this technique and demonstrated a 10-bit ADC working at a low supply
voltage of 0.6 V, whereas conventional ADC architectures can go up to only 9
bits of resolution for a comparator noise of standard deviation of half LSB.
Authors in [49] have explained the classical voltage-to-time-to-digital and
voltage-to-delay-to-digital architectures, and presented an implementation pro-
viding 4 bit resolution with power consumption of less than 2.4 mW. Authors
in [50] have also presented a similar idea. By implementing moving average fil-
tering, a resolution of 12 µV per LSB at a sampling rate of 10 kHz is achieved.
Another digital approach which is gaining popularity is the VCO (Voltage
Controlled Oscillator) based approach. In this approach, the voltage to be quan-
tized controls the frequency of the VCO, and the count of edges of the VCO out-
put in a certain measurement time is the quantization of the analog voltage [51].
Authors in [52] propose a sigma-delta ADC with the VCO as the quantizer to
overcome the non-linearity of the voltage to frequency transfer curve.
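The edge-counting principle of the open-loop VCO-based quantizer can be sketched as follows. This is a simplified behavioural model under stated assumptions: the voltage-to-frequency curve is taken as perfectly linear (which is exactly the idealization that [52] does not rely on), and the free-running frequency, the VCO gain and the measurement window are hypothetical numbers.

```python
import math


def vco_adc_count(v_in, f0_hz=100e6, kvco_hz_per_v=50e6, t_meas_s=1e-6):
    """Count VCO output edges within one measurement window.

    Assumes an ideal linear VCO: f = f0 + Kvco * v_in.
    """
    freq_hz = f0_hz + kvco_hz_per_v * v_in
    return math.floor(freq_hz * t_meas_s)


# With these assumed numbers, one count corresponds to
# 1 / (Kvco * Tmeas) = 1 / 50 V = 20 mV of input voltage.
code_low = vco_adc_count(0.01)
code_high = vco_adc_count(0.51)
```

Doubling the measurement window doubles the number of counts per volt, so resolution is traded directly against conversion time in this model.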
Authors in [53] propose a ring oscillator ADC where a differential transistor
pair drives two identical ring oscillators as a matched load. The voltage difference
is digitized by the difference between the counters which capture the frequencies
of the two oscillators. They report a bin size of 16 mV with an 80 mV range while
consuming a current of 37 µA. However, when implementing two ring oscillators,
there is a possibility of injection locking, and adequate care has to be taken to
avoid it.
2.8 Distributed Architecture
With an intention of solving the problem of digitizing multiple analog voltages
distributed over the chip as shown in Fig. 2.1, techniques where the front end
of the digitizer is separated and placed at each test node with the remaining
circuitry shared across multiple nodes have been proposed.
For example, a technique of distributed SAR ADC is described in [54], where
the one-bit comparator is located at each test node while the capacitive DAC is
located centrally, to be shared by multiple test nodes. While this eliminates the
need for multiple accurate DACs, the DAC voltage (which is an analog voltage)
needs to be routed to all the test nodes. A technique wherein the routing of
analog signals is minimized and replaced by routing of digital clock signals is
shown in Fig. 2.3 [55].
Referring to Fig. 2.3, a pair of clock signals (forked from a single source) is
daisy-chained through a series of sampling heads placed at each test node, leading
to a virtual “scan-out” architecture [55]. The job of the sampling head is to act as
a transducer, converting the test voltage to a delay difference between the clock
pair passing through it. To accomplish this transduction, the sampling
head present at each test node consists of a pair of voltage controlled delay
(V2D) cells. The delay of one V2D is controlled by the test voltage VAi, while
that of the other by a fixed voltage Vref. Thus, the voltage difference
(VAi − Vref) is converted into a delay difference between the clock pair. This
clock pair is then centrally processed to extract the delay.
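The transduction performed by a sampling head can be illustrated with a small behavioural model. It assumes that each V2D cell has a delay that varies linearly with its control voltage around the operating point, with a higher voltage giving a shorter delay; the nominal delay and the sensitivity are hypothetical values, and the actual V2D characteristic is treated in later chapters.

```python
def v2d_delay(v_ctrl, d0_s=100e-12, k_s_per_v=50e-12):
    """Delay of one voltage-controlled delay cell (assumed linear model)."""
    return d0_s - k_s_per_v * v_ctrl


def sampling_head(v_test, v_ref):
    """Delay difference imposed on the clock pair by one sampling head."""
    return v2d_delay(v_test) - v2d_delay(v_ref)


# The nominal delay d0 is common to both cells and cancels, leaving a
# term proportional to (v_test - v_ref).
delta = sampling_head(v_test=0.6, v_ref=0.5)
```

Because only the difference between the two cells matters in this model, any delay common to both clock paths drops out of the measurement.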
It is to be noted that the analog test voltages are not intentionally perturbed
by the measurement process, in contrast to the digital scan chain scenario where
the bits at each node change as per the input serially on the rising edges of a
clock. In this case, the design of the delay cells does not depend on the number
of test nodes. The central digital processing to extract the delay can be done in
different ways to suit the application. It could just be a flop used as a
comparator to get one bit of information (similar to a SAR architecture, where
the reference voltage Vref is set by a DAC based on the flop decision), or a
time-to-digital converter (TDC) implementation to get up to 10 bits at a rate of
a few hundred kHz [56], suited for measuring AC signals, or a statistical
converter based on sub-sampling (described in Chapter 4) to suit low bandwidth
signal measurement.
Figure 2.3: Distributed solution with only digital signals routed over long paths
As can be observed from the architecture of Fig. 2.3, the sampling heads
corresponding to test voltages which are not being selected also contribute to the
delay difference between the clock pair, which is not desirable. Such contribution
only adds to the noise without changing the intelligible information. Analysis to
show that the SNR of such information only degrades as the daisy-chain length
increases is presented in Appendix B. To overcome this limitation, it is better
to route the output clock pair of each sampling head directly to the central
measurement unit, instead of daisy-chaining through the other sampling heads.
The details of this modification are presented in Chapter 4.
2.9 Conclusions
This chapter presents a brief overview of the state-of-the-art in Analog BIST.
As was discussed, time-based designs stand to gain from technology scaling,
which is leading to faster inverter switching and shorter rise/fall times, thereby
increasing the resolution of time measurements. A brief description of exploiting
time-based designs for ADCs is also provided. A similar time-based technique
is adopted for the BIST application, which is described in detail in
Chapter 4.
Chapter 3
State-of-the-Art Time-to-Digital
Converters
3.1 Introduction
Time measurement has played a crucial role in the understanding of nature and
development of science from the earliest times. Techniques have ranged from
analog clocks based on solar motion (sun-dials), sand flow (hourglass) and water
flow (ghaṭika-yantra) up to the recent use of precise cesium resonators
(especially in GPS satellites).
As a subset of time keeping technology, time-to-digital converters (TDC) allow
for precise digital measurement of the time between two events. Measurement of
short time intervals with good resolution and accuracy has had many applications
in experimental physics even prior to its present popular use in integrated
circuits, the important ones being mean lifetime measurements of excited nuclear
states, time-of-flight measurements, particle identification [57], laser
ranging [58] and positron emission tomography (PET) based medical imaging [59].
The first direct predecessor of a TDC was invented in 1942 for the measurement
of muon¹ lifetimes; it was actually designed as a time-to-voltage converter,
constantly charging a capacitor during the measured time interval.
¹ From Greek µ; the muon is an elementary particle similar to the electron, classified as a lepton.
With advanced CMOS processes beginning to offer extremely compact and
flexible processing power, many applications have begun to replace traditional
analog signal processing blocks with digital signal processing. Such a shift in
architecture places an increased burden on the mixed-signal interface. The TDC
is fast becoming a fundamental element of such an interface, capable of bridging
the gap between the continuous-time analog domain and the discrete-time digital
domain (as ADCs [49, 50]), especially in systems that require precise control of
timing signals such as PLLs [60], delay locked loops (DLL) [61] and circuits [62].
In particular, an implementation of temperature sensor using TDC is described
in [63] and a minimally invasive delay slack monitor is presented in [64] that
directly measures the timing margins on critical timing signals, allowing timing
margins due to PVT (process, voltage and temperature) and global variations to
be removed. A technique of measuring skews between leaf nodes of a clock tree by
way of sub-sampling is presented in [62]. In essence, accurate delay measurement
is becoming important for the implementation of important mixed-signal and
sensor blocks in deep-sub-micron processes.
Considering that there is an extensive history of TDC art, and in spite of the
tremendous change in technology from vacuum tubes and ferrite pot-core trans-
formers to present day advanced CMOS processes, the fundamental concepts and
techniques for dividing time into measurable intervals have remarkably remained
more or less the same. Given this context, it is instructive to think of TDC de-
signs conceptually rather than merely in terms of implementation details. This
helps us shape future efforts in TDC developments, in addition to understanding
current practice, considering the simplicity and technology-independence of these
powerful ideas.
Fig. 3.1 shows the basic conceptual idea of a TDC. An estimate of the time
interval Tin[k] = Tstop[k] − Tstart[k] is obtained by counting the number of
intermediate reference pulses/events as Tout[k] = Out[k] × Tq, and an error
occurs at both the beginning and the end of the measurement, given by
\[ T_{error}[k] = T_{in}[k] - T_{out}[k] \tag{3.1} \]
or, equivalently, the TDC digital output can be represented as
\[ Out[k] = \frac{T_{in}[k] - T_{error}[k]}{T_q} = \left\lfloor \frac{T_{in}[k]}{T_q} \right\rfloor . \tag{3.2} \]
Figure 3.1: Concept of a TDC
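The relations (3.1) and (3.2) can be checked with a short numerical sketch. Times are expressed as integers in picoseconds to keep the arithmetic exact, the interval is taken as stop minus start, and the function name and the example values are assumptions made only for illustration.

```python
def ideal_tdc(t_start_ps, t_stop_ps, tq_ps):
    """Ideal TDC: returns (Out, Terror) for one measurement.

    Out is the number of whole reference periods Tq within the interval,
    and Terror is the residue that the quantization discards.
    """
    t_in = t_stop_ps - t_start_ps
    out = t_in // tq_ps              # floor(Tin / Tq), as in (3.2)
    t_out = out * tq_ps              # Tout = Out x Tq
    return out, t_in - t_out         # Terror = Tin - Tout, as in (3.1)


# A 137 ps interval measured with an assumed 20 ps reference period.
code, t_error = ideal_tdc(t_start_ps=0, t_stop_ps=137, tq_ps=20)
```

The residue always lies between zero and one reference period, which is why Tq sets the raw resolution.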
Since the raw TDC resolution is limited by Tq, a great deal of effort has been
made over the years to reduce it directly through technology advancement and
effectively by use of intelligent design techniques. A technique of precise time
measurement to measure on-chip jitter is presented in [65] where time is first
converted to voltage by a charge pump before digitization. Although this might
be an excellent solution for a particular technology, the architecture is analog-
intensive, not power-efficient and does not leverage the benefits of technology
scaling of modern CMOS which enables fine resolution of digital edges.
In contrast, TDCs designed in digital CMOS processes have benefited greatly
from process scaling, since reduced gate delays bring improved resolution and
also lead to compact and fully-integrated solutions. While intrinsic delay has
continued to decrease, the accuracy of delay also needs to improve for the
traditional TDC architectures to benefit from scaling. But with future CMOS
scaling, transistor and parasitic mismatch leading to increasing delay mismatch
is proving to be the bottleneck for many TDC architectures [66]. This has
necessitated the exploration of different architectures, namely oversampling,
oscillator-based and sub-sampling approaches, which are described in the rest of
this chapter.
3.2 TDC with gate-delay resolution
A classic TDC architecture comprising a chain of delay elements is shown in
Fig. 3.2 [49]. It works by counting the number of sequential inverter delays
that occur between the rising edges of start and stop, yielding a thermometric
code captured into a register at the rising edge of the stop signal; summing
this code yields the digital output. Although this simple architecture offers
moderate performance by using digital gates, increasing the dynamic range leads
to a linear increase in the number of delay elements, thereby increasing power
consumption and decreasing the maximum sampling rate.
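A behavioural sketch of such a delay-chain TDC is given below. It assumes ideal, perfectly matched delay elements and ideal flip-flops; the number of stages and the element delay are hypothetical values.

```python
def delay_line_tdc(t_in_ps, n_stages=8, d_ps=20):
    """Delay-chain TDC: returns (thermometer_code, digital_output).

    Tap i (counting from 1) carries the start edge delayed by i * d_ps.
    The flip-flop at that tap captures 1 if the delayed start edge has
    already arrived when the stop edge samples it.
    """
    code = [1 if i * d_ps <= t_in_ps else 0 for i in range(1, n_stages + 1)]
    return code, sum(code)


thermo, out = delay_line_tdc(65)           # taps at 20, 40 and 60 ps have fired
_, saturated = delay_line_tdc(1000)        # interval beyond the chain length
```

The second call shows the range limitation: once the interval exceeds `n_stages * d_ps`, the output saturates, so extending the range requires proportionally more delay elements.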
A simple improvement to overcome the limitation of dynamic range is to wrap
the end of the chain back to the beginning through a multiplexer as shown in
Fig. 3.3. With larger range, the core of this cyclic TDC does not scale up at all
while the counter size grows logarithmically. Asymmetry in the delay chain due
to the multiplexer degrades the differential non-linearity (DNL) while the
integral non-linearity improves due to reuse of the elements periodically.
Techniques to match the multiplexer delay with the delay element are explored
in [67].
Figure 3.2: Basic TDC with gate delay resolution
Figure 3.3: Basic TDC wrapped back as a ring
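The way a cyclic TDC combines a lap counter with the position of the edge inside the ring can be sketched as follows. The model is idealized: the multiplexer is assumed to add no delay, so the asymmetry discussed above is ignored, and the stage count and element delay are hypothetical.

```python
def cyclic_tdc(t_in_ps, n_stages=8, d_ps=20):
    """Cyclic (ring) TDC output for an interval given in picoseconds.

    laps: how many times the start edge has gone around the ring (counter).
    fine: how many elements it has passed in the current lap (register).
    """
    loop_ps = n_stages * d_ps
    laps = t_in_ps // loop_ps
    fine = (t_in_ps % loop_ps) // d_ps
    return laps * n_stages + fine


# A 1000 ps interval is far longer than one 160 ps lap of the ring.
out_cyclic = cyclic_tdc(1000)
```

The ring itself stays at eight elements regardless of the interval; only the lap counter needs more bits as the range grows.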
While the simple cyclic TDC improves the range, the resolution of an inverter
delay is limited by the process and although technology scaling improves the
intrinsic inverter delay, it will only worsen the mismatch of delay elements. As a
result, it is important to explore architectures which enable the inverter delay to
be divided into smaller measurable intervals.
3.3 TDC with sub-gate-delay resolution
Use of the Vernier delay technique [68] for improving the resolution of digital
CMOS TDC is well understood. As shown in Fig. 3.4, the idea is to delay
both the start and stop signals differently with delay chains; one chain with a
delay of D and the other with D′ = D − ∆ per element; so that the effective
resolution becomes Tq = ∆. But the problems of range limitation and sensitivity
to mismatch persist, and elaborate calibration techniques have been proposed to
reduce them. A technique of self-calibration to mitigate local process variations
of a 30-bit vernier chain, which generates delays in steps of 5 ps, is presented
in [64] for monitoring delay slack in-situ for high-performance processors, with an
intention to remove design margins due to PVT variations. An all-digital replica
technique to reduce non-linearity due to process variations is proposed in [69],
where measures of central tendency of multiple identical delay chain outputs are
shown to yield improved accuracy.
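The Vernier principle can be illustrated with a behavioural sketch in which the start edge travels through the slower chain (delay D per element) and the stop edge through the faster chain (delay D′ per element), so that the stop edge gains ∆ = D − D′ on the start edge at every stage. Which signal takes which chain, the element delays and the chain length are assumptions of this example, and mismatch is ignored.

```python
def vernier_tdc(t_in_ps, n_stages=16, d_ps=25, d_prime_ps=20):
    """Vernier TDC output: number of stages at which start still leads stop."""
    out = 0
    for k in range(1, n_stages + 1):
        start_arrival = k * d_ps                  # start edge, slow chain
        stop_arrival = t_in_ps + k * d_prime_ps   # stop edge, fast chain
        if start_arrival <= stop_arrival:         # flip-flop k captures a 1
            out += 1
    return out


# With D = 25 ps and D' = 20 ps the effective resolution is 5 ps.
out_vernier = vernier_tdc(17)
out_full_scale = vernier_tdc(200)
```

The resolution is finer than either element delay, but the measurable range is only `n_stages` times ∆, which illustrates the range limitation noted above.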
To reduce the size of practical Vernier TDC, various dual step architectures
based on a coarse-fine approach have been proposed. One such architecture is
to have a simple delay chain TDC (as in Fig. 3.2) followed by a higher resolution
Vernier TDC [70]. Another two step technique is to use the meta-stability
property of digital gates to amplify time error, and an improvement up to a
factor of 20 is reported [71], but the delay-amplifier needs to be calibrated to
accurately determine its gain. A cyclic architecture of vernier chains similar
to the one shown in Fig. 3.3 is presented in [72], wherein a dynamic range of 12
bits is reported with a resolution of 8 ps. Although it leads to an increase in
the dynamic range, it comes at the cost of complicated decoding logic and
calibration.
Figure 3.4: Vernier TDC with sub-gate delay resolution of D − D′ = ∆
Another technique to improve TDC resolution below that of a gate delay is
to interpolate between the input and output signals of a digital gate. This in-
terpolation may be done in an analog manner using a resistive divider; or in
a digital manner by having output signals driven by more than one delay ele-
ment, where the delay element inputs are staggered in time. The operation of
averaging/interpolation creates a new intermediate signal with a transition that
effectively divides the gate delay into two smaller intervals. All of the new signals
must be registered appropriately, which increases the size of the TDC [66].
For each of the TDC architectures described thus far, which are designed to
operate at the Nyquist rate, significant effort is required to reduce the TDC
resolution to less than a gate delay, at the expense of increased complexity,
area and/or mismatch. Although calibration generally improves resolution in the
presence of mismatch, its added complexity increases area and power consumption,
and it cannot always remove differential non-linearity errors [66].
3.4 Oversampling TDC Considerations
It is well known in the field of data converters that averaging the digital output
improves the SNR, provided the following conditions are satisfied:
• the input signal must be band-limited,
• the input has to be over-sampled corresponding to the number of samples
taken for averaging,
• the quantizer should be linear up to its resolution, and
• the input signal must be busy (and not DC) i.e., it must span at least an
LSB.
The last condition also leads to randomization of the quantization noise, making
it less dependent on the input, which can then be reduced by averaging [73].
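The role of the last condition can be illustrated with a small numerical sketch. It assumes an ideal rounding quantizer with a 1 LSB step and models a "busy" input by adding uniform dither of 1 LSB peak-to-peak; the function names, the input value and the sample count are illustrative choices, not taken from [73].

```python
import random

def quantize(x, lsb=1.0):
    """Ideal quantizer: round x to the nearest multiple of lsb."""
    return lsb * round(x / lsb)

def averaged_reading(x, n_samples, dither_span, rng):
    """Average n_samples quantized readings of x plus uniform dither.

    dither_span is the peak-to-peak dither width in LSBs;
    a span of 0 models a DC (non-busy) input.
    """
    total = 0.0
    for _ in range(n_samples):
        dither = rng.uniform(-dither_span / 2, dither_span / 2)
        total += quantize(x + dither)
    return total / n_samples

rng = random.Random(0)
x_true = 0.3  # input in LSBs, deliberately between two code levels

# Without dither every sample gives the same code, so averaging cannot help.
err_dc = abs(averaged_reading(x_true, 4096, 0.0, rng) - x_true)
# With a busy input the quantization error is randomized and averages out.
err_busy = abs(averaged_reading(x_true, 4096, 1.0, rng) - x_true)

print(f"DC input error:   {err_dc:.4f} LSB")
print(f"busy input error: {err_busy:.4f} LSB")
```

With the DC input the error stays at the full quantization residue however many samples are averaged, whereas the busy input lets the average converge towards the true value.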
From the discussion in the previous section, it is clear that we seek a TDC with
not only improved resolution but also robustness to mismatch, and we proceed to
examine whether oversampling can improve TDC performance. As described above,
for oversampling to improve performance, the quantization error should be
independent of the input signal and uniformly distributed over the quantization
step. In a closed-loop system, there are certain conditions in which the system
itself provides such scrambling of the TDC input, as in a fractional-N ∆Σ PLL.
However, many applications do not provide such dithering, leading to a situation
similar to the classic dead-zone in an analog phase detector, which is known to
cause erratic limit-cycle behavior in an integer-N PLL. One solution is to
intentionally modulate the TDC input with a noisy signal in order to randomize
the quantization error, and to subtract that signal from the output later.
Assuming, then, that the quantization error is uniformly distributed at all
frequencies, over-sampling a signal of bandwidth WB at an increased sampling
rate of fs makes the effective quantization noise power
$$\sigma^2 = T_q^2 \, \frac{W_B}{f_s}, \tag{3.3}$$
a reduction by a factor of fs/WB ≫ 1. While this reduction is impressive, the
error due to mismatch, which was earlier negligible, now becomes the bottleneck
for further improvement. A simple oscillator-based TDC is presented next, which
inherently scrambles the quantization error and mitigates mismatch through reuse
of delay elements, making it well-suited for oversampling applications [66].
3.5 Oscillator-based TDC
Fig. 3.5 illustrates the classical ring oscillator-based TDC composed of a ring of
delay elements [74], which shares many similarities with the cyclic TDC. The
oscillator transitions for both topologies are counted for a time window of Tm,
designated by the Enable signal. The key difference between the two is that in
the oscillator-based architecture, the starting phase of the oscillator is random,
which leads to the quantization error being uniformly distributed over the interval
[0, Tq]; whereas the starting phase in the cyclic TDC is always fixed at 0.
Figure 3.5: Classical oscillator-based TDC
In the oscillator-based TDC, since the quantization error is uncorrelated with
Tm, oversampling improves both resolution and mismatch. Mismatch is improved
because the delay elements which transition during Tm are selected uniformly at
random (since the starting phase of the oscillator is random), so their errors
are mitigated by oversampling and averaging [66].
Although oversampling with an oscillator-based TDC offers improved resolution
and mismatch, it comes at the cost of increased bandwidth and power. Averaging
improves the SNR by only 3 dB for each doubling of the oversampling rate, so
effectively reducing Tq by a factor of 2 (a 6 dB improvement) requires the
oversampling rate to increase by a factor of 4. Moreover, if the measurement
time Tm (when Enable is held high) is a small fraction of the oscillator period,
the transitions happening continuously in the oscillator lead to wasted power.
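The effect of the random starting phase can be seen in a minimal behavioural sketch. It assumes an idealized oscillator that produces one transition every Tq with no mismatch or jitter; the function names and the numerical values (Tm = 103.7 ps, Tq = 10 ps) are hypothetical.

```python
import math
import random

def count_transitions(t_m, t_q, start_phase):
    """Transitions seen in a window of length t_m.

    start_phase is the time elapsed since the last transition when the
    window opens, so transitions occur at t_q - start_phase, 2*t_q - start_phase, ...
    """
    return math.floor((start_phase + t_m) / t_q)

def averaged_estimate(t_m, t_q, n_meas, rng=None):
    """Average of n_meas readings, converted back to time units.

    rng=None fixes the starting phase at 0 (cyclic-TDC-like behaviour);
    otherwise the phase is drawn uniformly from [0, t_q).
    """
    total = 0
    for _ in range(n_meas):
        phase = rng.uniform(0.0, t_q) if rng is not None else 0.0
        total += count_transitions(t_m, t_q, phase)
    return t_q * total / n_meas

t_m, t_q = 103.7, 10.0  # picoseconds, illustrative values
fixed = averaged_estimate(t_m, t_q, 10_000)
randomized = averaged_estimate(t_m, t_q, 10_000, rng=random.Random(1))

print(f"fixed phase:  {fixed:.2f} ps")
print(f"random phase: {randomized:.2f} ps")
```

With a fixed starting phase every reading truncates the same residue, so averaging gains nothing; with a uniformly random phase the average approaches Tm.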
Fig. 3.6 illustrates the concept of a gated ring oscillator (GRO) TDC, which is
similar to cyclic and oscillator-based TDC in the sense that the number of delay
element transitions during a measurement interval Tm are counted. In this GRO-
TDC, the ring oscillator is also gated in addition to the counters with the Enable
signal, thereby preserving the state of the oscillator between measurements [75].
By preserving the oscillator state at the end of the measurement interval
Tm[k−1], the quantization error Terror[k−1] from that measurement is also
preserved. As a result, the previous quantization error Terror[k−1] is carried
over as the starting phase of the oscillator. This results in first-order noise
shaping of the quantization error.
Figure 3.6: Gated Ring Oscillator TDC
In the GRO-TDC, the delay mismatch is also first-order shaped, in addition to
the quantization error, since the switching delay elements shift over successive
measurement intervals. This is similar to the barrel-shift algorithm for dynamic
element matching, which is well known to reduce DNL in data converters [76]. As
a result, the SNR of the GRO-TDC improves by 9 dB for each doubling of the
sampling rate (as is well known for first-order noise-shaping data converters
[77]), which is a significant improvement over the 3 dB obtained from an
oscillator-based TDC.
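The carry-over of the residue can be sketched with an idealized model in which the oscillator phase advances only while it is enabled. The model below ignores delay mismatch and gating non-idealities, and the function name and numerical values are assumptions made for illustration.

```python
import math

def gro_counts(intervals, t_q, initial_phase=0.0):
    """Counts reported by an idealized gated ring oscillator TDC.

    The accumulated enabled time is never reset, so the residue of one
    measurement becomes the starting phase of the next.
    """
    counts = []
    elapsed = initial_phase
    for t_m in intervals:
        before = math.floor(elapsed / t_q)
        elapsed += t_m
        counts.append(math.floor(elapsed / t_q) - before)
    return counts

t_q = 10.0                  # picoseconds, illustrative
intervals = [103.7] * 1000  # the same interval measured repeatedly
counts = gro_counts(intervals, t_q)
mean_estimate = t_q * sum(counts) / len(counts)

print(sorted(set(counts)))           # individual readings toggle between codes
print(f"{mean_estimate:.3f} ps")     # the average closes in on the true value
```

Because the individual errors telescope, the error of the sum never exceeds one Tq, so the error of the average falls as 1/N rather than the 1/√N obtained by averaging uncorrelated errors.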
Each of the TDC architectures discussed so far is aimed at obtaining increased
resolution, either by calibration or by oversampling coupled with noise shaping.
Although they offer impressive solutions for stand-alone TDCs, simpler
architectures are desired when the number of test nodes scales to a very large
number, such as the number of leaf nodes in a clock distribution network, which
is described next.
3.6 Sub-sampling Approach
Fig. 3.7 illustrates a typical clock distribution network (CDN), wherein a clock
signal from a source is routed to many points called leaf nodes, typically flip-flops
or storage elements, through a buffer structure most commonly connected in a
tree fashion. Various buffers are inserted in the distribution path to ensure signal
integrity at the leaf nodes. Technology scaling in accordance with Moore’s law
and innovations in manufacturing have led to smaller and faster transistors on one
hand, but have also increased variability between transistors on the other. Designs
employing faster clocks warrant tighter timing budgets but increased variability
translates as clock skew at the leaf nodes, and is eating into the already tightened
timing margins [1, 78].
As a result, a technique to measure the relative skew between a pair of leaf
nodes in-situ will be of great value in studying and characterizing skews, as
well as potentially enabling a closed-loop design to reduce the skew. Since the
quantity of interest is the skew between the clock signals at a pair of leaf
nodes, and this information (skew/delay) rides on a periodic signal, the
technique of sub-sampling can be employed. Such a sub-sampling technique also
greatly simplifies the implementation of the delay measurement unit (DMU) that
follows, since the components needed (and thereby the area occupied) are
independent of the resolution to be achieved [79].
Figure 3.7: Illustration of a typical Clock Distribution Network with various
components contributing to clock skew. Courtesy: Pratap Kumar Das [1]
Figure 3.8: Illustration of Sub-sampling approach
Figure 3.9: Timing diagram
In this approach of sub-sampling, the sampling rate is lower than the Nyquist
rate by a factor of about two or more, which means that the full signal cannot
be reconstructed. However, if the parameter of interest can be made periodic
with a known frequency, this approach can still be used to recover it. For
example, an on-chip analog oscilloscope is presented in [17], where a high
frequency periodic analog signal is sub-sampled and the samples are digitized.
The sampling clock has a frequency slightly less than that of the input signal
to be displayed, so that the original signal becomes time expanded, thereby
significantly reducing the required bandwidth of the ADC that follows. The
authors in [18] use the sub-sampling technique to display the bit-lines of SRAM
cells by artificially introducing periodicity. An on-chip waveform capturer with
8.8 bits accuracy and 15 ps time accuracy is demonstrated in [80], in which
offsets and slopes of the voltage of a digital-to-analog converter are linearly
translated to generate and extract timing information. The sub-sampling approach
to measuring the skew/delay between a pair of nodes is briefly described next.
Consider the clock signals C1 and C2 at a pair of leaf nodes in Fig. 3.7, both
of period T, with a skew/delay of d between them. Let both the clocks be sampled
by another clock of period T + ∆T, as shown in Fig. 3.8. It is important to
note that the skew in the sampling clock reaching DFF1 and DFF2, and the
mismatch between the two flops, contribute directly to errors in the
measurement. However, since such a mismatch is essentially a static offset, it
can be mitigated by single-point calibration. Fig. 3.9 shows the timing diagram,
which demonstrates the ‘amplification’ of the time period T and the skew d by a
factor of T/∆T. Furthermore, because the amplified delay is synchronous to the
sampling clock, it can be measured using just an up/down counter, which counts
up when S1 = 1 and S2 = 0, and
down when S1 = 0 and S2 = 1.
Such an estimator is shown to yield an unbiased estimate of the delay in
Appendix A. One caveat is that the falling edge skew must be eliminated in order
for the simple up/down counter to yield correct results. This state machine is
implemented on an FPGA, as described in Section 4.4.1.
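The counting rule above can be tried out in a small jitter-free sketch. It is not the FPGA state machine: it works directly on the sampling phase, uses integer time units, and assumes the falling-edge skew has already been eliminated, so that S1 and S2 differ only around the rising edge. The function name and the numerical values are hypothetical.

```python
def subsample_skew(period, delta, skew, n_samples):
    """Estimate the skew between two clocks of the given period.

    All times are integers in one unit (e.g. ps). The sampling clock has
    period `period + delta`, so sample k lands at phase (k * delta) % period.
    Falling edges of S1 and S2 are modelled as aligned (assumption).
    """
    half = period // 2
    if not -half < skew < half:
        raise ValueError("skew must be smaller than half a period")
    count = 0
    for k in range(n_samples):
        phase = (k * delta) % period
        s1 = phase < half
        if skew >= 0:
            s2 = skew <= phase < half                      # C2 rises late
        else:
            s2 = phase < half or phase >= period + skew    # C2 rises early
        if s1 and not s2:
            count += 1   # count up
        elif s2 and not s1:
            count -= 1   # count down
    # The net count per sample is the skew as a fraction of the period.
    return count * period / n_samples

late = subsample_skew(period=1000, delta=1, skew=37, n_samples=5000)
early = subsample_skew(period=1000, delta=1, skew=-25, n_samples=5000)
print(late, early)
```

The sample count is chosen here as a whole number of beat periods so that every sampling phase is visited equally often; with jitter present, averaging over many beat periods plays the same role.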
3.7 Conclusions
The various techniques of time measurement discussed in this chapter are sum-
marized in Table 3.1 at a conceptual level, while the exact numbers of earlier
reported works are presented in [66]. The application of time measurement tech-
niques discussed in this chapter, especially the sub-sampling approach, to the
problem of measuring analog voltages for BIST is described in the next chapter.
Table 3.1: Comparison of various time measurement architectures
| Specification | Basic TDC [49] | Vernier TDC [49] | Vernier Ring TDC [49] | VCO-based TDC [52] | DMU based on sub-sampling [62] |
|---|---|---|---|---|---|
| Resolution | Inverter delay, D | Inverter difference delay, ∆ (Fig. 3.4) | Inverter difference delay, ∆ | Inverter delay D | Limited by measurement time |
| Measurement time (for b bits) | 2^b inverter delays ∼ 2^b D | 2^b D | > 2^b D·2 (due to circling) | at least 2^b D | 2^b (T+∆T) |
| Dynamic range available (ratio of maximum to minimum measurable delays) | nD/D = n | n∆/∆ = n | No upper limit on measurable delay | No upper limit on measurable delay | < T/σ, where σ is the standard deviation of the smallest measured delay |
| Mismatch | LSB directly affected | LSB directly affected | Reduced due to reuse of inverter stages | Reduced due to reuse of inverter stages | Appears as offset which can be calibrated out |
| Quantity output for an input delay d | ⌊d/D⌋ | ⌊d/∆⌋ | ⌊d/∆⌋ | ⌊d/D⌋ | d/T |
| Information needed to obtain absolute delay (in ps) | D, average inverter delay | ∆, average inverter difference delay | ∆, inverter difference delay | D, average inverter delay, or time period of VCO | Time period T |
| Components required (for b bits) | 2^b inverter stages and 2^b flops | 2^(b+1) inverter stages and 2^b flops | 2 inverter stages, a flop and a counter at frequency of 1/(2D) | 2^b inverters and a counter at VCO frequency | 2 flops and b-bit up/down counter |
| Nature of measurement | Single-shot | Single-shot | Single-shot | Single-shot | Periodic (requires test delay to be present on a periodic signal) |
Chapter 4
Proposed Architecture
A solution to the problem defined in Chapter 2 is discussed here. The nov-
elty of the solution is to convert voltage information into time delay information
and measure it all-digitally, which is well suited for a distributed architecture
amenable to multiple test nodes.
4.1 Proposed Solution
The solution proposed here is a modified and extended version of the one shown
in Fig. 2.3. As shown in Fig. 4.1, sampling heads are placed at each test node,
as before, in order to avoid routing analog signals over long paths. Now,
however, each sampling head consists of a pair of flip-flops (DFF) in addition
to a pair of identical delay cells (V2D), as shown in Fig. 4.1(b). A clock
signal is routed serially to all the sampling heads and is fed to both the delay
cells in each sampling head. The delay of one element of the pair is controlled
by the analog voltage VAi, and that of the other by a reference voltage Vref.
Thus, a voltage difference between the node voltage and the reference shows up
as a delay difference in the clocks at the output of the delay cell pair. This
pair of clocks is sampled by a slightly slower sampling clock, giving rise to a
pair of beat frequency signals. We call them the sub-sampled signals, and the
delay between them is ‘amplified’ by this process of ‘sub-sampling’ [79]. Hence,
there will be as many pairs of sub-sampled signals as there are test nodes. To
measure a certain test node, the corresponding sub-sampled signal pair has to be
fed to the DMU by applying the appropriate select signal to the multiplexer.
Also, the design of a sampling head does not depend upon the number of test
nodes desired, giving the advantage of scalability with respect to the number of
test nodes.
(a) Proposed Scheme for Analog BIST. Connections are indicated with dots; wires
that merely cross are not to be treated as connections. (n = log2(N))
(b) Sampling head (SpH)
Figure 4.1: Proposed Architecture for Analog BIST
As can be seen from Fig. 4.1, both the input clock (clk) and sampling clock
(samp clk) are ‘picked-up’ from a single point for each sampling head. Hence,
cross-talk and coupling noise which may affect the clocks do not contribute to ad-
ditional noise in the sampling head circuitry. Also, the output sub-sampled signal
pair of the sampling head are low-frequency signals and the delay between them
is already amplified by the ‘sub-sampling’ process, which makes the sub-sampled
signal pair also immune to cross-talk and coupling noise. This technique of ‘sub-
sampling’ provides bandwidth-resolution trade-off, i.e., measurements requiring
coarser resolutions can be done faster whereas finer resolution measurements need
more time.
It is not mandatory that Vref of Fig. 4.1(a) be the same amongst all IPs. If it is
of interest to measure voltage difference between two voltages in the same IP, Vref
can be replaced by that voltage. Such a situation arises when a programmable
current source is employed to achieve current matching in the presence of vari-
ations. Otherwise, Vref can just be grounded. Ground bounce is not a concern
as the technique presented performs averaging over the measurement time deter-
mined by the settings.
4.1.1 Measurement Procedure
(a) Calibration
Because of the possible non-linearity of the delay cells, they need to be
calibrated a priori. The delay cell pair of sampling head SpHi, corresponding
to VAi, is calibrated as follows. MUXcal (Fig. 4.1) is set high so that the
calibration voltage is fed to one of the delay cells (instead of the local node
voltage), while the other delay cell gets the reference voltage. MUXsel is set
to a value such that the multiplexer selects the sub-sampled pair corresponding
to SpHi and feeds it to the DMU.
Suppose gi1(·) and gi2(·) are the voltage-to-delay functions of the two delay
cells of sampling head SpHi. Then the delay difference out of this sampling
head, ∆Di, is given by
$$\Delta D_i = g_{i1}(V_{cal}) - g_{i2}(V_{ref})$$
But we are more interested in ∆Vi = Vcal − Vref. So, we define a function
fi(·) mapping ∆Vi to ∆Di as
$$\Delta D_i = f_i(V_{cal} - V_{ref}) = f_i(\Delta V_i) \tag{4.1}$$
The calibration step measures this (potentially non-linear) function fi(·) at a
few points, which are used later to correct for non-linearity and bias.
Calibration also helps in mitigating mismatches, if any, between the delay cell
pairs. The delay at the input of the DMU is given as:
$$\Delta D_{DMU} = \Delta D_i + \Delta D_{residual} \tag{4.2}$$
where ∆Dresidual is the delay difference in the clock pair accrued in the rest
of the path. This quantity is independent of the voltage at node i and hence can
be easily calibrated out. The stability of this calibration over time is
described in Section 4.8.
(b) Measurement
During the measurement process, MUXcal (Fig. 4.1) is set low. To measure
VAi, the corresponding sub-sampled signal pair is selected by the multiplexer.
Thus, the delay cell pair of sampling head SpHi creates a delay differential
given as:
$$\Delta D_i = g_{i1}(V_{Ai}) - g_{i2}(V_{ref}) \tag{4.3}$$
$$\Delta D_i := f_i(V_{Ai} - V_{ref}) \tag{4.4}$$
The input delay difference at the DMU is as given in (4.2). From the calibration
data, VAi can be inferred directly or by interpolation.
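The inference step can be sketched as a lookup table inverted by linear interpolation. This is only one possible realization: it assumes the calibration curve is monotonic, and the function names and calibration numbers are hypothetical. Since the table is built from DMU readings, the residual delay of (4.2) is already contained in every entry and needs no separate handling here.

```python
from bisect import bisect_left

def build_lut(cal_voltages, measured_delays):
    """Pair each calibration voltage with the DMU delay it produced,
    sorted by delay (monotonic calibration curve assumed)."""
    return sorted(zip(measured_delays, cal_voltages))

def infer_voltage(lut, delay):
    """Invert the calibration curve by linear interpolation between the
    two nearest calibration points."""
    delays = [d for d, _ in lut]
    if not delays[0] <= delay <= delays[-1]:
        raise ValueError("delay outside calibrated range")
    i = bisect_left(delays, delay)
    if delays[i] == delay:
        return lut[i][1]            # exact calibration point
    (d0, v0), (d1, v1) = lut[i - 1], lut[i]
    return v0 + (v1 - v0) * (delay - d0) / (d1 - d0)

# Hypothetical calibration run: voltages in mV, DMU delays in ps.
lut = build_lut([0, 25, 50, 75, 100], [40.0, 310.0, 620.0, 980.0, 1400.0])
print(infer_voltage(lut, 465.0))   # lies between the 25 mV and 50 mV points
```

A denser calibration grid reduces the interpolation error where the curve is strongly non-linear, at the cost of a longer calibration run.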
4.2 Voltage-to-Delay Conversion
Voltage-to-delay conversion is the process of sampling an analog voltage and
converting it into an analog time-difference (on a clock), as shown in Fig. 4.1(b).
A simple technique of voltage-to-time conversion is used in a digital voltmeter,
where the time taken by a negative-ramp signal to reach zero from an unknown
input voltage is monitored by a counter, which produces a digital display accord-
ing to the level of the input voltage signal [81]. Use of alternating voltage-to-time
and time-to-voltage conversions in the design of ADCs is shown to provide natu-
ral error correction due to comparator offset and delay, 1/f noise and switching
charge-injection [82].
The various strategies for voltage-to-delay conversion are as follows:
• Direct Voltage-controlled
• Direct Current-controlled
• Current-starved inverter-based
A comparison between direct voltage-control and current-starved inverter-
based strategies is provided in [83], from which it is evident that while direct
voltage-control strategy can yield large sensitivity of delay to voltage and better
linearity over a wider voltage-range, it occupies more area and consumes higher
power and can go up only to medium frequencies. In [84], a differential design
of a voltage-to-time converter is presented, where the charging time of a pair of
capacitors to (differential) control voltage manifests as the output delay. Such
a voltage-to-time converter is applied in receiver equalization to mitigate inter-
symbol interference (ISI) in mesochronous links, and an overall linearity of 4.3
bits (5.5 bits linearity in a dynamic range of 600mV in simulation) is reported.
The authors in [85] have presented a novel linearization scheme for a
voltage-to-pulse-delay-time converter, suitable for analog-to-digital
converters, based on current-starved inverters. The linearity error is
demonstrated to be less than 2% over a dynamic range of 200 mV in simulation. A
design of a programmable voltage-to-time converter based on current-starved
inverters is described in [86], where the delay is programmed by controlling the
bias of a MOS capacitor. A linearity of 3.7 bits and an estimated power
consumption of 3.6 mW are demonstrated in an STMicroelectronics 90 nm CMOS
process. The architecture used is such that, although each voltage-to-time
converter delays only the falling edge, a pair of such converters delays both
edges so as to keep the duty cycle unchanged at the output. However, no
particular advantage of delaying both edges is pointed out, and hence dropping
the second voltage-to-time converter saves area. A summary of the listed
techniques is provided in Table 4.1.
4.2.1 Sample-and-Hold Action
Sample-and-hold circuits are needed in ADCs when the input voltage is a
relatively high-frequency signal with respect to the ADC conversion time. In
V2D converters, however, the edge of the probe clock propagating through the V2D
gets delayed by an amount dictated by the control voltage. As a result, the
control voltage is sampled at each rising edge of the probe clock. Hence, such a
V2D may act as a sampler provided its conversion rate is sufficiently greater
than the input frequency. The condition under which a sample-and-hold circuit
may be omitted is derived as follows. Let the input sinusoid be
$$V_{in} = A \sin(2\pi f_{in} t) \tag{4.5}$$
where A is the amplitude and fin is the signal frequency. The maximum slope of
this signal, which occurs at its zero crossing, is given by
$$\left.\frac{dV_{in}}{dt}\right|_{max} = 2\pi A f_{in} \tag{4.6}$$
Let TC be the conversion time of the technique. Then, the input signal should
not change by more than one LSB in time TC to avoid errors. This imposes an
upper limit on the maximum slope of the signal and hence on the frequency, so
that
$$\left.\frac{dV_{in}}{dt}\right|_{max} = 2\pi A f_{in} \le \frac{V_{LSB}}{T_C} \tag{4.7}$$
The LSB voltage of an n-bit ADC is given by
$$V_{LSB} = \frac{2A}{2^n - 1} \tag{4.8}$$
Table 4.1: Comparison of various voltage-to-time conversion techniques
| Metric | Direct Voltage-Control [83] | Differential Voltage-Control [84] | Current-Starved Inverter [83] | Current-Starved Inverter [86] | Current-Starved Inverter, Fig. 4.2 | Pair of Current-Starved Inverters, Fig. 7.3b | Pair of Current-Starved Inverters, Fig. 7.3a |
|---|---|---|---|---|---|---|---|
| Linearity Error | ±0.1% | 4 bits | ±0.15% | 3.7 bits | 5.29 bits | 3.91 bits | 4.23 bits |
| Maximum Sampling Frequency | 100 MHz | 6.25 GHz | 1.1 GHz | 5 GHz | 500 MHz | 200 MHz | 200 MHz |
| Technology Node | 180 nm | 90 nm | 180 nm | 90 nm | 130 nm | 130 nm | 130 nm |
| VDD | 1.8 V | 1.2 V | 1.8 V | 1.2 V | 1.2 V | 1.2 V | 1.2 V |
The remaining rows have fewer entries than there are columns; their entries are
listed in the original left-to-right order.
Voltage-to-Time Conversion Factor: 7.2 ns/V, −320 ps/V, 0.5 ns/V, 15.3 ns/V, 4.65 ns/V, 5.95 ns/V
Linear Input Voltage Range: 800 mV (0.6 V → 1.4 V), 400 mV (0.8 V → 1.2 V), 100 mV, 100 mV, 600 mV (0.4 V → 1.0 V), 600 mV (0 → 0.6 V)
Power Consumption: 3.3 mW, 7.5 mW, 136 µW, 3.6 mW
Area: 480 µm², 14.5 µm², 6800 µm², 69.72 µm², 294 µm², 260 µm²
Combining and rearranging (4.7) and (4.8), the limit on the input sine frequency
for error-free conversion is given by
$$f_{in} \le \frac{1}{\pi (2^n - 1) T_C} \tag{4.9}$$
The inequality (4.9) dictates whether or not a sample-and-hold is needed for the
desired resolution n, conversion time TC and input signal bandwidth fin.
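As a quick numerical check of (4.9), the bound can be evaluated for a few settings. The function name and the example values (8 bits, a conversion time of 125 ns) are assumptions chosen for illustration, not figures from the measured system.

```python
import math

def max_input_frequency(n_bits, t_conv):
    """Upper bound on the input sine frequency in Hz from (4.9):
    f_in <= 1 / (pi * (2**n - 1) * T_C), with t_conv in seconds."""
    return 1.0 / (math.pi * (2 ** n_bits - 1) * t_conv)

f_8bit = max_input_frequency(n_bits=8, t_conv=125e-9)
f_6bit = max_input_frequency(n_bits=6, t_conv=125e-9)
print(f"8-bit limit: {f_8bit / 1e3:.2f} kHz")
print(f"6-bit limit: {f_6bit / 1e3:.2f} kHz")
```

The bound relaxes by roughly a factor of two for every bit of resolution given up, and scales inversely with the conversion time.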
4.2.2 Implementation
Targeting a measurement range of 0 to 100 mV, PMOS-controlled current-starved
inverters are used, as shown in Fig. 4.2. However, alternative delay cell
architectures could be used for other applications as the specifications
demand. The area of the pair of delay cells chosen for this application and
taped out is 8.2 µm × 8.4 µm.
As is evident from the circuit, the voltage influences only the delay of the
rising edge, while the delay of the falling edge is uncontrolled. Hence, for an
input clock of period T and duty ratio D, the range of the system is the value
of Vin which gives an absolute delay of Dr, where Dr = D × T is the maximum
rising edge delay possible. For instance, for a clock period of 125 ns and a
duty ratio of 0.5, if 120 mV gives a delay of 62.5 ns, then the range of the
system is 120 mV.¹ This range can therefore be increased by increasing the duty
ratio, the input clock period, or both.
4.3 Design Considerations
With the chosen sizing of the delay cell circuitry, the capacitance between the
analog voltage node and the input clock is about 2 fF. With a decoupling
capacitor of 4 pF, the kickback will be less than 0.6 mV. For smaller kickback,
either the decoupling capacitor has to be increased or cascoding has to be
implemented. With a transistor of gain 20 in cascode, the decoupling capacitor
can be as small as 0.1 pF.
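The 0.6 mV figure is consistent with a simple capacitive-divider estimate, sketched below. The divider model and the 1.2 V clock swing are assumptions made here for illustration; the text does not state how the figure was obtained, and the cascode case is not modelled.

```python
def kickback_voltage(c_couple, c_decap, v_swing):
    """Voltage step on the test node when a clock edge of amplitude v_swing
    couples through c_couple onto a node held by c_decap
    (plain capacitive divider, assumed model)."""
    return v_swing * c_couple / (c_couple + c_decap)

# 2 fF coupling, 4 pF decoupling, assumed 1.2 V clock swing.
dv = kickback_voltage(c_couple=2e-15, c_decap=4e-12, v_swing=1.2)
print(f"kickback: {dv * 1e3:.3f} mV")
```

Under this model the kickback scales almost inversely with the decoupling capacitor, which is why a larger capacitor is the first remedy mentioned above.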
Since the delay of every cell is sensitive to the supply voltage, variations in
supply voltage directly impact the voltage measurement. The power supply will
have a distribution profile across the chip. This profile will get calibrated
out provided the power supply does not change too much with time.
Figure 4.2: Schematic Circuit of current-starved Voltage to Delay cell (V2D)
¹ The actual dynamic range will be slightly less owing to the margin required by
the falling edge skew eliminator algorithm.
To combat a time-varying supply voltage, one solution is to use delay cells with
a good power supply rejection ratio, or to use a regulated power supply.
Placing a transistor in cascode also helps to mitigate the effect of power
supply noise on delay.
The measurement of bias voltages is heavily dependent on Vcal for the cali-
bration and interpolation. Hence, the generation of Vcal will have to be accurate
and has to be shielded well so that noise coupled onto Vcal will not impact the
measurement.
The noisy currents of the voltage-to-delay converter contribute to jitter on
the clocks. If N delay cells are used in cascade to generate the delay
difference, and the jitter added by each is assumed to be independent of the
others, the jitter grows as √N while the total delay grows as N. Hence, from a
noise perspective, it is advantageous to employ more delay stages. However, this
leads to bandwidth limitation¹ and increased kick-back to the analog test node.
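The scaling argument can be written out explicitly. As a sketch, assume each cell contributes the same delay sensitivity $k$ (delay per unit voltage) and an independent jitter of standard deviation $\sigma_1$; both symbols are introduced here purely for illustration.

```latex
\begin{aligned}
\Delta D_N &= N\,k\,\Delta V, \qquad \sigma_N = \sqrt{N}\,\sigma_1,\\
\frac{\Delta D_N}{\sigma_N} &= \sqrt{N}\,\frac{k\,\Delta V}{\sigma_1}
\quad\Longrightarrow\quad
\Delta V_{\min} \propto \frac{\sigma_1}{\sqrt{N}\,k}.
\end{aligned}
```

Under these assumptions, the smallest voltage difference that stands out above the cell jitter shrinks as $1/\sqrt{N}$, so quadrupling the number of stages halves it.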
¹ Bandwidth is reduced since the test voltage should not change too much while
the clocks are propagating through the delay stages.
For a given measurement time, the resolution with which the delay difference
can be measured is, say, 6σ (where σ is the standard deviation of the measured
delay values). Then, based on the desired voltage resolution, the required
voltage-to-delay ratio is calculated. Based on the total delay required, one can
choose the number of delay stages needed.
For example, suppose that a resolution of 10 ps can be achieved in a given
measurement time and that the desired voltage resolution is 1 mV. Then the
voltage-to-delay converter has to give a differential delay of 10 ps/mV. Suppose
a single delay cell is designed to provide this delay. If, on the other hand,
0.1 mV resolution is desired with the same set-up and measurement time, ten such
delay cells can be used. But, as stated before, the bandwidth of the signal that
can be measured then reduces by a factor of ten, and the kick-back to the test
voltage increases.
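The sizing procedure in this example amounts to two divisions, sketched below. Like the example itself, the sketch treats the achievable delay resolution as fixed for a given measurement time and ignores the extra jitter of additional stages; the function and parameter names are hypothetical.

```python
import math

def stages_needed(delay_resolution_ps, voltage_resolution_mv,
                  cell_sensitivity_ps_per_mv):
    """Return (required sensitivity in ps/mV, number of cascaded cells)."""
    required = delay_resolution_ps / voltage_resolution_mv
    # Small tolerance so that an exact ratio is not pushed up by rounding.
    n_cells = math.ceil(required / cell_sensitivity_ps_per_mv - 1e-9)
    return required, max(n_cells, 1)

print(stages_needed(10, 1.0, 10))   # 1 mV target with a 10 ps/mV cell
print(stages_needed(10, 0.1, 10))   # 0.1 mV target with the same cell
```

The second call reproduces the ten-cell case of the example; the accompanying bandwidth and kick-back penalties are not captured by this calculation.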
4.4 Hardware Implementation Details
Fig. 4.3 shows the overall block diagram of the implemented system to evaluate
the concept. It consists of two components, the voltage-to-delay circuitry and
sampling flops (making up a sampling head) on-chip and DMU implemented in
a Virtex II development board. The setup is geared to measure a single analog
voltage, shown as the pin named ‘Vin’ in the figure. It is used first for calibration
(by feeding known voltages) and then to measure test voltages. Likewise, only
the DMU part of the control unit of Fig. 4.1(a) is implemented on the FPGA.
A provision is made to select the output of either a single voltage-to-delay
cell or a series of 13 such cells, giving a handle on the voltage-to-delay
sensitivity.
4.4.1 Sub-sampling Based Delay Measurement Unit (DMU)
The V2D cells set up a delay between the clocks at nodes Dai and Dbi, which
has to be measured. Since we want to measure this delay digitally, we sample the
clock pair with another clock. One possibility is to use a sampling clock
frequency much larger (say about 100×) than that of the clock pair. Although
this sounds reasonable in theory, practical measurements showed that the
standard deviation of the measured delay, and therefore of the voltage, was too
high. A sampling frequency which is slightly less or slightly more than that of
the clock pair is used here instead, as per the sub-sampling approach introduced
in Section 3.6.
Figure 4.3: Block Diagram of Implemented Set-up.
Referring to Fig. 4.3, if the frequencies of sampling clock and probe clock pair
are rationally related (which happens when one of them is derived from the other),
then the resolution of measurement is lower bounded by a non-zero quantity, de-
termined by the parameters T and ∆T. This means that, in spite of increasing the
measurement time, the accuracy (standard deviation) of measurement does not
improve beyond a certain value. In such cases, adding some additional jitter onto
the clocks, by way of frequency modulation for instance, improves the resolution.
This phenomenon is also reported in a similar setup [62].
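One way to see why rationally related clocks put a floor on the resolution is to enumerate the phases at which the sampling clock can land within a probe period. The sketch below is jitter-free, and the two frequencies are chosen only for illustration; the function name is hypothetical.

```python
from fractions import Fraction

def distinct_sampling_phases(f_probe_hz, f_sample_hz, n_samples=10_000):
    """Set of phases, as exact fractions of the probe period, visited by
    the first n_samples edges of the sampling clock."""
    ratio = Fraction(f_probe_hz, f_sample_hz)   # T_sample / T_probe
    return {(k * ratio) % 1 for k in range(n_samples)}

phases = distinct_sampling_phases(8_000_000, 7_600_000)
step = sorted(phases)[1]        # spacing between neighbouring phases
print(len(phases), step)
```

However long the measurement runs, only this finite set of phases is ever visited, so edges closer together than one phase step cannot be told apart without added jitter or frequency modulation.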
On the other hand, when the frequencies of sampling and probe clock pair
are irrationally related, there is no such fundamental limit on resolution. But, it
comes with the rider that this sampling clock has to be generated from a separate
crystal. For this particular application, the tester can be made to feed this second
clock signal. In this setup, the jitter of the clocks comes in as a hindrance and is
mitigated by averaging.
Referring to Fig. 4.4, the input clock pair Dai and Dbi (of period T) is sampled
by an asynchronous¹ sampling clock of period² T+∆T. As a result, the two
outputs will be beat clocks with period Tb = (T+∆T)×T/∆T (the period of ideal
Qa and Qb in Fig. 4.4, marked T1 there). In other words, there is a time
“amplification” by a factor T/∆T. Hence, the skew δ between these two clocks,
which we intend to measure, is amplified by the same factor, giving
Tsk = (T+∆T)×δ/∆T, shown as the skew between ideal Qa and ideal Qb. Due to
jitter on the clocks and meta-stability issues of the samplers, the sampled
outputs will be bouncy, as shown in the waveforms for Qa and Qb. Hence, the skew
has to be estimated by averaging the delay difference between the rising edges
of Qa and Qb across many instances. Since Qa and Qb are synchronous to the
sampling clock, their delay difference will always be some multiple of the
sampling clock period, and hence a simple up/down counter suffices to estimate
this difference. Further, the same counter can be used for averaging across
multiple periods of Qa and Qb. Such an averaging yields an unbiased estimate of
the skew as a fraction of the clock period T [62], the proof of which is
described in Appendix A for completeness. Finally, we would like to ignore the
falling edge related bounces, and hence we generate the clean clocks CQa and
CQb, which are fed to the up/down counter as shown in Fig. 4.4. The DMU has an
equivalent gate count of 414 NAND2 gates.
¹ Asynchronous here means of different frequency and preferably different phase.
² Sampling clocks of period T±n∆T (n, integer) can also be used.
Figure 4.4: Illustration of timing diagram and the concept of Sub-sampling.
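The expressions for Tb and Tsk can be evaluated numerically as a sanity check. The function name is hypothetical; the probe and sampling frequencies match one of the settings listed in Table 4.2, while the 100 ps skew is an arbitrary illustrative value.

```python
def beat_parameters(f_probe_hz, f_sample_hz, skew_s):
    """Beat period Tb and amplified skew Tsk for a probe clock of period T
    sampled by a clock of period T + dT (all times in seconds)."""
    t = 1.0 / f_probe_hz
    t_s = 1.0 / f_sample_hz          # T + dT
    delta = t_s - t                  # dT
    t_beat = t_s * t / delta         # Tb  = (T + dT) * T / dT
    t_skew = t_s * skew_s / delta    # Tsk = (T + dT) * skew / dT
    return t_beat, t_skew

t_beat, t_skew = beat_parameters(8.0e6, 7.99e6, 100e-12)
print(f"beat period:    {t_beat * 1e6:.1f} us")
print(f"amplified skew: {t_skew * 1e9:.1f} ns")
```

Note that Tb equals 1/(fp − fs), so bringing the two frequencies closer together raises the amplification at the cost of a proportionally longer beat period, which is the bandwidth-resolution trade-off of the technique.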
4.4.2 Generation and Routing of Clocks
As described in 4.4.1, this approach needs two clocks – a probe clock and a
sampling clock, of slightly different frequencies. Generation and routing of such
close frequency clocks can be challenging due to the phenomenon of injection
locking. In the lab, we used a pair of signal generators independently to provide
these two clocks. In a real-world BIST scenario, the tester can provide one of the
clocks while the other can be generated on-chip. As described in 4.4.1, using a
sub-harmonic of the sampling clock makes practically no difference to this
sub-sampling setup, and hence its use eliminates the issue of injection locking.
Measured results from a system where the sampling clock is generated from a PLL
are described in Chapter 5.
4.5 Measured Results
A die photo of the chip, fabricated in the UMC 130 nm process node, and a snapshot
of the layout are shown in Fig. 4.5 (area of 70 µm × 153 µm).
Figure 4.5: Die photo along with snapshot of layout.
4.5.1 DC Measurements
Due to the jitter and meta-stability issues pointed out in the previous sections, a
set of digital code words corresponds to a single voltage. A set of 32 measurements
is taken to compute the mean and standard deviation of the delay count for
each Vin value. The error bars shown in the plots of Fig. 4.6 represent a value
of ±σ on each side of the mean. The accuracy of voltage measurement is then
defined as the range of voltage values corresponding to ±3σ. This is obtained
by dividing the standard deviation of delay by the local slope at each point in
Fig. 4.6.
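The conversion from delay spread to voltage accuracy can be sketched as follows. The function name is hypothetical, and two assumptions are made that are not stated in the text: the bin-size is taken as 3σ divided by the slope, and the local slope is approximated by the average slope (dynamic range of delay divided by the 100 mV input range of Fig. 4.6).

```python
def bin_size_mv(sigma_ps, slope_ps_per_mv, n_sigma=3.0):
    """Voltage spread (mV) corresponding to n_sigma standard deviations of delay."""
    return n_sigma * sigma_ps / slope_ps_per_mv


# Setting #3 of Table 4.2: dynamic range 11.25 ns, sigma_max 30.5 ps.
# Average slope over an assumed 100 mV input range, in ps/mV.
avg_slope = 11.25e3 / 100.0
estimate = bin_size_mv(30.5, avg_slope)
```

Under these assumptions the estimate comes out close to the 0.82 mV listed for that setting in Table 4.2; the exact figures in the table depend on the local slope at each point.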
Table 4.2: Summary of Measured Results for DC input
Sl. No.  VDD (V)  fp (MHz)  fs (MHz)  MT (s)  DR (ns)  σmax (ps)  Bin-size (mV)
1        0.75     8.0       7.6       4.42    6.77     32.1       1.50
2        1.2      10.0      9.8       3.42    1.53     10.1       2.05
3*       1.2      8.0       7.9       4.25    11.25    30.5       0.82
4        0.75     8.0       7.99      4.20    7.80     29.0       1.60
5        0.75     8.0       7.99      16.80   7.30     15.0       0.85
6†       1.0      37.0      36.927    0.056   2.97     9.77       2.10
7†       1.1      37.0      36.927    0.056   2.13     9.92       1.03
8†       1.2      37.0      36.927    0.88    1.20     3.87       1.25
* 13 V2D cells
† Measurement time strictly integer number of beat periods
Description: fp - Probe Clock Frequency, fs - Sampling Clock Frequency, MT - Measurement Time, DR - Dynamic Range of delay, σmax - Maximum standard deviation of delay values, Bin-size - Accuracy of measurement
Table 4.2 and Fig. 4.6 present a summary of measured results, which shows
that delay increases with reducing supply and increasing number of V2D cells.
The delay versus voltage plots of Fig. 4.6 are for the settings presented in the
rows 1, 2 and 3 of Table 4.2. An accuracy of about 1 mV can be obtained by
proper choice of parameters and measurement time. One can easily obtain desired
accuracies by suitably altering the measurement time.
Except for entry 3, the entries of Table 4.2 correspond to the single V2D case.
Entries 4 and 5 show that the accuracy improves with measurement time. For a
four-fold increase in measurement time, the accuracy improves by a factor of two,
which agrees with the theory. Entries 6 to 8 are taken for a measurement time
that is strictly an integer number of beat periods. The bin-size obtained in this
shorter duration is comparable to other entries acquired over a larger measurement time.
The differences in the delay ranges for similar settings are present because
they were measured on different days, and hence conditions like supply voltage
and temperature could be different. However, measurements taken with the
same setting a couple of hours apart (after offset-cancellation1) are stable
enough (see 4.5.3).
1 Zero difference delay corresponds to zero differential voltage.
[Figure 4.6 plot: ∆D (ns), 0 to 12, versus ∆V (mV), 0 to 100. Curves: 1.2 V supply, 13 V2Ds, #3; 1.2 V supply, 1 V2D, #2; 0.75 V supply, 1 V2D, #1.]
Figure 4.6: Plots of ‘offset-canceled’ differential delay versus differential voltage for the settings mentioned. Refer to Table 4.2 for the settings of #1, 2, 3.
4.5.2 AC Measurements
A sine wave of frequency 30 Hz is applied to the system directly without sample-
and-hold circuitry. A set of 16,384 data points are collected for the SNR measure-
ments. SNR measurements are determined without calibrating the data points.
The summary of results obtained is shown in Table 4.3.
It is observed that the SNR degrades for both low and high OSR, in accordance
with Fig. 5.2. Entries 7 and 8 of Table 4.3 confirm that the SNR is higher for
a lower bandwidth signal, with the rest of the settings being similar. Fig. 4.7 shows
the plot of DNL and INL for setting 8 of Table 4.3, obtained by the code density
test [33]. The maximum values of DNL and INL are found to be less than 1 LSB.
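The OSR and ENOB columns of Table 4.3 can be related to the clock settings with a short calculation. This is a sketch with hypothetical function names: OSR is taken as fb/(2 fin) with fb = fp − fs, as defined in 5.1, divided by the number of beat periods per measurement, and ENOB uses the standard relation SNR = 6.02·ENOB + 1.76 dB, which is assumed here rather than quoted from the text.

```python
def oversampling_ratio(fp_hz, fs_hz, fin_hz, beat_periods_per_measurement=1):
    """OSR = fb / (2 * fin), with the conversion rate reduced when one
    measurement spans several beat periods."""
    fb = fp_hz - fs_hz
    return fb / (2.0 * fin_hz * beat_periods_per_measurement)


def enob(snr_db):
    """Effective number of bits for a sine-wave input, from the SNR in dB."""
    return (snr_db - 1.76) / 6.02
```

For example, entry 1 (fp = 37 MHz, fs = 36.999 MHz, fin = 30 Hz) gives an OSR of about 16.67, and entry 8 (fs = 36.927 MHz, fin = 10 Hz, one measurement over two beat periods) gives 1825, with an SNR of 33.61 dB corresponding to about 5.29 bits.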
The theoretical analysis presented in 5.1 shows that a resolution of 12 bits
is possible in this approach (Fig. 5.2) for the settings used, but the measured
results show a maximum of 5.29 bits. The resolution is limited because
the delay in this case is a small fraction of the time period of the probe clock,
i.e., the dynamic range is limited. So, in order to get a better resolution, a probe
clock of higher frequency has to be used. But the poor rise time of the presently
designed V2D converter limits the frequency of the clock that can be used. In such
cases, the technique described in Chapter 6 can be used to overcome this issue of
limited dynamic range. Also, a better design of voltage-to-delay converter is used
for an example application, which is described in Chapter 7.
Table 4.3: Summary of Measured Results for Sine wave input
Sl. No.  fin (Hz)  fp (MHz)  fs (MHz)  OSR     SNR (dB)  ENOB (bits)
1        30        37        36.999    16.67   9.61      1.30
2        30        37        36.9975   41.67   17.57     2.63
3*       30        37        36.995    41.67   17.37     2.59
4*       30        37        36.9925   62.50   19.54     2.95
5*       30        37        36.99     83.33   16.51     2.45
6*       30        37        36.98     166.67  9.01      1.20
7        10        37        36.999    50.0    17.76     2.63
8*       10        37        36.927    1825    33.61     5.29
* One measurement over two beat periods.
Description: fin - Frequency of input sine wave, fp - Probe clock Frequency, fs - Sampling Clock Frequency, OSR - Over-sampling Ratio, SNR - Signal to Noise Ratio, ENOB - Effective Number Of Bits.
[Figure 4.7 plots: DNL (LSB) and INL (LSB) versus digital codeword (0 to 40).]
Figure 4.7: DNL and INL plots for the setting of entry 8 in Table 4.3
Another point to note is that since the oversampling rate is closely related
to the difference of the probe and sampling clock frequencies, both these clock
frequencies need to be accurately controlled. This may be achieved by deriving
this sampling clock from the probe clock and mimicking asynchrony by artificially
introducing frequency modulation [62]. Another approach could be to use a PLL
structure to precisely control the sampling frequency, the details of which are
presented in Chapter 5.
4.5.3 A note on stability of calibration data
The delay of the V2D is a function of temperature and voltage, and as a result is
vulnerable to change. Hence, the calibration data collected may not be accurate
a few seconds later, leading to large errors. But, a simple way of avoiding this
is to do a one-point calibration, setting Vin = Vref. In short, it can be referred
to as ‘offset-cancellation’, as also pointed out in Fig. 4.6. Fig. 4.8 shows the
results obtained for the same setup, at two different times separated by 1.5 hours.
Although the actual values of differential delays for the two measurements were
different, the curves almost coincided once the offset was subtracted.
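The one-point calibration amounts to subtracting, from every reading, the delay measured at zero differential voltage. A minimal sketch follows; the function name and the delay values are hypothetical and chosen only to illustrate two sweeps that differ by a constant drift.

```python
def offset_cancel(delays_ns):
    """Subtract the delay measured at zero differential voltage (first entry)."""
    zero_point = delays_ns[0]
    return [d - zero_point for d in delays_ns]


# Hypothetical sweeps taken at two different times; the later one has drifted
# by a constant 0.5 ns.
earlier = [3.0, 4.5, 6.0]
later = [3.5, 5.0, 6.5]
```

After offset cancellation both hypothetical sweeps reduce to the same curve, which is the behavior reported for the measured curves of Fig. 4.8.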
4.6 Conclusions
We propose a scheme for analog BIST which is well-suited to measure voltages
distributed all over a chip by locally converting the test voltage into a delay
between a pair of sub-sampled signals. This is achieved by a sampling head
placed at each test node; each sampling head consisting of a pair of voltage
controlled delay cells and a pair of flip-flops. This approach reduces the routing
of analog signals over long paths to the measurement unit, hence saving chip area
due to the absence of shielding lines. Instead, a clock signal and a sampling clock
of slightly different frequency need to be routed serially to each sampling head, and
a pair of low frequency sub-sampled signals from each sampling head needs to be
routed to the central DMU. This chapter gave a comparison of different
voltage-to-delay converters. The implementation details along with measured results
were presented. A simple single point calibration is also suggested, which supplements
an extensive one-time calibration, to overcome the variability of delay with time.
Generation of the sampling clock from the probe clock, along with performance limits
and measured results, is presented in the next chapter.
[Figure 4.8 plot: ∆D (ns), −1 to 7, versus ∆V (mV), 0 to 100. Curves: Earlier, Later.]
Figure 4.8: Plot of difference delay versus difference voltage after subtracting the delay at zero difference voltage from both curves at two time instants separated by 1.5 hours.
Chapter 5
Performance Limits and
Sampling Clock Generation
5.1 Behavioral model
The difference delay set up on the sub-sampled signals by a sampling head is
measured by the DMU, and is mapped to a voltage based on the calibration data of
difference voltage versus difference delay, as shown in Fig. 4.6.
The choice of a measurement time of an integer number of beat periods1 leads
to estimates of reduced variance. This is because one time period of probe clock is
‘amplified’ to one beat period. Hence, across beat periods, the distribution of skew
in a beat period looks identical nominally, except for jitter in the clocks causing
minor differences. This is confirmed by measurement results also in Table 4.2.
Hence, the DMU can be modeled as a quantizer with a conversion time of one
beat period. The resolution of the aforesaid quantizer depends on the duration
of the beat period. If the desired measurement time is a multiple, say M, of the
beat period, this can be modeled as a moving-average (MA) filter or an ‘integrate
and dump’ filter following the quantizer. More compactly, it can be modeled as
an MA filter followed by a decimator. If moving-average filtering alone is desired,
then the decimator down-samples at a rate of 1 (no down-sampling), whereas if
‘integrate and dump’ is desired, the decimator down-samples at a rate of M. This
is shown in Fig. 5.1, where an anti-aliasing filter (AAF) is shown to limit the
input bandwidth, a switch is shown to model the sampling at the conversion rate,
and quantization is modeled as an additive noise (εq) followed by averaging. The
anti-aliasing filter is not explicitly implemented in this work, but is shown in the
model for the purpose of analysis.
1 Beat period is the time-period of the sub-sampled signal.
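[Figure 5.1 block diagram: anti-aliasing filter (AAF), sampling switch, additive quantization noise εq, averaging filter and decimator, as described in the text.]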
Figure 5.1: Behavioral model of voltage quantization employing the DMU
Let T and T+∆T be the time periods of probe clock and sampling clock re-
spectively. Then, the duration of the beat period is given by Tb = (T+∆T)×T/∆T,
which will consist of T/∆T sampling clock periods. Hence, the resolution of the
quantizer in measuring time, in principle, is ∆T/T. Therefore, the maximum
number of bits of this quantizer, bmax and conversion time Tb are given by:
\[ b_{max} = \log_2\left(\frac{T}{\Delta T}\right), \qquad T_b = (T + \Delta T)\times T/\Delta T \tag{5.1} \]
Practically, if dmax is the maximum delay given by the voltage-to-delay converter
corresponding to the maximum voltage input, then the number of bits b available
is given by:
\[ b = \log_2\left(\frac{d_{max}}{\Delta T}\right) := \log_2\left(\frac{cT}{\Delta T}\right) \tag{5.2} \]
where c is the maximum delay as a fraction of the time period T.
If the total measurement time is M beat periods, the variance reduces by
a factor of M, and hence the SNR increases by about 3 log2(M) dB (since
10 log10(x) ≈ 3 log2(x)). Hence, doubling the measurement time improves the SNR
by about 3 dB. The bandwidth of measurement is 1/(2Tb). Hence, clearly there is a
bandwidth-accuracy trade-off in the choice of parameters T and ∆T. The overall SNR
of the complete system for a sinusoidal input of amplitude Vinmax is given by:
\[ SNR = 10\log_{10}\left(\frac{A^2}{2}\cdot\frac{1}{12}\cdot\frac{c^2(\Delta T)^2}{T^2}\cdot M\right) \tag{5.3} \]
where A=kVinmax with k being the voltage to delay gain (assuming voltage to
delay conversion being linear). But, this assumes that sample-and-hold circuitry
is available which can hold the sampled values for a period of the measurement
time (tm), given by
tm = M · (T + ∆T )× T/∆T (5.4)
Let fp and fs be the frequencies of the probe and sampling clocks respectively.
Then, the frequency of the beat signal is given by fb = fp − fs. The above
equations can now be re-written as:
\[ \frac{T}{\Delta T} = \frac{f_s}{f_b} \tag{5.5} \]
\[ \Delta T = \frac{f_b}{f_p f_s} \tag{5.6} \]
\[ b = \log_2\left(\frac{c f_s}{f_b}\right) \tag{5.7} \]
\[ \text{Conversion rate} = f_b \tag{5.8} \]
In order to simplify the design in this case of distributed voltage measurement,
the use of a sample-and-hold at the voltage nodes is avoided. This renders data
conversion at the Nyquist rate impossible, but the methodology of
over-sample–and–average can be used, similar to a delta modulator [56, 73]. Basically,
the sampling rate should be high enough so that the signal of interest does not
change by more than an LSB within the conversion time. Assuming an input sinusoid
Vin = A sin(2πfin t), an LSB a and conversion rate fb, the maximum change in
Vin within an interval of 1/fb should not exceed a, as discussed in 4.2.1, i.e.,
\[ \frac{2\pi A f_{in}}{f_b} \le a \tag{5.9} \]
Define OSR = fb/(2fin) and 2A/a = 2^{n1} for n1 effective bits of conversion. Then,
OSR sets a limit on the number of effective bits (n1) that can be obtained, as
given by
\[ n_1 \le \log_2\left(\frac{OSR}{\pi}\right) + 1 \tag{5.10} \]
\[ SNR_1 \le 6.02\log_2\left(\frac{OSR}{\pi}\right) + 7.78 \]
\[ SNR_1 \le 20\log_{10}\left(\frac{\sqrt{6}\,OSR}{\pi}\right) \tag{5.11} \]
(5.11) is the limit due to oversampling, which says that the SNR that
can be achieved increases by about 6 dB for every doubling of OSR (20 dB per
decade/6 dB per octave).
A second limit on the number of effective bits (n2) available is set by the
number of bits available at each conversion, which reduces with increased
over-sampling. Using (5.7),
\[ n_2 \le b + \frac{1}{2}\log_2(OSR) = \log_2\left(\frac{c f_s}{2 f_{in}\sqrt{OSR}}\right) \tag{5.12} \]
\[ SNR_2 \le 6.02\log_2\left(\frac{c f_s}{2 f_{in}\sqrt{OSR}}\right) + 1.76 \le 20\log_{10}\left(\frac{\sqrt{3}\,c f_s}{\sqrt{8\,OSR}\,f_{in}}\right) \tag{5.13} \]
(5.13) gives the limit due to quantization, which says that achievable SNR
decreases by about 3 dB for every doubling of OSR (-10 dB per decade/-3 dB
per octave). The actual number of bits (n) that one can obtain is given by
\[ n = \min(n_1, n_2) \tag{5.14} \]
\[ SNR = \min(SNR_1, SNR_2) \tag{5.15} \]
Fig. 5.2 shows the plot of (5.10) and (5.12), which indicates the existence of
an optimal OSR. The plot also shows that the optimal OSR reduces for a higher
input bandwidth and also yields fewer effective bits. It is important
to note that (5.10) and (5.12) are quite loose bounds, since the non-idealities of
jitter are not modeled to keep the analysis simple. The actual measured results
from Table 4.3 are also shown in Fig. 5.2.
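The two bounds (5.10) and (5.12) and the resulting optimum can be evaluated numerically. The sketch below uses hypothetical function names; the closed form for the crossing point is obtained here by equating the arguments of the two logarithms, 2·OSR/π = c·fs/(2·fin·√OSR), and is not quoted from the text.

```python
import math


def n1_limit(osr):
    """Oversampling limit on effective bits, equation (5.10)."""
    return math.log2(osr / math.pi) + 1.0


def n2_limit(osr, c, fs, fin):
    """Quantization limit on effective bits, equation (5.12)."""
    return math.log2(c * fs / (2.0 * fin * math.sqrt(osr)))


def effective_bits(osr, c, fs, fin):
    """Effective bits n = min(n1, n2), equation (5.14)."""
    return min(n1_limit(osr), n2_limit(osr, c, fs, fin))


def optimal_osr(c, fs, fin):
    """OSR at which the two bounds meet (derived for this sketch)."""
    return (math.pi * c * fs / (4.0 * fin)) ** (2.0 / 3.0)
```

Below the optimal OSR the oversampling bound n1 is the smaller of the two, and above it the quantization bound n2 takes over, which is the behavior sketched in Fig. 5.2. The value of c used with these functions is a free parameter; the text does not state it for the settings of Table 4.3.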
5.2 Derivation of Design Parameters
There are two independent parameters, namely the pair of T and ∆T , or alter-
natively fp and fs - the probe and sampling clock frequencies. The parameters
need to be chosen so that the SNR obtainable is maximized.
As mentioned earlier, the minimum measurement time is taken to be one beat
period Tb. Hence, the over-sampling ratio (OSR) for a sine wave of time-period
Tin is given by
\[ T_b = \frac{T(T + \Delta T)}{\Delta T} \]
\[ OSR = \frac{T_{in}}{2T_b} = \frac{T_{in}\,\Delta T}{2T(T + \Delta T)} \tag{5.16} \]
[Figure 5.2 plot: n versus log2(OSR), 0 to 25. Curves n1 and n2 for fs/fin = 10^7 and fs/fin = 10^6; measured points for fin = 10 Hz and fin = 30 Hz.]
Figure 5.2: Plot of OSR versus n, showing the existence of optimal OSR for a given fin. Effective n = min(n1, n2) (5.14). The other parameters of the equations are taken from the settings described in Table 4.3. The dots indicate the results summarized in Table 4.3. The gap between the modeled and measured behavior is because the differential delay generated is a small fraction of the clock time period, and the resolution improves as the ratio of differential delay to time period increases. An explicit way of ensuring this, which speeds up measurement and improves the achieved SNR, is described in Chapter 6.
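Replacing OSR in (5.11) and (5.13) using (5.16), we have
\[ SNR_1 = 20\log_{10}\left(\frac{\sqrt{6}\,OSR}{\pi}\right) = 20\log_{10}\left(\frac{\sqrt{6}\,\Delta T\,T_{in}}{2\pi\,T(T + \Delta T)}\right) \tag{5.17} \]
\[ SNR_2 = 10\log_{10}\left(\frac{3c^2\,T_{in}T}{16\,\Delta T\,(T + \Delta T)}\right) \tag{5.18} \]
For optimal SNR, equating (5.17) and (5.18), we have
\[ \frac{6(\Delta T)^2\,T_{in}^2}{4\pi^2T^2(T + \Delta T)^2} = \frac{3c^2\,T_{in}T}{16\,\Delta T\,(T + \Delta T)} \]
\[ \left(\frac{\Delta T}{T}\right)^3 = \frac{c^2\pi^2(T + \Delta T)}{8T_{in}} \tag{5.19} \]
\[ T + \Delta T = \frac{8T_{in}}{c^2\pi^2}\left(\frac{\Delta T}{T}\right)^3 \tag{5.20} \]
From (5.20) and (5.18),
\[ SNR = 10\log_{10}\left(\frac{3c^4\pi^2\,T^4}{128\,(\Delta T)^4}\right) \tag{5.21} \]
\[ \frac{\Delta T}{T} = \left(\frac{3c^4\pi^2}{128}\right)^{1/4} 10^{-SNR/40} \tag{5.22} \]
\[ T + \Delta T = \frac{8c\,T_{in}}{\pi^2}\left(\frac{3\pi^2}{128}\right)^{3/4} 10^{-3\,SNR/40} \tag{5.23} \]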
With a bit of algebra (taking MATLAB’s help), the parameters T and ∆T for
optimal SNR are given by:
\[ T = T_{in}\,\frac{c}{\sqrt{\pi}}\left(\frac{3}{8}\right)^{3/4} 10^{-3\,SNR/40}\left(1 + \left(\frac{3c^4\pi^2}{128}\right)^{1/4} 10^{-SNR/40}\right)^{-1} \tag{5.24} \]
\[ \Delta T = \frac{3c^2\,T_{in}\,10^{-3\,SNR/40}}{16\left(10^{SNR/40} + \left(\frac{3\pi^2c^4}{128}\right)^{1/4}\right)} \tag{5.25} \]
In terms of frequencies, the parameters fp and fs for optimal SNR are given by:
\[ f_s = f_{in}\,\frac{\sqrt{\pi}}{c}\left(\frac{8}{3}\right)^{3/4} 10^{3\,SNR/40} \tag{5.26} \]
\[ f_p = f_s\left(1 + \left(\frac{3c^4\pi^2}{128}\right)^{1/4} 10^{-SNR/40}\right) \tag{5.27} \]
(5.27) and (5.26) give the frequencies of probe and sampling clocks for optimal
SNR. Some example numbers for fp and fs in accordance with (5.27) and (5.26)
are provided in Table 5.1. As expected, the values of fp and fs are small for
smaller SNR, smaller input bandwidth (fin) and larger dynamic range (c).
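The entries of Table 5.1 follow directly from (5.26) and (5.27), and a short calculation can be used to reproduce or extend them. The function name below is hypothetical; the formulas are the two equations as given.

```python
import math


def optimal_clocks(fin, snr_db, c):
    """Return (fp, fs) in Hz for a target SNR, per equations (5.27) and (5.26)."""
    fs = fin * (math.sqrt(math.pi) / c) * (8.0 / 3.0) ** 0.75 * 10 ** (3.0 * snr_db / 40.0)
    k = (3.0 * c ** 4 * math.pi ** 2 / 128.0) ** 0.25
    fp = fs * (1.0 + k * 10 ** (-snr_db / 40.0))
    return fp, fs
```

For c = 0.5, fin = 10 Hz and a target SNR of 20 dB this evaluates to roughly fp ≈ 2.60 kHz and fs ≈ 2.34 kHz, in line with the first row of Table 5.1.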
5.3 Use of PLL to Generate Sampling Clock
As mentioned in 4.4.1, the methodology of delay measurement via sub-sampling
needs a probe clock (of frequency fp) which captures the analog (time) informa-
tion on it and is sampled by a sampling clock (of frequency fs) whose frequency
is slightly less than that of the probe clock. Having separate crystals for the
two clocks is one solution, but increases the cost of the system due to the extra
crystal. Hence, the technique of employing a PLL to generate the sampling clock
from the probe clock is explored in this section.
From (5.7), maximum SNR is obtainable when the difference between fp and
fs is small (i.e., the beat frequency fb is small). Hence, it is better to employ a
PLL to generate the sampling clock from the probe clock as given by:
\[ f_s = \frac{N - 1}{N}\,f_p \tag{5.28} \]
where N is an integer. In such a case where a PLL is used, the sampling clock
will not be asynchronous from the probe clock (since the phases of the two are
related).
Table 5.1: Example numbers for parameters discussed in (5.27) and (5.26)
c     fin     SNR (dB)  fp (Hz)      fs (Hz)
0.50  10 Hz   20.00     2.60×10^3    2.34×10^3
0.50  10 Hz   40.00     7.65×10^4    7.40×10^4
0.50  10 Hz   60.00     2.36×10^6    2.34×10^6
0.50  1 kHz   20.00     2.60×10^5    2.34×10^5
0.50  1 kHz   40.00     7.65×10^6    7.40×10^6
0.50  1 kHz   60.00     2.36×10^8    2.34×10^8
0.50  1 MHz   20.00     2.60×10^8    2.34×10^8
0.50  1 MHz   40.00     7.65×10^9    7.40×10^9
0.50  1 MHz   60.00     2.36×10^11   2.34×10^11
0.80  10 Hz   20.00     1.72×10^3    1.46×10^3
0.80  10 Hz   40.00     4.88×10^4    4.62×10^4
0.80  10 Hz   60.00     1.49×10^6    1.46×10^6
0.80  1 kHz   20.00     1.72×10^5    1.46×10^5
0.80  1 kHz   40.00     4.88×10^6    4.62×10^6
0.80  1 kHz   60.00     1.49×10^8    1.46×10^8
0.80  1 MHz   20.00     1.72×10^8    1.46×10^8
0.80  1 MHz   40.00     4.88×10^9    4.62×10^9
0.80  1 MHz   60.00     1.49×10^11   1.46×10^11
From (5.28) and (5.16), the clock periods and over-sampling ratio (OSR) are
given by
\[ T = \frac{N - 1}{N}\,(T + \Delta T) \tag{5.29} \]
\[ T_b = NT \tag{5.30} \]
\[ OSR = \frac{T_{in}}{2T_b} = \frac{T_{in}}{2NT} \tag{5.31} \]
From (5.11) and (5.31), the limit on SNR due to over-sampling is given by
\[ SNR_1 \le 20\log_{10}\left(\frac{\sqrt{6}\,T_{in}}{2\pi NT}\right) \tag{5.32} \]
The dynamic range is given by
\[ d_{max} = cT \quad (0 < c < 1) \tag{5.33} \]
As mentioned in 4.4.1, ∆T is the basic quantization step and better resolution is
obtained by averaging. The limit on SNR due to quantization is derived below.
\[ \text{Noise variance, } \sigma^2 = \frac{(\Delta T)^2}{12}\cdot\frac{1}{OSR} = \frac{NT^3}{6(N - 1)^2\,T_{in}} \]
\[ \text{Signal power, } P = d_{max}^2 = c^2T^2 \tag{5.34} \]
\[ SNR_2 = 10\log_{10}\left(\frac{P}{\sigma^2}\right) = 10\log_{10}\left(\frac{6c^2(N - 1)^2\,T_{in}}{NT}\right) \tag{5.35} \]
\[ = 20\log_{10}\left(c(N - 1)\sqrt{\frac{6T_{in}}{NT}}\right) \tag{5.36} \]
For optimum SNR, the desired N to be chosen is given by
\[ \frac{\sqrt{6}\,T_{in}}{2\pi NT} = c(N - 1)\sqrt{\frac{6T_{in}}{NT}} \tag{5.37} \]
\[ N(N - 1)^2 = \frac{1}{4c^2\pi^2}\,\frac{T_{in}}{T} \tag{5.38} \]
The exact analytical solution for N (with the help of MATLAB) is given by:
\[ N = \alpha + \frac{1}{9\alpha} + \frac{2}{3}, \text{ where} \tag{5.39} \]
\[ \alpha = \left(\sqrt{\left(\frac{T_{in}}{8c^2\pi^2T} - \frac{1}{27}\right)^2 - \frac{1}{729}} + \frac{T_{in}}{8c^2\pi^2T} - \frac{1}{27}\right)^{1/3}, \tag{5.40} \]
ignoring the other two complex roots. Since it is not a simple expression, the
lower and upper bounds are calculated as follows:
\[ (N - 1)^3 < \frac{1}{4c^2\pi^2}\,\frac{T_{in}}{T} < N^3 \]
\[ \left(\frac{T_{in}}{4c^2\pi^2T}\right)^{1/3} < N < \left(\frac{T_{in}}{4c^2\pi^2T}\right)^{1/3} + 1 \tag{5.41} \]
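We will use a value of c = 0.8 for the rest of the discussion below as it matches
with our experimental setup discussed in the next section.
\[ 0.34\left(\frac{f_p}{f_{in}}\right)^{1/3} < N_{opt} < 0.34\left(\frac{f_p}{f_{in}}\right)^{1/3} + 1 \tag{5.42} \]
Choosing Nopt ≈ 0.34 (fp/fin)^{1/3}, the optimal SNR is approximately given by:
\[ SNR_{opt} \approx 10\log_{10}\left(6c^2\,\frac{T_{in}}{T}\cdot\left(\frac{T_{in}}{4c^2\pi^2T}\right)^{1/3}\right) \tag{5.43} \]
\[ = \frac{40}{3}\log_{10}\left(\frac{T_{in}}{T}\right) + 1.17 \tag{5.44} \]
\[ = \frac{40}{3}\log_{10}\left(\frac{f_p}{f_{in}}\right) + 1.17 \tag{5.45} \]
For a desired SNR at a given frequency fin, the design parameters are given by
\[ f_p = f_{in}\cdot 10^{3\,(SNR - 1.17)/40} \tag{5.46} \]
\[ N = 0.34\cdot\left(10^{(SNR - 1.17)/40}\right) \tag{5.47} \]
\[ T = \frac{1}{f_p} \tag{5.48} \]
\[ \Delta T = \frac{T}{N - 1} \tag{5.49} \]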
Example numbers for the design parameters for a desired SNR at a given
frequency fin are given in Table 5.2.
Table 5.2: Example numbers for design parameters fp and N for desired SNR at given frequency fin
                   Approximate         Exact
fin     SNR (dB)   fp (Hz)      N      N      SNR (dB)
10 Hz   20.00      2.58×10^2    1.01   1.76   15.14
10 Hz   40.00      8.17×10^3    3.18   3.89   38.27
10 Hz   60.00      2.58×10^5    10.05  10.75  59.43
10 Hz   80.00      8.17×10^6    31.79  32.53  79.82
100 Hz  20.00      2.58×10^3    1.01   1.76   15.14
100 Hz  40.00      8.17×10^4    3.18   3.89   38.27
100 Hz  60.00      2.58×10^6    10.05  10.75  59.43
100 Hz  80.00      8.17×10^7    31.79  32.53  79.82
1 kHz   20.00      2.58×10^4    1.01   1.76   15.14
1 kHz   40.00      8.17×10^5    3.18   3.89   38.27
1 kHz   60.00      2.58×10^7    10.05  10.75  59.43
1 kHz   80.00      8.17×10^8    31.79  32.53  79.82
1 MHz   20.00      2.58×10^7    1.01   1.76   15.14
1 MHz   40.00      8.17×10^8    3.18   3.89   38.27
1 MHz   60.00      2.58×10^10   10.05  10.75  59.43
1 MHz   80.00      8.17×10^11   31.79  32.53  79.82
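The columns of Table 5.2 can be reproduced from (5.46), (5.47), (5.39)-(5.40) and (5.32). The sketch below uses hypothetical function names, takes Tin/T = fp/fin, and assumes fp/fin is large enough for the square root in (5.40) to be real, which holds for all rows of the table.

```python
import math

C = 0.8  # dynamic-range fraction c used in the text


def approximate_design(fin, snr_db):
    """Approximate fp and N for a desired SNR, equations (5.46) and (5.47)."""
    fp = fin * 10 ** (3.0 * (snr_db - 1.17) / 40.0)
    n = 0.34 * 10 ** ((snr_db - 1.17) / 40.0)
    return fp, n


def exact_n(fp, fin, c=C):
    """Real root of N(N-1)^2 = fp / (4 c^2 pi^2 fin), equations (5.39)-(5.40)."""
    a = fp / (8.0 * c ** 2 * math.pi ** 2 * fin) - 1.0 / 27.0
    alpha = (math.sqrt(a * a - 1.0 / 729.0) + a) ** (1.0 / 3.0)
    return alpha + 1.0 / (9.0 * alpha) + 2.0 / 3.0


def exact_snr_db(fp, fin, n):
    """SNR at the exact N, where the two limits coincide, from (5.32)."""
    return 20.0 * math.log10(math.sqrt(6.0) * fp / (2.0 * math.pi * n * fin))
```

For fin = 10 Hz and a desired SNR of 60 dB, these give fp ≈ 2.58×10^5 Hz, an approximate N of about 10.05, an exact N of about 10.75, and an exact SNR of about 59.43 dB, matching the corresponding row of Table 5.2. Since N must be an integer in practice, the exact value would be rounded to a realizable divide ratio.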
5.4 Experimental Validation
The PLL on the Virtex-5 development board provides an output clock of
frequency fo from an input clock of frequency fi related by
\[ f_o = \frac{p}{q}\,f_i \tag{5.50} \]
with the limits on the parameters as [87]
\[ p \le 64, \quad q \le 99, \quad f_o \le 600\ \text{MHz}, \quad f_i \ge 20\ \text{MHz} \tag{5.51} \]
Figure 5.3: A typical PWM signal - the modulating sine wave is also shown in dotted lines.
Demonstration of time measurement via sub-sampling, with a sampling clock
generated from the probe clock itself using a PLL, is described in this section. The
analog information is encoded in the duty cycle of the clock, i.e., the ON-time
of the probe clock in each period represents the analog information. Such a way
of representing information is popularly referred to as pulse-width modulation
(PWM). The pulse-width modulated signal is easy to generate using a comparator
(with acceptable input offset and linearity), with one of the inputs being the test
analog voltage and the other input being a periodic ramp or saw-tooth wave1.
For the purpose of this work, a PWM signal generated by part number 33220A
is used. Refer to Fig. 5.3 for a typical pulse-width modulated wave, with the test
analog voltage being shown in dotted lines. The duty cycle of the probe clock
is varied between 10% and 90% as determined by the sine wave. A sine wave
is used here so that SNR (signal-to-noise ratio) can be a reasonable metric to
evaluate performance. The duty cycle is measured as the ratio of the ON-time
to the time-period of the sub-sampled signal.
1 A saw-tooth wave can be generated by integrating a square wave.
5.4.1 Implementation of duty-cycle measurement unit
The block diagram of the system implemented in the Virtex-5 development board
is shown in Fig. 5.4. The PWM input signal is given from a function generator
bearing part number 33220A. The pin marked X is tied to PWM or another crystal
for different settings. The state machine block shown is implemented as shown in
Fig. 5.5. The vector (·, ·, ·) shown on the edges of the figure is the tuple (S,eN,eD),
where S is the sub-sampled signal output of the flop, eN is the enable for the
ON-time counter and eD is the enable for the period counter. S is the input to the
state machine while eN and eD are the outputs. The debounce logic is not shown in
Fig. 5.5 to keep it simple, but logic similar to that explained in 4.4.1 can be used.
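The duty-cycle measurement can be sketched in software as a three-state machine driving two counters. This is an assumed reading of Fig. 5.5, not a transcription of it: q0 waits for the first rising edge of S, q1 corresponds to S high (both counters enabled) and q2 to S low (only the period counter enabled). The function names are hypothetical, the sub-sampled stream is generated for an ideal, jitter-free PWM clock in integer time units, and no debounce logic is included.

```python
def subsample_pwm(on_units, n, num_samples):
    """Sub-sample a PWM clock of period (n - 1) units, high for on_units,
    with a sampling clock of period n units (frequency ratio (n - 1)/n)."""
    period = n - 1
    return [int((k * n) % period < on_units) for k in range(num_samples)]


def measure_duty_cycles(samples):
    """Return one duty-cycle reading per period of the sub-sampled signal S."""
    state = "q0"
    on_count = period_count = 0
    duties = []
    for s in samples:
        if state == "q0":                 # idle until the first rising edge
            if s:
                state, on_count, period_count = "q1", 1, 1
        elif state == "q1":               # S high: eN = 1, eD = 1
            period_count += 1
            if s:
                on_count += 1
            else:
                state = "q2"
        else:                             # q2, S low: eN = 0, eD = 1
            if s:                         # rising edge closes the period
                duties.append(on_count / period_count)
                state, on_count, period_count = "q1", 1, 1
            else:
                period_count += 1
    return duties
```

For N = 16 and an ON-time of 6 out of 15 units, every completed period of the sub-sampled signal reports a duty cycle of 0.4. The resolution of a single reading is one part in N − 1, which is why averaging over many periods is needed for finer resolution.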
The system implemented on the Virtex-5 development board consists of the
sub-sampling flops and the duty-cycle measurement unit along with the PLL (with a
frequency scaling factor of (N−1)/N). The PWM signal described above is input to
the system. For a comparative study on the choice of input clock feeding the PLL,
in one case a different clock source is input to the PLL, while in the other case,
the pulse-width modulated probe clock itself is directly fed to the PLL.
[Figure 5.4 block diagram (FPGA). Blocks: DFF (D, Q), PLL, divider (÷), pin X, State Machine, Counter1, Counter2. Signals: PWM, S, eN, eD. Outputs: ON time, Period.]
Figure 5.4: Block diagram of system implemented in Virtex 5 development board
[Figure 5.5 state diagram. States: q0 (start), q1, q2. Edge labels: (0,0,0), (1,1,1), (1,1,1), (0,0,1), (0,0,1), (1,1,1).]
Figure 5.5: State Machine of System Implemented in FPGA (Fig. 5.4)
Tuple: (S,eN,eD). Input: S. Outputs: eN, eD.
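[Figure 5.6 block diagrams. (a) Asynchronous case: the PWM signal drives the DFF and the PLL is fed from a separate source. (b) Synchronous case: the PLL is fed from the PWM signal. (c) Dithered case: the PLL, followed by a divider (÷), is fed from the PWM signal. In each case the flop output is S.]
Figure 5.6: The sources of probe and sampling clocks for different cases
S: Sub-sampled signal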
In summary, there are three methods of generating the sampling clock, as
shown in Fig. 5.6:
• A separate clock source/crystal (asynchronous case)
• Sampling clock derived from probe clock with frequency scaling by a factor
of (N−1)/N (synchronous case)
• Sampling clock derived from probe clock with frequency scaling by a factor
of (N−1)/N, but with dithered division ratio (dithered case)
5.4.2 Large divide ratios and dithered divide ratio
Due to the limited range of frequency scaling in the PLL, a frequency scaling factor
of 120/121 cannot be readily implemented with the available PLL. In such a case,
the PLL is used to multiply the input clock frequency by a factor of 30, and the
PLL output is subsequently divided using a frequency divider by factors of {30,
30, 30, 31} in succession (using delta-sigma modulation to represent 30.25
by the integer sequence {30, 30, 30, 31} so that the average is 30.25), in effect
making the frequency scaling factor 30/30.25 = 120/121.
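One way to generate such an integer sequence is a first-order accumulator that carries the fractional part forward and emits the next higher integer on overflow. The sketch below is an illustration under that assumption, with a hypothetical function name; the text does not specify how the divider sequence was produced on the FPGA.

```python
from fractions import Fraction


def dithered_divide_ratios(target, count):
    """Integer divide ratios whose running average approaches `target`
    (a Fraction), using a first-order accumulator."""
    base = target.numerator // target.denominator   # integer part, e.g. 30
    frac = target - base                            # fractional part, e.g. 1/4
    acc = Fraction(0)
    ratios = []
    for _ in range(count):
        acc += frac
        if acc >= 1:            # accumulator overflow: divide by base + 1 once
            ratios.append(base + 1)
            acc -= 1
        else:
            ratios.append(base)
    return ratios
```

For a target of 121/4 = 30.25 this yields the repeating pattern 30, 30, 30, 31, and for 61/2 = 30.5 it alternates between 30 and 31. Exact rational arithmetic is used so that the pattern does not drift due to floating-point rounding.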
5.5 Measurement Results
The measured duty-cycle values for the PLL setting of N=16 and an
input sine wave of frequency 10 Hz are shown in Fig. 5.7, and their FFT is shown
in Fig. 5.8. Fig. 5.7 shows that there is an offset, as the ideal range of duty cycle
should have been between 0.1 and 0.9.
The measured results are summarized in Table 5.3 and Table 5.4. Fig. 5.11
shows the theoretical limits on SNR due to oversampling and quantization; and
also the points obtained by measurement. At each value of N , for a given setting
of fin and fp, the mean and standard deviation of SNR from 16 measurements
are reported in Table 5.3 and Table 5.4.
The measured results confirm the existence of an optimal N value for a spec-
ified fin and chosen fp, as predicted by theory. But, influence of jitter on the
quantization levels was not modeled for the analysis which shows up as the gap
between maximum attainable SNR and actually obtained numbers from measure-
ments. Another reason for the gap from oversampling limit is because this system
is employing simple averaging as against noise-shaping (as in sigma-delta ADCs).
However, the trend in the SNR numbers matches closely with that predicted by
theory.
The fourth column (under the heading Max. Limit) of Tables 5.3 and 5.4
contains the minimum of the oversampling and quantization limits and shows a
difference of 3 dB between settings of first and second row at small N values in
Table 5.3, while the said difference becomes 6 dB at large N in Table 5.4. It
is easy to see that the difference between the settings for measurement and the
parameters obtained from design equations is least at the largest values of SNR, as
shown in Fig. 5.10. As mentioned previously, this gap exists as the effects of jitter
on linearity and quantization are not modeled.
Table 5.3: Summary of Measured Results comparing asynchronous and synchronous cases of sampling clock generation
Table shows mean value of SNR in dB with standard deviation in parenthesis
fp = 5 MHz, fin = 10 Hz
      SNR (dB)
N     Async          Sync           Max. Limit
6     12.36 (1.62)   18.59 (5.61)   69.03
11    37.89 (1.84)   35.59 (0.00)   72.42
16    55.96 (1.92)   37.41 (3.07)   74.31
31    48.99 (0.63)   47.76 (2.76)   75.97
41    46.87 (0.50)   -              73.54
42    -              46.84 (0.11)   73.33
Table 5.4: Summary of Measured Results comparing asynchronous and dithered (synchronous) cases of sampling clock generation
Table shows mean value of SNR in dB with standard deviation in parenthesis
fp = 5 MHz, fin = 10 Hz
      SNR (dB)
N     Async          Dithered       Max. Limit
61    37.79 (0.31)   44.08 (0.44)   70.09
121   24.20 (0.12)   35.53 (1.00)   64.14
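The Max. Limit column of Tables 5.3 and 5.4 can be reproduced as the minimum of the oversampling limit (5.32) and the quantization limit (5.35), with Tin/T = fp/fin and c = 0.8. The function names below are hypothetical.

```python
import math


def snr_limits_db(n, fp, fin, c=0.8):
    """Return (SNR1, SNR2) in dB: oversampling limit (5.32) and
    quantization limit (5.35) for PLL parameter n."""
    ratio = fp / fin                                   # Tin / T
    snr1 = 20.0 * math.log10(math.sqrt(6.0) * ratio / (2.0 * math.pi * n))
    snr2 = 10.0 * math.log10(6.0 * c ** 2 * (n - 1) ** 2 * ratio / n)
    return snr1, snr2


def max_limit_db(n, fp, fin, c=0.8):
    """Maximum attainable SNR: the smaller of the two limits."""
    return min(snr_limits_db(n, fp, fin, c))
```

For fp = 5 MHz and fin = 10 Hz, the quantization limit is the binding one for N up to 16 (for example about 74.31 dB at N = 16), while the oversampling limit takes over from N = 31 onward (about 75.97 dB at N = 31 and 64.14 dB at N = 121).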
A least-squares linear curve fit of maximum and measured SNR against
the logarithm of the PLL division parameter N is shown in Fig. 5.9. The entries of
the Async case of both Tables 5.3 and 5.4 are chosen for the fit. The SNR values
corresponding to small and large values of N are off, leading to slopes different
from the expected values, whereas the local slope of mean values of SNR between
N values of 31 and 41 is -5.26, which matches well with the theoretical value.
As explained earlier, the PLL cannot readily implement large divide ratios of
more than 60 for the synchronous case. As a result, to implement a divide ratio
of 61, or equivalently a frequency scaling factor of 60/61, the divider count of
the divider in Fig. 5.6(c) needs to alternately switch between 30 and 31 (so that
30/30.5 = 60/61).
From Table 5.3, there is not much to choose between the asynchronous
and synchronous cases for relatively low values of N, except for the outlier at
N = 16. But in the case of large values of N, Table 5.4 shows that the dithered
case performs better than the asynchronous case. However, since the SNR values
attained in the dithered case do not fall outside the range of SNR values
obtained by the synchronous case at lower values of N, the synchronous case itself can
be made use of at appropriate choices of N.
5.6 Conclusions
A method of generating sampling clock from probe clock for the purpose of time
measurement on a Virtex-5 development (FPGA) board is discussed in this chap-
ter. Analog information in the form of a sine wave is used to modulate the ON-
time of probe clock, yielding a PWM signal. Measurement of duty cycle of the
PWM signal by the sub-sampling approach of delay (time) measurement gives
a quantized version of the modulating sine wave. Measurement results of the
system with different ways of generating the sampling clock are reported, with
the maximum attained SNR being 55.16 dB. The variation of SNR with N (PLL
divide ratio) is investigated theoretically and the measured results confirm the
existence of an optimal N which yields maximum SNR. There is a gap between
the settings of actual measurement and the parameters computed theoretically;
however, the gap is smaller for larger values of SNR, obtained at moderate values
of N.
In conclusion, the synchronous case of clock generation described performs
as well as the asynchronous case for low values of N, and achieves higher SNR
values than those attained by the dithered case. It is therefore the system of choice,
also owing to its simple and low cost implementation, as it avoids the additional
crystal needed by the asynchronous case and the implementation of alternating
divide ratios needed for the dithered case.
Chapter 5. Performance Limits and Sampling Clock Generation 70
[Plot: duty cycle (0.1 to 1) versus number of samples (0 to 10 × 10^4)]
Figure 5.7: Samples of duty-cycle measurement - Quantized values of the input sine wave
[Plot: Magnitude (dB) versus Frequency (Hz), logarithmic axes]
Figure 5.8: Spectrum of measured duty-cycle samples, showing a clear peak at 10 Hz, the input sine frequency
[Plot: SNR (dB) versus N (0 to 140). Legend: Measured Data, Linear fit, Maximum limit. Annotated lines: −6.02 log2(N) + 105.80; −11.02 log2(N) + 102.60; 3.77 log2(N) + 59.31; 30.66 log2(N) − 67.24]
Figure 5.9: Linear curve fitting of SNR versus log(N)
[Two plots: fp (Hz) versus SNR (dB), and N versus SNR (dB), both on logarithmic vertical axes. Legend: Theory, Measured]
Figure 5.10: Gap between theoretically predicted parameters and actual measurement settings. Note that the difference is least at large values of SNR.
[Three plots of SNR (dB) versus N, for fin = 10 Hz, fp = 5 MHz; fin = 10 Hz, fp = 2.5 MHz; and fin = 20 Hz, fp = 5 MHz. Legend: OSR Limit, Quantization Limit, Async, Sync, Dithered]
Figure 5.11: Plot showing theoretical limits on SNR and results (mean SNR) obtained from measurement.
Chapter 6
Multiphase technique to
speed-up delay measurement via
sub-sampling
Consider the problem of delay measurement by the sub-sampling approach
introduced in Section 3.6. A delay d is present between a pair of probe clock
signals of period T. This clock pair is sampled by another clock of period
T + ∆T. Since this sampling frequency is lower than the Nyquist rate, the
original signal cannot be fully reconstructed, but there is “amplification” of
the delay and the time period in the resulting sampled signal. Fig. 4.4 shows
the timing diagram of the various signals. The delay d between the clock pair
Da and Db is thus amplified to become d(T + ∆T)/∆T between Qa and Qb.
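As a quick numerical check of the amplification factor above, the following minimal Python sketch computes the amplified delay d(T + ∆T)/∆T and the duration of one beat period, T(T + ∆T)/∆T. The function names are illustrative and not part of the thesis; the values T = 16 and T + ∆T = 17 units are those of the toy example in Section 6.1.

```python
from fractions import Fraction


def amplified_delay(period, sampling_period, delay):
    """Delay seen between the sub-sampled outputs: d * (T + dT) / dT."""
    delta = Fraction(sampling_period) - Fraction(period)  # dT
    if delta <= 0:
        raise ValueError("sampling period must exceed the probe clock period")
    return Fraction(delay) * Fraction(sampling_period) / delta


def beat_period(period, sampling_period):
    """Duration of one beat period: T * (T + dT) / dT."""
    delta = Fraction(sampling_period) - Fraction(period)  # dT
    if delta <= 0:
        raise ValueError("sampling period must exceed the probe clock period")
    return Fraction(period) * Fraction(sampling_period) / delta


if __name__ == "__main__":
    print(amplified_delay(16, 17, 2))  # 34
    print(beat_period(16, 17))         # 272
```

Exact rational arithmetic is used so that the results are not affected by floating-point rounding.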
Figure 6.1: Block diagram of DMU (Delay Measurement Unit) based on sub-sampling

It is well known that when the probe and sampling clocks are not generated
from the same crystal, the result is random sampling, which has been
demonstrated in evaluating ADC performance [33] and in calibrating the delay
between two clock phases [89]. To understand this process of random sampling,
let the probe clock be modeled as a circle, since it is periodic with period T
(the circumference corresponding to T). The delay between the clock pair can
then be modeled as a sector of this circle. Since the period of the sampling
clock is greater than that of the input clocks, the sampling clock edges
precess around the circle (see Fig. 6.2). The problem is to estimate the size
d of the sector, and an estimate is given by:

$$ d = \frac{\text{No. of points in the sector corresponding to delay}}{\text{Total no. of points corresponding to time-period}} \times T \tag{6.1} $$
This is shown in Appendix A to yield an unbiased estimate of the delay, and it
is easy to see that the estimate has the least variance if the total number of
points spans an integer multiple of the circumference [62]. To calculate (6.1)
in practice, two counters are needed: one for the delay count and one for the
time-period count.
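The estimate (6.1) can be illustrated with a minimal Python sketch of the circle model. It is a jitter-free idealization, not the DMU hardware: sampling edge k is assumed to land at phase k(T + ∆T) mod T on the circle, and the sector is taken to be [0, d). The function name and the example values are assumptions made for illustration.

```python
from fractions import Fraction


def estimate_delay(period, delta, delay, num_samples):
    """Estimate the delay via (6.1) by counting sampling points in the sector.

    period      -- probe clock period T
    delta       -- difference dT between sampling and probe clock periods
    delay       -- true delay d, modeled as the sector [0, d) on the circle
    num_samples -- total number of sampling points (the period count)
    """
    period = Fraction(period)
    sampling_period = period + Fraction(delta)
    delay_count = 0
    for k in range(num_samples):
        # Position of the k-th sampling edge on the circle of circumference T.
        phase = (k * sampling_period) % period
        if phase < delay:
            delay_count += 1
    return Fraction(delay_count, num_samples) * period


if __name__ == "__main__":
    # One full trip around the circle: 16 points spaced dT = 1 apart.
    print(estimate_delay(16, 1, 2, 16))               # 2
    # A delay that is not a multiple of dT is quantized to within dT.
    print(estimate_delay(16, 1, Fraction(5, 2), 16))  # 3
```

Spanning a whole number of trips around the circle (16 or 32 points in this setting) gives the same estimate, in line with the least-variance remark above.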
If the delay d is a small fraction of the time period T, then the majority of
the sampling points do not contribute to the numerator of (6.1), limiting the
use of such data points. Also, in such cases the obtained SNR will be lower
than the maximum possible, as is evident from Fig. 5.2. This problem is
analogous to the situation of rare events in a Monte-Carlo simulation [90] and
to the need for automatic gain control (AGC) in ADCs to prevent their
under-utilization. A straightforward solution is to use high-frequency clocks
(low T), but this increases power consumption and does not work at low supply
voltages. Hence, it is interesting to apply techniques similar to rare-event
simulation and AGC for quicker delay estimation. Such a technique will also
improve the SNR achieved.
[Diagram labels: Point in first round; Sector of interest; Point in second round]
Figure 6.2: Illustration of the sampling clock precessing around the input clock. The circumference represents the time period of the input clock while the sector represents the delay to be measured. The asterisk-shaped points are the edges of the sampling clock.
6.1 Proposed Solution
Consider a toy example where the periods of the input clock and sampling clock
are 16 and 17 units respectively, with the test delay being 2 units. The
duration of a beat period is 16 × 17 = 272 units and the delay is amplified to
2 × 17 = 34 units. As a result, the period counter counts up to 16 (= 272/17)
and the delay counter up to 2 (= 34/17), which means that 14 samples out of 16
do not contribute much to the delay measurement. If a two-phase clock were
available, then, on detecting that the delay counter is no longer changing,
the other phase could be fed to the counter. The delay counter would then
count up to 4, which has to be divided by 2 since two phases were used.
Similarly, use of a 4-phase clock will yield a delay count of 8, which needs
to be divided by 4 to get the actual delay. Fig. 6.3 illustrates this example.
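The counts in this toy example can be reproduced with a small Python sketch. It is an idealized model under stated assumptions, not the counter hardware: there is no jitter, the phases are equally spaced, and switching to the next clock phase is modeled as repeating the delay sector at equally spaced positions on the circle of Fig. 6.2. The function name is hypothetical.

```python
def beat_counts(period, sampling_period, delay, num_phases):
    """Return (period_count, delay_count) accumulated over one beat period.

    Assumes integer time units, equally spaced phases, and
    num_phases * delay <= period so that the shifted sectors do not overlap.
    """
    delta = sampling_period - period
    period_count = period // delta   # sampling edges in one beat period
    spacing = period // num_phases   # spacing between adjacent phases
    delay_count = 0
    for k in range(period_count):
        phase = (k * sampling_period) % period
        # The edge is counted if it lands in the sector of any of the phases.
        if any((phase - j * spacing) % period < delay for j in range(num_phases)):
            delay_count += 1
    return period_count, delay_count


if __name__ == "__main__":
    for phases in (1, 2, 4):
        period_count, delay_count = beat_counts(16, 17, 2, phases)
        print(phases, period_count, delay_count, delay_count / phases)
```

For 1, 2 and 4 phases the delay count comes out as 2, 4 and 8 against a period count of 16, and dividing by the number of phases recovers the delay of 2 units in each case.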
In general, suppose an N-phase clock is available and we are interested in
measuring the delay of a DUT (Device Under Test). At the end of the first beat
period, the counts for the delay and time-period will be available. In each of
the subsequent beat periods, once the delay count saturates, the next
appropriate phase is calculated and fed to the DUT. As a result, the sector of
interest is scanned multiple times in a beat period, leading to reduced
measurement time for the same accuracy or improved accuracy for the same
measurement time. The flowchart describing this process is shown in Fig. 6.5.

[Plot panels: Period count; Delay count (Single phase); Delay count (2 phases); Delay count (4 phases). Horizontal axis: Time (No. of sampling periods), 0 to 35]
Figure 6.3: Counts corresponding to period and delay. Two-phase and four-phase clocks measure delay twice and four times in a beat period respectively, thus providing more accuracy in the same measurement time.
An issue, though, is that the phase spacings of the N phases may not be equal.
Let φ1, . . . , φN be the phases and let $d^i_1, \ldots, d^i_n$ and
$p^i_1, \ldots, p^i_n$ be the counts of delay and period respectively for each
phase in the ith beat period (here n ≤ N, since all the available phases may
not be used). We need an estimate for the delay d based on
$d^i_1, \ldots, d^i_n$ and $p^i_1, \ldots, p^i_n$. One possibility is to use

$$ d = \left( \frac{d^i_1}{p^i_1} + \cdots + \frac{d^i_n}{p^i_n} \right) \times \frac{T}{n} \tag{6.2} $$

This estimate is not accurate since the counts $p^i_1, \ldots, p^i_n$ can be
different due to unequal phase-spacings. A better estimate is given by:

$$ d = \frac{d^i_1 + \cdots + d^i_n}{p^i_1 + \cdots + p^i_n} \times \frac{T}{n} = \frac{d^i}{p^i} \times \frac{T}{n} \tag{6.3} $$

where $d^i = d^i_1 + \cdots + d^i_n$ and $p^i = p^i_1 + \cdots + p^i_n$. The
estimate of (6.3) is better since the sum of all phase-spacings is very close
to an integer multiple of the time period, modulo the jitter. The speed-up of
this scheme over the single-phase case is given by:

$$ \text{Speed-up},\; n = \min\left( N, \left\lfloor \frac{T}{d} \right\rfloor \right) \tag{6.4} $$

The plot of speed-up n versus the delay (d as a fraction of the time-period T)
is shown in Fig. 6.4. As can be seen from the plot, the speed-up normally goes
as the inverse of d/T and saturates at N, the number of phases available.
Hence, the improvement achieved by this scheme is larger for small delays and
smaller for large delays. Since the scheme based on the DMU inherently provides
a resolution-bandwidth trade-off, the speed-up obtained can be used to reduce
measurement time or increase accuracy over the single-phase scheme.

[Plot: speed-up (0 to 35) versus delay as a fraction of time-period, d/T (0 to 0.5), for N = 2, 4, 8, 16 and 32, together with the curve y = 1/x]
Figure 6.4: Plot of speed-up obtained corresponding to fraction of delay to time-period. Here N is the number of phases of clock available.
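Equation (6.4) is simple enough to evaluate directly. The sketch below does so in Python; the function name is illustrative, and the period and delay are assumed to be given as integers in a common unit so that the floor is exact. The example reuses the toy values of this section (T = 16 units, d = 2 units).

```python
def speed_up(num_phases, period, delay):
    """Speed-up n = min(N, floor(T / d)) from (6.4).

    period and delay are integers in the same unit, so that the floor
    division below is exact.
    """
    if delay <= 0:
        raise ValueError("delay must be positive")
    return min(num_phases, period // delay)


if __name__ == "__main__":
    for num_phases in (2, 4, 32):
        print(num_phases, speed_up(num_phases, 16, 2))
```

With 2 or 4 phases the speed-up is limited by the number of phases, while with 32 phases it saturates at T/d = 8, which is the behaviour described for Fig. 6.4.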
There will be an error in the measurement if a certain phase does not span
the delay completely. For instance, with an 8-phase clock, suppose the delay to
be measured corresponds to one-fourth of the time-period, and the rising edge
of the 7th phase is chosen. Since only one-eighth of the time-period is left,
only half the delay will be counted, leading to an error in the measurement.
Such cases can be avoided by having a conservative algorithm, wherein the delay
count is updated if and only if it saturates, which cannot happen unless the
chosen phase fully covers the delay. If a certain phase fails to span the delay
completely, the count corresponding to that phase is discarded. This approach
also takes care of phase mismatch by being conservative, but might lose out
slightly on speed-up.
6.2 Simulation
To verify the proposed idea, the system described above, with the algorithm
explained in the flowchart of Fig. 6.5, is implemented in the MATLAB Simulink
environment; the block diagram is shown in Fig. 6.6. An option is provided to
select a single-phase, 4-phase or 8-phase clock input to the DUT, which gives
out a pair of clock signals between which a delay is set up. This clock pair is
sampled by a sampling clock of slightly lower frequency than the input clock. A
pair of counters, one each for delay and period, is set up. The delay count is
the accumulated arithmetic difference between the pair of sub-sampled outputs,
while the period count is the accumulated number of rising edges of the
sampling clock in a period of the sub-sampled output. The ratio of the delay
count to the period count is an unbiased estimate of the delay d as a fraction
of the time-period T [62]. The Period count and Delay count blocks implement
the respective counters, while the Determine phase shift subsystem determines
the phase shift to be applied each time and generates control signals to reset
the counters for every new measurement.
6.3 Simulation Results
6.3.1 Case of Fixed Input Delay
Simulation results for the case of fixed delay are tabulated in Table 6.1 for
the parameters listed below the table. Note that in entries 4, 5 and 6 of
Table 6.1, the test delay of 0.5 ns is much smaller than the basic quantization
step of 4.72 ns. The mean and standard deviation are taken over 100
measurements with the same setting. A larger standard deviation of the measured
delay in the single-phase case means that although the average of the measured
delays across measurements is quite close to the input delay, certain
individual measurements may be off. Hence, measurements with a single-phase
clock will need more averaging to guarantee better accuracy, leading to
increased measurement time. The standard deviation is smaller when a
multi-phase input clock is employed, leading to reduced measurement time for a
given accuracy.
In general, the variance of the measured delay is inversely proportional to the
measurement time. Hence, if $\sigma_1^2$ and $\sigma_N^2$ are the variances of
a certain measured delay, and $T_1$ and $T_N$ are the respective measurement
times, then

$$ \frac{\sigma_1^2}{\sigma_N^2} = \frac{T_N}{T_1} \tag{6.5} $$

But, in this particular scheme, use of an N-phase input clock yields a
measurement of smaller variance. Hence, to achieve a given variance of the
measured delay, the scheme with an N-phase input clock takes less time. This
speedup is roughly given by

$$ \text{Speedup},\; n = \frac{\sigma_1^2}{\sigma_N^2} \tag{6.6} $$

Entries of Table 6.1 confirm that the speedup is greater for small delays and
is in close agreement with theoretically expected values.
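As a worked check of (6.6), the following Python sketch recomputes the speedup from standard deviations reported in Table 6.1 (entries 1 to 5). The function name is illustrative; the inputs are the rounded values printed in the table, so the results agree with the tabulated speedups only to the precision of that rounding.

```python
def speedup_from_std(sigma_single, sigma_multi):
    """Speedup of (6.6): ratio of single-phase to multi-phase variance."""
    return (sigma_single / sigma_multi) ** 2


if __name__ == "__main__":
    # (input delay in ns, phases, single-phase std in ns, multi-phase std in ns)
    entries = [
        (20.0, 4, 1.71, 0.87),
        (20.0, 8, 1.71, 0.85),
        (0.5, 4, 1.48, 0.64),
    ]
    for delay_ns, phases, sigma_1, sigma_n in entries:
        print(delay_ns, phases, round(speedup_from_std(sigma_1, sigma_n), 2))
```

The three ratios round to 3.86, 4.05 and 5.35, matching the corresponding speedup entries of Table 6.1.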
6.3.2 Case of Slowly Varying Input Delay
It is shown in [56] that such delay measurement schemes can also handle slowly
varying delays without the explicit use of sample-and-hold circuitry. However,
Table 6.1: Summary of Measured Results for fixed input delay

Sl. No.   Input Delay (ns)   No. of phases   Measured Delay (ns)         Speedup over
                                             Mean     Std. deviation     single phase
1         20                 1               20.00    1.71               1
2         20                 4               19.94    0.87               3.86
3         20                 8               20.05    0.85               4.05
4         0.5                1               0.55     1.48               1
5         0.5                4               0.55     0.64               5.35
6         0.5                8               0.50     0.50               8.36

Setting:
Input clock frequency, fc = 10 MHz
Sampling clock frequency, fs = 9.55 MHz
Duration of beat period, Tb = 2.218 µs
Basic quantization step, ∆T = 4.72 ns
Jitter in both input and sampling clocks = 100 ps.
Table 6.2: Summary of Measured Results for slowly varying delay

Sl. No.   Input Delay Amplitude (ns)   No. of phases   Measured Delay SNR (dB)   Improvement in SNR
                                                                                 over single phase (dB)
1         5                            1               6.50                      0
2         5                            4               12.99                     6.49
3         5                            8               17.59                     11.09
4         15                           1               15.56                     0
5         15                           4               18.85                     3.29
6         15                           8               18.63                     3.07

Setting:
Input sine frequency, fin = 484.15 Hz
Oversampling ratio, OSR = 465.45
the varying input, say a sine wave, should be suitably oversampled so that the
test delay will not change by more than an LSB during the measurement time.
If d = A sin(2πfin t) is the test delay input and Tb is the measurement time,
then the largest change in d over the measurement time is 2πA fin Tb, and
requiring this to be no more than an LSB gives

$$ 2\pi A f_{in} T_b \le a $$
$$ f_{in} \le \frac{a}{2\pi A T_b} \tag{6.7} $$

where a is the LSB. With Tb = 2.218 µs for 8 bits, fin can be at most 560 Hz.
Choosing fin = 484.15 Hz (see footnote 1), we have the results for the
different numbers of input clock phases summarized in Table 6.2, with the other
settings the same as those of Table 6.1. The SNR numbers reported here are
before low-pass filtering, and hence the SNR will improve after filtering and
decimation. The results clearly show that the SNR for small test delays
improves in the case of multiple phases, although the increase may not be
substantial for large test delays.
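The 560 Hz figure can be reproduced from (6.7) under one assumption that is not stated explicitly in the text: the sine wave of amplitude A spans the full 8-bit range, so that the LSB is a = 2A/2^8. The Python sketch below uses illustrative names.

```python
import math


def max_input_frequency(num_bits, measurement_time):
    """Upper bound on f_in from (6.7), assuming a = 2A / 2**num_bits.

    With that assumption the amplitude A cancels: a / A = 2 / 2**num_bits.
    """
    lsb_over_amplitude = 2.0 / (2 ** num_bits)
    return lsb_over_amplitude / (2.0 * math.pi * measurement_time)


if __name__ == "__main__":
    f_max = max_input_frequency(num_bits=8, measurement_time=2.218e-6)
    print(f"{f_max:.1f} Hz")  # about 560.6 Hz
```

Under this assumption the bound evaluates to roughly 560.6 Hz, consistent with the value quoted above.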
6.4 Conclusions
The method of time measurement using the sub-sampling based DMU is briefly
revisited and its limitation in measuring small delays is described. A solution
to improve the speed (and/or accuracy) by making use of a multiphase probe
clock is described. Simulation results from the MATLAB Simulink environment
demonstrate a speed-up of up to a factor of eight achieved by an eight-phase
input clock for fixed test delays and an improvement in SNR of up to 11 dB for
slowly varying test delays.
Footnote 1: Chosen to ease computation of the FFT by eliminating spectral leakage [91].
[Flowchart boxes, in order of appearance: Get p0, d0 and N; Start phase, k = 0; Start counters, di = 0, pi = 0, n = 0; Update counters di and pi; decision "Did di saturate?"; Advance phase, k = k + ⌈N di / p0⌉, n = n + 1; decision "Is k > N?"; Save di, pi and n; Next sample, i = i + 1. Each decision has Yes and No branches.]
Figure 6.5: Flowchart for the proposed scheme

Symbols used in Fig. 6.5:
N    Number of clock phases available
p0   Period count in first beat period
d0   Delay count in first beat period
pi   Period count in ith beat period
di   Delay count in ith beat period
n    Number of clock phases used
k    Phase counter, ranges from 0 to N − 1
[Simulink block diagram. Legible block names include: Multiphase Clock (4-Phase Clock and 8-Phase Clock), Multiport Switch, Variable Transport Delay, D Flip-Flop, Period count, Delay count, Get period, Determine next phase, Counter, Compare To Constant, Memory, Triggered Subsystem, Subtract, Sine Wave and Scope]
Figure 6.6: Block diagram implemented in MATLAB Simulink
Chapter 7
Example Application
The technique of observing internal analog voltages described earlier in Chapter 4
is used in the power scalable receiver implementation in UMC 130 nm process,
the block diagram of which is shown in Fig. 7.1.
7.1 Power Scalable Receiver Implementation
Fig. 7.1 shows the block diagram of a low-IF receiver designed for the ZigBee
standard, the IF (intermediate frequency) being 3 MHz. The key innovation of
the design is in sensing the strength of the signal and interference, and accordingly
switching the receiver to low power modes.
The power-scalable receiver has an RSSI (Received Signal Strength Indicator)
block, which senses the strength of the received signal, and an RISI (Received
Interference Strength Indicator) block, which senses the strength of the
interference. When the received signal is strong and the interference is weak,
the various blocks in the receiver chain, namely the LNA, VGA and ADC, switch
to low power modes as indicated by the RSSI and RISI blocks. The details of the
approach and implementation are discussed in [36, 92].
The receiver chain is designed to work at a supply of 0.8 V and, as a result,
the common mode voltage is expected to be around 400 mV. In this test chip, the
common mode voltage at the output of the mixer was not controlled, as the
common mode feedback (CMFB) circuitry was implemented as part of the VGA. As a
result, knowledge of the common mode voltage at the output of the mixer helps
the designer while testing/de-bugging. As shown in Fig. 7.1, the I (in-phase)
path of the mixer output is sampled using the sampling head (SpH), reproduced
in Fig. 7.2 for convenience. If the common mode is found to be far from what is
desired, it has to be controlled from outside the chip.
Similarly, at the input of the ADC, the VGA sets the common mode. But
if this common mode is far off from what is desired, the dynamic range of the
ADC gets affected. Again, as shown in Fig. 7.1, the common mode voltage of I
(in-phase) and Q (quadrature) paths is found (using two large resistors so that
they do not load the paths) and is sampled using the sampling head of Fig. 7.2.
Hence, again knowledge of the common mode voltage at the input of the ADC
comes in handy, and if found to be far off can be adjusted externally.
In summary, the placement of sampling heads (SpH) at internal test nodes
helps in testing/de-bugging during design phase and can be used for production
testing or in-use monitoring post manufacturing.
7.2 BIST Implementation
The method described needs a voltage-controlled delay which takes up minimal
area and is reasonably linear in the range of test voltages. A comparison of
different voltage-to-delay converters is given in Table 4.1. The popular
architecture of current-starved inverters is used to design the
voltage-controlled delay cells, one each for voltages close to GND and close to
VDD, as shown in Fig. 7.3a and 7.3b respectively. The cell of Fig. 7.3a is used
at both the places shown in Fig. 7.1 since the common mode of 400 mV is closer
to GND than VDD. The cell of Fig. 7.3b is used to monitor an internal bias
voltage node in the RSSI block of Fig. 7.1 (not shown explicitly to avoid
clutter), which is expected to be at 800 mV.
Transistors M6 of Fig. 7.3a and M4 of Fig. 7.3b are tied to the power rails to
make sure that the delay does not blow up for test voltages close to either
power rail, and also to improve the linearity of the delay cell. Transistors M7
and M8 of both cells help improve the slew rate and make the edges more immune
to noise.
The above-mentioned delay cells are employed in the system as shown in
Fig. 7.4. The pair of delay cells convert the voltage difference (Vin - Vref)
to a delay difference between the pair of delayed clock outputs. To measure
this delay difference accurately, both the delayed clock outputs are sampled by
another clock, whose frequency is slightly less than that of the probe clock.
As a result, the outputs of the pair of flip-flops are a pair of sub-sampled
signals whose frequency is the difference between that of the probe and
sampling clocks. It also turns out that the delay difference set up by the pair
of delay cells is now expanded and can be measured using an up/down counter.
For the purposes of this tape-out, the pair of sub-sampled signals are brought
out through the pin and will be analyzed off-chip using an FPGA. Also, to
correct for the possible non-linearity of the delay cells, a provision is made
for calibration prior to actual testing by providing an analog multiplexer to
select between the calibration voltage and the test voltage.

[Block diagram labels: ANTENNA, LNA, MIXER (I and Q paths), FILTER, VGA, VGA, ADC, To Digital Baseband, PLL, VCO, RSSI, RISI, Signal Strength Dependent Controls, Interference Strength Dependent Controls, SpH, SpH]
Figure 7.1: Block diagram of power-scalable receiver. Courtesy: Kaushik Ghosal

Figure 7.2: Sampling head (SpH)

[Schematic labels in both panels: transistors M0 to M9, Vin, Probe clock In, Probe clock Out]
(a) Voltage controlled delay cell - for voltages close to GND.
(b) Voltage controlled delay cell - for voltages close to VDD.
Figure 7.3: Architecture of voltage controlled delay cells
As shown in Fig. 7.4, the extra pins needed for this testing procedure are as
follows:
• Calibration Voltage (Vcal)
• Probe Clock
• Sampling Clock
• Pair of output beat signals
• Multiplexer select signals

[Block diagram labels: two V2D cells (Vin, Clkout), two DFFs (D, Q), Vref, S1, S2, Sampling Clock, Probe Clock, Clock Enable, Vin, Vtest, Vcal, Cal/Test]
Figure 7.4: Block Diagram of the BIST Setup (S: Sub-sampled signal)
However, it is to be noted that calibration voltage and probe clock inputs need to
come from outside. Sampling clock can be generated from the probe clock using
a PLL as described in Chapter 5. The pair of output beat signals feed into the
on-chip DMU, described in Chapter 4, the output of which is read out via the
digital test infrastructure. The multiplexer select signals need to be given from
the BIST control unit. The total area of the BIST setup shown in Fig. 7.4 is
about 60 µm × 23 µm.
7.3 Simulation Results
The die micrograph along with the layout snapshot of the implemented BIST
block is shown in Fig. 7.5. The plots of differential delay versus differential
voltage for the delay cells of Fig. 7.3a and Fig. 7.3b are shown in Fig. 7.6.
The plots clearly reveal that the circuit of Fig. 7.3a gives higher sensitivity
for voltages near GND and that of Fig. 7.3b for voltages near VDD. A sample of
the variation of this delay across process and temperature is also shown in the
plots of Fig. 7.6.
Figure 7.5: Die micrograph of the power scalable receiver implementation with the layout snapshots of BIST blocks inserted
7.4 Conclusions
Implementation of a BIST scheme to observe the common mode voltage of analog
circuitry in a test chip of a power-scalable receiver fabricated in UMC 130 nm
is described in this chapter. The design of the voltage-to-delay cells and control
circuitry are described along with simulation results from Cadence environment.
The voltage-to-delay cell described here is better than the one used earlier, as is
clear from the simulation results.
[Plots (a) and (b): Delay Difference (ns) versus Vin (V), 0 to 1.4 V, with curves for 27°C TT, 27°C FNSP, 127°C TT and 127°C FNSP]
(a) Plot of difference delay versus differential voltage
(b) Plot of difference delay versus differential voltage
Figure 7.6: Simulation Results
Chapter 8
Conclusions
The overall problem of analog testing in an SoC environment which generalizes
well across different classes of analog circuits and offers concurrent testing is still
an open issue. The availability of processing power, especially in terms of digital
processing can be leveraged to design low cost test strategies. In this thesis, a
method of enabling BIST for analog IPs in an SoC setting is developed. The
main goal of the solution was to go towards an all-digital approach to benefit
from technology scaling.
8.1 Contributions
In order to meet the said goals, a simple low cost ‘digitizer’ is developed instead
of a full blown ADC. This ‘digitizer’ is composed of two parts - a sampling head
(SpH) to convert test voltage to delay on a pair of low frequency clock signals,
and a DMU to measure the delay thus set up. Owing to the simplicity and low
area overhead of the SpH, multiple test points can be observed by replicating
the SpH at each test node. Therefore, the sampling heads give rise to as many
low frequency clock pairs as there are test nodes. A multiplexer selects the pair
of low frequency signals to be fed to the DMU based on the test node to be
tested/monitored. A key feature of such an approach is that the test voltage is
always connected to the SpH, thereby avoiding insertion of switches in the signal
path which can potentially degrade system performance.
The sub-sampling approach of delay measurement, introduced in Chapter 3
and applied to the problem of measuring analog voltages in Chapter 4, requires
a probe clock as a carrier of the delay and a sampling clock (of slightly
different frequency) to sample this clock pair carrying the delay between them.
A strategy of generating the sampling clock from the probe clock, so as to
minimize the number of pins that connect to the tester, is described in
Chapter 5 along with the derivation of design parameters. A technique of
speeding up such a measurement using multiple phases of a clock is described in
Chapter 6.
8.2 Scope for future work
The technique of measuring low-frequency analog voltages described in this
thesis offers very low area overhead, and is all-digital in nature and
therefore benefits from further technology scaling. It also provides a very
fine test resolution of about a millivolt, as demonstrated by measured results
from a test chip. Also, the technique offers a trade-off between measurement
time and resolution achieved, and can thereby potentially be sped up for
quicker and coarser testing. Although these features make the proposed
technique promising for deep sub-micron CMOS processes, there are a few
limitations of this technique, which are presented next.
Production testing of integrated chips needs techniques to quickly determine
if the chip is a ‘pass’ or ‘fail’. Every millisecond spent on the tester to make
this decision adds to the cost, as pointed out in Section 1.2. Although a
technique of speeding up measurements was described in Chapter 6, testing one
node after another serially
as described in this thesis may be costly in such situations. A work-around
for this limitation can be to use multiple control units, like the one shown in
Fig. 4.1, controlling different sets of test nodes; similar to the approach of using
multiple scan chains to speed up digital circuit testing avoiding additional pin
overheads [93].
Another limitation is that the dynamic range of analog voltages available for
testing is limited by the linearity of the V2D cell used. As it is hard to get
V2D cells to behave linearly over a wide range of voltages, different V2D
designs have to be adopted based on the range of the particular test node. One
way of mitigating this non-linearity is calibration, as suggested in the
thesis. But such a calibration would need a few analog voltages to be
generated. It would be desirable to generate those voltages in a digital
manner, by controlling the duty cycle of a clock for example, or by similar
approaches. Such an on-chip technique would reduce the burden on the tester to
provide these calibration voltages.
Another solution to the said problem of non-linearity of the V2D cell can be to
quickly check if the test delay corresponding to the test voltage is in the
range of, say, (dlow, dhigh), where dlow and dhigh are the delays that
correspond to voltages Vlow and Vhigh respectively. Such a method, which can be
called a Go/No-go test, circumvents the non-linearity of the V2D cell. However,
the voltages Vlow and Vhigh need to be generated or given by the tester.
The technique based on sub-sampling presented in the absence of sample-and-
hold circuitry works well for near-DC signals and does not suit high frequency
signals directly. In order to test high frequency signals, they can be directly sub-
sampled if the signal is periodic as in [17]. Otherwise, the information in the
signal needs to be converted to a DC signal as in [5] or impressed on to a periodic
signal. Such a scheme can be extended to characterize the frequency response of
DUTs.
Although a behavioral model is developed for the system and analyzed to
obtain the design parameters for best performance, the jitter in the clocks which
directly affects the quantization levels is not modeled in this thesis. Such a
modeling of jitter would help in better understanding the gap present between
analytical and measured results, and would also enable the designer to come up
with acceptable jitter numbers for such techniques.
Appendix A
Unbiased Delay Estimator
In this appendix, we substantiate the claim that the asynchronous sub-sampling
approach described in Section 4.4.1 leads to an unbiased estimate of delay.
Let $T_1$, $T_2$ be the times within a clock period when the clock pair Dai,
Dbi cross the logic high threshold and let $T_s$ be the time when the sampling
clock crosses the sampling threshold. Due to jitter, these are random
variables. Without loss of generality, let the mean of $T_1$ be zero. The mean
of $T_2$ is δ, the quantity to be estimated.
Let
$$ \tilde{T}_2 = T_2 - \delta $$
and let
$$ T_s = t_s + \tilde{T}_s \tag{A.1} $$
where $t_s$ is the mean value of $T_s$, and $\tilde{T}_s$ is the random
component.
It is of interest to determine the probability that the samplers sample a logic
high. A sampler samples a logic high if the sampling edge occurs later than the
clock edge. Hence,
$$ P(q_1 = 1) = P(T_1 < T_s) = P(T_1 - \tilde{T}_s < t_s) \tag{A.2} $$
Let $Z_1 = T_1 - \tilde{T}_s$. Let $\Phi_1(\cdot)$ be the CDF of $Z_1$. From (A.2),
$$ P(q_1 = 1) = P(Z_1 < t_s) = \Phi_1(t_s) \tag{A.3} $$
Let $Z_2 = \tilde{T}_2 - \tilde{T}_s$. Let $\Phi_2(\cdot)$ be the CDF of $Z_2$. Then,
$$ P(q_2 = 1) = P(\tilde{T}_2 + \delta < T_s) = P(Z_2 < t_s - \delta) \tag{A.4} $$
$$ = \Phi_2(t_s - \delta) \tag{A.5} $$
The output of the delay measurement unit is given as
$$ S = \frac{1}{2^k} \sum_{i=1}^{2^k} X_i \tag{A.6} $$
with $X_i = q_1^i - q_2^i$, the difference of the ith samples. Hence,
$$ X_i = \begin{cases} 1 & \text{if } q_1^i = 1 \;\&\; q_2^i = 0 \\ 0 & \text{if } q_1^i = q_2^i \\ -1 & \text{if } q_1^i = 0 \;\&\; q_2^i = 1 \end{cases} \tag{A.7} $$
The expectation of $X_i$ is given by
$$ \begin{aligned} E[X_i] &= P(q_1^i = 1, q_2^i = 0) - P(q_1^i = 0, q_2^i = 1) \\ &= \Phi_1(t_s^i) - \Phi_2(t_s^i - \delta) \end{aligned} \tag{A.8} $$
assuming $q_1^i$ and $q_2^i$ are independent of one another.
The variance of $X_i$ is given by
$$ \begin{aligned} \mathrm{var}[X_i] &= P(q_1^i = 1, q_2^i = 0) + P(q_1^i = 0, q_2^i = 1) - (E[X_i])^2 \\ &= \Phi_1(t_s^i)\,(1 - \Phi_1(t_s^i)) + \Phi_2(t_s^i - \delta)\,(1 - \Phi_2(t_s^i - \delta)) \\ &\le \frac{1}{2} \end{aligned} \tag{A.9} $$
Let the clock period be T and the sampling clock period be T + ∆T, where
T = N∆T + α, N is an integer and 0 < α < ∆T. This causes the sampling
edge to fall uniformly across the entire period of the sampled signal to create
one beat period. Let the measurement be taken over M beat periods, so
$MN = 2^k$.
Hence, (A.6) can be rewritten as:
$$ S = \frac{1}{MN} \sum_j \sum_k X_{jk} \tag{A.10} $$
Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_M)$ be the starting phases in each beat period. Then
$$ E_{S|\alpha}[S|\alpha] = \frac{1}{MN} \sum_j \sum_k E[X_{jk}(\alpha_j + k\Delta T)] \tag{A.11} $$
Substituting from (A.8), applying the law of iterated expectation (see
footnote 1) and reordering the summation, we get
$$ E[S] = E_\alpha\big[E_{S|\alpha}[S|\alpha]\big] = \frac{1}{N} \sum_k \frac{1}{M} \sum_j E\big[\Phi_1(\alpha_j + k\Delta T) - \Phi_2(\alpha_j + k\Delta T - \delta)\big] \tag{A.12} $$
Since the $\alpha_j$ are uniform over 0 to ∆T (with PDF $1/\Delta T$), the
inner expectation is identical for each j and can be evaluated as the following
integral:
$$ E[S] = \frac{1}{N} \sum_k \frac{1}{\Delta T} \int_{k\Delta T}^{(k+1)\Delta T} [\Phi_1(t) - \Phi_2(t - \delta)]\,dt \tag{A.13} $$
The above summation can be replaced by an integral over the entire clock period
T, as follows:
$$ E[S] = \frac{1}{N\Delta T} \int_{\langle T \rangle} [\Phi_1(t) - \Phi_2(t - \delta)]\,dt \tag{A.14} $$
However, if we assume that the skew δ and the jitter of the clocks are small
compared to the period T, then the limits of the integration can be replaced by
±∞, and noting that N∆T ≈ T, the integral becomes
$$ E[S] = \frac{1}{T} \int_{-\infty}^{\infty} [\Phi_1(t) - \Phi_2(t - \delta)]\,dt \tag{A.15} $$
Footnote 1: $E[X] = E_Y\big[E_{X|Y}[X \mid Y]\big]$ [94]
But, if the skew $\delta$ and/or jitter $\sigma$ is not a small fraction of $T$, then (A.15) yields
an under-estimate of $\delta$. In general, evaluating this integral is difficult. However,
in this particular case, we can resort to the trick of 'differentiating under
the integral sign' (Leibniz integral rule) [95]:
$$\frac{d}{dx}\left(\int_a^b f(x, t)\, dt\right) = \int_a^b \frac{\partial}{\partial x} f(x, t)\, dt \tag{A.16}$$
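Taking $x = \delta$ and $f = \Phi_1(t) - \Phi_2(t - \delta)$, from (A.15) and (A.16), we have
$$\frac{dE[S]}{d\delta} = \frac{1}{T} \int_{-\infty}^{\infty} \Phi_2'(t - \delta)\, dt = \frac{1}{T} \tag{A.17}$$
since the term inside the integral is a PDF and integrates to unity. Integrating with
respect to $\delta$, and noting from (A.15) that $E[S] = 0$ at $\delta = 0$ when $\Phi_1 = \Phi_2$, we get
$$E[S] = \frac{\delta}{T} \tag{A.18}$$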
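As a numerical sanity check of this result, the integral over one period can be evaluated for an assumed jitter model. The sketch below assumes zero-mean Gaussian jitter (so that $\Phi_1$ and $\Phi_2$ are Gaussian CDFs), models only a single clock edge centred in a window of length $T$, and uses a plain midpoint rule. The function names and numerical values are illustrative and do not come from the thesis.

```python
import math


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def expected_s(delta, period, sigma1, sigma2, steps=20000):
    """Midpoint-rule value of (1/T) * integral of Phi1(t) - Phi2(t - delta)
    over a window of one period centred on the clock edge."""
    lower = -period / 2.0
    h = period / steps
    acc = 0.0
    for i in range(steps):
        t = lower + (i + 0.5) * h
        acc += gaussian_cdf(t, sigma1) - gaussian_cdf(t - delta, sigma2)
    return acc * h / period


# Hypothetical normalised values: T = 1, delta = 0.05 T.
small_jitter = expected_s(0.05, 1.0, 0.01, 0.02)  # jitter << T
large_jitter = expected_s(0.05, 1.0, 0.3, 0.3)    # jitter comparable to T
print(small_jitter, large_jitter)
```

With jitter much smaller than the period, the result should sit very close to $\delta / T$. With jitter comparable to the period, the finite window truncates the CDF tails and the value falls below $\delta / T$, which is the under-estimate mentioned in the text.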
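The variance of the estimator $S$ is given by
$$\operatorname{var}[S] = \frac{1}{2^{2k}} \sum_{i=1}^{2^k} \operatorname{var}[X_i] \tag{A.19}$$
From (A.9),
$$\operatorname{var}[S] \le \frac{1}{2^{2k}} \, 2^k \times \frac{1}{2} = \frac{1}{2^{k+1}} \tag{A.20}$$
The standard deviation of the estimator is given by
$$\sigma[S] \le \frac{1}{\sqrt{2^{k+1}}} \tag{A.21}$$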
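The mean (A.18) and the bound (A.21) can be illustrated with a small Monte Carlo experiment. The sketch below is not the measurement setup of the thesis: it assumes independent zero-mean Gaussian jitter for $Z_1$ and $Z_2$, draws the sampling phase uniformly over one period centred on the clock edge (rather than stepping it by $\Delta T$), and models only the rising edge. All names and parameter values are hypothetical.

```python
import math
import random


def std_bound(k):
    """Upper bound on the standard deviation of S from (A.21)."""
    return 1.0 / math.sqrt(2 ** (k + 1))


def estimate_skew(delta, period=1.0, sigma=0.01, k=16, seed=0):
    """Monte Carlo version of S in (A.6): the average of X_i = q1 - q2
    over 2**k samples."""
    rng = random.Random(seed)
    n = 2 ** k
    total = 0
    for _ in range(n):
        # Sampling phase, assumed uniform over one period around the edge.
        t_s = rng.uniform(-period / 2.0, period / 2.0)
        z1 = rng.gauss(0.0, sigma)  # Z1, assumed Gaussian
        z2 = rng.gauss(0.0, sigma)  # Z2, assumed Gaussian and independent of Z1
        q1 = 1 if z1 < t_s else 0          # as in (A.3)
        q2 = 1 if z2 < t_s - delta else 0  # as in (A.4)
        total += q1 - q2
    return total / n


s_hat = estimate_skew(delta=0.05, k=16)
print("estimate of delta/T:", s_hat)
print("bound on std of S:  ", std_bound(16))
```

Under these assumptions the estimate should land within a few multiples of the bound around $\delta / T$. The bound is loose here, because $X_i$ is non-zero only when the sampling edge falls near the two clock edges.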
Appendix B
Noise in Inverter Chain
As described in Chapter 4, current starved inverters are employed as the voltage-
controlled-delay elements. The voltage information is thus encoded in the delay
of the clock signal passing through the element. Thus, jitter accumulated on the
clock signal while propagating through the delay element manifests eventually as
noise in voltage domain.
The variance of jitter added by a delay element is given by [96]
$$\sigma_j^2 = \frac{4kT\gamma_N t_d}{I_N (V_{DD} - V_t)} \tag{B.1}$$
where
$\sigma_j$    Standard deviation of jitter
$k$    Boltzmann’s constant
$T$    Temperature in Kelvin
$t_d$    Propagation delay of inverter
$I_N$    Inverter current
$V_{DD}$    Supply voltage
$V_t$    Threshold voltage
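To get a feel for the magnitudes in (B.1), the sketch below evaluates the expression at one operating point. The numbers are illustrative placeholders rather than values from the thesis, and $\gamma_N$ (which the list above does not define) is treated simply as a dimensionless input.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def jitter_variance(gamma_n, t_d, i_n, v_dd, v_t, temperature=300.0):
    """Jitter variance in s^2 of one delay element, following (B.1)."""
    return 4.0 * BOLTZMANN * temperature * gamma_n * t_d / (i_n * (v_dd - v_t))


# Hypothetical operating point: 100 ps delay, 100 uA, 1.2 V supply, 0.4 V threshold.
sigma_j = math.sqrt(
    jitter_variance(gamma_n=2.0 / 3.0, t_d=100e-12, i_n=100e-6, v_dd=1.2, v_t=0.4)
)
print(f"sigma_j = {sigma_j * 1e15:.1f} fs")
```

Because the variance is proportional to $t_d$, doubling the propagation delay at a fixed current doubles the jitter variance.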
Let $\sigma_j^2 = \beta^2 t_d$. Suppose in a series of $m$ inverters, one is selected to provide
differential delay while the others nominally do not provide differential delay. Let
$\tau_1$ and $\tau_2$ be the delays seen by the pair of clocks. Let $D$ and $d$ respectively be
the maximum and minimum delay of each delay cell and let $\xi_j \sim N(0, 1)$ be unit
variance zero-mean Gaussian random variables. Then,
$$\begin{aligned}
\tau_1 &= D + (m-1)d + \beta\sqrt{D}\,\xi_1 + \beta\sqrt{(m-1)d}\,\xi_2 && \text{(B.2)} \\
&= D + (m-1)d + \beta\sqrt{D + (m-1)d}\,\xi_3 && \text{(B.3)} \\
\tau_2 &= md + \beta\sqrt{md}\,\xi_4 && \text{(B.4)}
\end{aligned}$$
The quantity of interest is
$$\begin{aligned}
\tau = \tau_1 - \tau_2 &= D - d + \beta\sqrt{D + (m-1)d + md}\,\xi_5 && \text{(B.5)} \\
&= D - d + \beta\sqrt{D - d + 2md}\,\xi_5 && \text{(B.6)}
\end{aligned}$$
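The signal-to-noise ratio of this setup is given by
$$\begin{aligned}
\mathrm{SNR} &= \frac{(D - d)^2}{\beta^2 (D - d + 2md)} && \text{(B.7)} \\
&= \frac{D - d}{\beta^2} \left( \frac{1}{1 + \frac{2md}{D - d}} \right) && \text{(B.8)} \\
&\approx \frac{D - d}{\beta^2} \left( 1 - \frac{2md}{D - d} \right) && \text{(B.9)}
\end{aligned}$$
which clearly shows that SNR decreases as $m$ increases.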
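The dependence of (B.7) on $m$ can be tabulated directly. The sketch below uses hypothetical delay values and an arbitrary $\beta^2$. It also evaluates the first-order form (B.9), which is only a reasonable approximation while $2md \ll D - d$.

```python
def snr_exact(big_d, small_d, m, beta_sq):
    """SNR from (B.7): squared differential delay over noise variance."""
    signal = big_d - small_d
    return signal ** 2 / (beta_sq * (signal + 2 * m * small_d))


def snr_first_order(big_d, small_d, m, beta_sq):
    """First-order expansion from (B.9)."""
    signal = big_d - small_d
    return signal / beta_sq * (1 - 2 * m * small_d / signal)


# Hypothetical cell: D = 1 ns, d = 10 ps, beta^2 = 1.4e-16 s.
for m in (1, 2, 4, 8, 16):
    exact = snr_exact(1e-9, 10e-12, m, 1.4e-16)
    approx = snr_first_order(1e-9, 10e-12, m, 1.4e-16)
    print(m, exact, approx)
```

The exact value falls monotonically with $m$, and the first-order form drifts further below it as $2md$ grows relative to $D - d$.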
References
[1] P. K. Das, “Precise on-chip clock skew measurement using subsampling and
applications,” Ph.D. dissertation, Indian Institute of Science, Bangalore,
Karnataka, India, February 2012. xiv, 29
[2] P. Kabisatpathy, A. Barua, and S. Sinha, Fault Diagnosis of Analog Inte-
grated Circuits. Springer, 2005. 1
[3] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for Digi-
tal, Memory and Mixed-Signal VLSI Circuits. Kluwer Academic Publishers,
2000. 1, 2, 6, 13
[4] M. Abramovici, M. A. Breuer, and A. D. Friedman, Digital Systems Testing
and Testable Design. IEEE Press, 2001. 2
[5] G. Banerjee, M. Behera, M. A. Zeidan, R. Chen, and K. Barnett, “Analog/rf
built-in-self-test subsystem for a mobile broadcast video receiver in 65-nm
cmos,” IEEE Journal of Solid-State Circuits, vol. 46, no. 9, pp. 1998–2008,
September 2011. 2, 6, 9, 93
[6] G. W. Roberts and S. Aouini, “Mixed-signal production test: A measure-
ment principle perspective,” IEEE Design and Test of Computers Magazine,
vol. 26, pp. 48–62, September/October 2009. 3, 5, 6, 12
[7] S. Sindia, “High sensitivity signatures for test and diagnosis of analog, mixed-
signal and radio-frequency circuits,” Ph.D. dissertation, Auburn University,
August 2013. 5
[8] W. M. Lindermier, “Design of robust test criteria in analog testing,” in
ICCAD Digest of Technical Papers, 1996, pp. 604–611. 6
[9] S. Sunter and N. Nagi, “Test metrics for analog parametric faults,” in Pro-
ceedings of the 17th IEEE VLSI Test Symposium, 1999, pp. 226–234. 6
[10] N. B. Hamida and B. Kaminska, “Multiple fault analog circuit testing by
sensitivity analysis,” Journal of Electronic Testing: Theory and Applications
- Joint special issue on analog and mixed-signal testing, vol. 4, no. 4, pp. 331–
343, November 1993. 6
[11] Z. Guo and J. Savir, “Analog circuit test using transfer function coefficient
estimates,” in Proceedings of the IEEE International Test Conference, 2003,
pp. 1155–1163. 6
[12] B. G. Streetman and S. Banerjee, Solid State Electronic Devices. PHI
Learning, 2009. 6
[13] C.-L. Wey and S. Krishnan, “Built-in self-test (bist) structure for analog
circuit fault diagnosis,” IEEE Transactions on Instrumentation and Mea-
surement, vol. 39, no. 3, pp. 517–521, June 1990. 7
[14] D. Vazquez, J. L. Huertas, and A. Rueda, “Reducing the impact of dft on
the performance of analog integrated circuits: improved sw-op amp design,”
in Proceedings of the 14th VLSI Test Symposium, 1996, pp. 42–47. 7
[15] L. S. Milor, “A tutorial introduction to research on analog and mixed-signal
circuit testing,” IEEE Transactions on Circuits and Systems II: Analog and
Digital Signal Processing, vol. 45, no. 10, pp. 1389–1407, October 1998. 8, 9
[16] D. D. Venuto and M. J. Ohletz, “On-chip test for mixed-signal asics using
two-mode comparators with bias-programmable reference voltages,” Journal
of Electronic Testing: Theory and Applications, vol. 17, no. 3-4, pp. 243–253,
June 2001. 8
[17] Y. Zheng and K. L. Shepard, “On-chip oscilloscopes for noninvasive time-
domain measurement of waveforms in digital integrated circuits,” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 11, no. 3,
pp. 336–344, June 2003. 8, 30, 93
[18] R. Ho, B. Amrutur, K. Mai, B. Wilburn, T. Mori, and M. Horowitz, “Appli-
cations of on-chip samplers for test and measurement of integrated circuits,”
in Symposium on VLSI Circuits Digest of Technical Papers, 1998, pp. 138–
139. 8, 31
[19] A. Chatterjee, B. C. Kim, and N. Nagi, “Dc built-in self-test for linear analog
circuits,” IEEE Design & Test of Computers, vol. 13, no. 2, pp. 26–33,
Summer 1996. 8
[20] M. Negreiros, “Low cost bist techniques for linear and non-linear analog
circuits,” Ph.D. dissertation, Universidade Federal Do Rio Grande Do Sul,
July 2005. 9, 11
[21] M. Slamani and B. Kaminska, “Multifrequency analysis of faults in analog
circuits,” IEEE Design & Test of Computers, vol. 12, no. 2, pp. 70–80,
Summer 1995. [Online]. Available: http://dx.doi.org/10.1109/54.386008 9
[22] M. J. Ohletz, “Hybrid built-in self-test (hbist) for mixed analog/digital in-
tegrated circuits,” in European Test Symposium, 1991. 9
[23] B. Provost and E. Sanchez-Sinencio, “On-chip ramp generators for mixed-
signal bist and adc self-test,” IEEE Journal of Solid-State Circuits, vol. 38,
no. 2, pp. 263–273, February 2003. 9
[24] S. Sasho and M. Shibata, “Multi-output one-digitizer measurement,” in Pro-
ceedings IEEE International Test Conference 1998, Washington, DC, USA,
October 18-22, 1998. IEEE Computer Society, 1998, p. 258. 9
[25] S. Callegari, F. Pareschi, G. Setti, and M. Soma, “Complex oscillation-based
test and its application to analog filters,” IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 57, no. 5, pp. 956–969, May 2010. 9, 11
[26] J. M. da Silva and J. S. Matos, “Evaluation of iDD/vout cross-correlation
for mixed current-voltage testing of analogue and mixed-signal circuits,” in
Proceedings of the European Design and Test Conference, 1996, pp. 264–268.
9
[27] M. M. Hafed, N. Abaskharoun, and G. W. Roberts, “A 4-ghz effective sample
rate integrated test core for analog and mixed-signal circuits,” IEEE Journal
of Solid-State Circuits, vol. 37, no. 4, pp. 499–514, April 2002. 10, 11
[28] M. Lubaszewski, S. Mir, A. Rueda, and J. L. Huertas, “Concurrent error
detection in analog and mixed-signal integrated circuits,” in Proceedings of
the 38th Midwest Symposium on Circuits and Systems, 1995, pp. 1151–1156.
11
[29] E. F. Cota, M. Negreiros, L. Carro, and M. Lubaszewski, “A new adaptive
analog test and diagnosis system,” IEEE Transactions on Instrumentation
and Measurement, vol. 49, no. 2, pp. 223–227, April 2000. 11
[30] A. Chatterjee, “Concurrent error detection and fault-tolerance in linear ana-
log circuits using continuous checksums,” IEEE Transactions on Very Large
Scale Integration (TVLSI) Systems, vol. 1, no. 2, pp. 138–150, June 1993.
11
[31] M. Mahoney, DSP-Based Testing of Analog and Mixed-Signal Circuits. Los
Alamitos, California: IEEE Computer Society Press, 1987. 11
[32] M. Negreiros, L. Carro, and A. A. Susin, “Testing analog circuits using
spectral analysis,” Microelectronics Journal, vol. 34, no. 10, pp. 937–944,
October 2003. 11
[33] J. Doernberg, H.-S. Lee, and D. A. Hodges, “Full-speed testing of a/d con-
verters,” IEEE Journal of Solid-State Circuits, vol. SC-19, no. 6, December
1984. 12, 48, 73
[34] L. Jin, H. Haggag, R. Geiger, and D. Chen, “Testing of precision dacs using
low-resolution adcs with dithering,” in Proceedings of the International Test
Conference, 2006, pp. 1–10. 12
[35] S. Aouini, “Extending test signal generation using sigma-delta encoding be-
yond the voltage/amplitude domain,” Ph.D. dissertation, McGill University,
Montreal, August 2011. 12
[36] K. Ghosal, T. Anand, V. Chatrvedi, and B. Amrutur, “A power-scalable rf
cmos receiver for 2.4 ghz wireless sensor network applications,” in 12th IEEE
International Conference on Electronics, Circuits and Systems (ICECS) Di-
gest of Technical Papers, 2012, pp. 161–164. 12, 84
[37] B. Razavi, RF Microelectronics. Prentice Hall, 2012. 12, 13, 14
[38] J. D. Kraus, Radio Astronomy. McGraw-Hill, 1966. 13
[39] M. J. Burbidge, A. Lechner, G. Bell, and A. M. D. Richardson, “Motivations
towards bist and dft for embedded charge-pump phase-locked loop frequency
synthesisers,” IEE Proceedings on Circuits, Devices and Systems, vol. 151,
no. 4, pp. 337–348, August 2004.
[40] C. Weinraub, “Analog built-in self-test module,” Patent US7 327 153 B2, 02
05, 2008. [Online]. Available: http://www.google.com/patents/US7327153
13
[41] D. Lupea, U. Pursche, and H.-J. Jentschel, “Rf-bist: loopback spectral signa-
ture analysis,” in Proceedings of the Design, Automation and Test in Europe
Conference and Exhibition, 2003, pp. 478–483. 13
[42] A. Halder, S. Bhattacharya, G. Srinivasan, and A. Chatterjee, “A system-
level alternate test approach for specification test of rf transceivers in loop-
back mode,” in Proceedings of the 18th International Conference on VLSI
Design, 2005, pp. 289–294. 13
[43] M. Negreiros, L. Carro, and A. A. Susin, “Low cost analog testing of rf
signal paths,” in Proceedings of the Design, Automation and Test in Europe
Conference and Exhibition, 2004, pp. 292–297. 14
[44] ——, “Noise figure evaluation using low cost bist,” in Proceedings of the
Design, Automation and Test in Europe Conference, 2005, pp. 158–163. 14
[45] L. Gammaitoni, P. Hanggi, P. Jung, and F. Marchesoni, “Stochastic
resonance,” Rev. Mod. Phys., vol. 70, pp. 223–287, Jan 1998. [Online].
Available: http://link.aps.org/doi/10.1103/RevModPhys.70.223 14
[46] R. B. Staszewski, K. Muhammad, D. Leipold, C.-M. Hung, Y.-C. Ho, J. L.
Wallberg, C. Fernando, K. Maggio, R. Staszewski, T. Jung, J. Koh, S. John,
I. Y. Deng, V. Sarda, O. Moreira-Tamayo, V. Mayega, R. Katz, O. Friedman,
O. E. Eliezer, E. de Obaldia, and P. T. Balsara, “All-digital tx frequency
synthesizer and discrete-time receiver for bluetooth radio in 130-nm cmos,”
IEEE Journal of Solid-State Circuits, vol. 39, no. 12, pp. 2278–2291, 2004.
15
[47] A. Agnes, E. Bonizzoni, P. Malcovati, and F. Maloberti, “A 9.4-enob 1v
3.8µw 100ks/s sar adc with time domain comparator,” in ISSCC Digest of
Technical Papers, 2008, pp. 246–247,610. 15
[48] S.-K. Lee, S.-J. Park, H.-J. Park, and J.-Y. Sim, “A 21 fj/conversion-step
100 ks/s 10-bit adc with a low-noise time-domain comparator for low-power
sensor interface,” IEEE Journal of Solid-State Circuits, vol. 46, no. 3, March
2011. 15
[49] G. Li, Y. M. Tousi, A. Hassibi, and E. Afshari, “Delay-line-based analog-to-
digital converters,” IEEE Transactions on Circuits and Systems-II: Express
Briefs, vol. 56, no. 6, pp. 464–470, June 2009. 15, 20, 22, 32
[50] T. Watanabe, T. Mizuno, and Y. Makino, “An all-digital analog-to-digital
converter with 12µv/lsb using moving-average filtering,” IEEE Transactions
on Circuits and Systems-II: Express Briefs, vol. 38, no. 1, pp. 120–125, April
2004. 15, 20
[51] J. Bergs, Design of a VCO based ADC in a 180 nm CMOS Process for use
in Positron Emission Tomography. Master’s thesis, 2010. 15
[52] M. Z. Straayer and M. H. Perrott, “A 12-bit, 10-mhz bandwidth, continuous-
time σδ adc with a 5-bit, 950-ms/s vco-based quantizer,” IEEE Journal of
Solid-State Circuits, vol. 43, no. 4, pp. 805–814, April 2008. 15, 32
[53] J. Xiao, A. V. Peterchev, J. Zhang, and S. R. Sanders, “A 4-µa quiescent-
current dual-mode digitally controlled buck converter ic for cellular phone
applications,” IEEE Journal of Solid-State Circuits, vol. 39, no. 12, pp. 2342–
2348, December 2004. 15
[54] R. K. Dash and H. Parthasarathy, “Low dropout regulator testing system
and device,” Patent US 8 054 057 B2, 11 08, 2011. [Online]. Available:
http://www.google.com/patents/US8054057 16
[55] R. Vasudevamurthy, P. K. Das, and B. Amrutur, “A mostly-digital analog
scan-out chain for low bandwidth voltage measurement for analog ip test,”
in ISCAS Digest of Papers, 2011, pp. 2035–2038. 16
[56] C. S. Taillefer and G. W. Roberts, “Delta-sigma a/d conversion via time-
mode signal processing,” IEEE Transactions on Circuits and Systems-I: Reg-
ular Papers, vol. 56, no. 9, pp. 1908–1920, September 2009. 16, 54, 79
[57] D. I. Porat, “Review of sub-nanosecond time-interval measurements,” IEEE
Transactions on Nuclear Science, vol. 20, no. 5, pp. 36–51, October 1973. 19
[58] J. Kostamovaara, S. Kurtti, and J.-P. Jansson, “A receiver - tdc chip set
for accurate pulsed time-of-flight laser ranging,” in CDNLive! EMEA 2012
Conference Proceedings, 2012, pp. 1–6. 19
[59] B. K. Swann, B. J. Blalock, L. G. Clonts, D. M. Blinkley, J. M. Rochelle,
E. Breeding, and K. M. Baldwin, “A 100-ps time-resolution cmos time-to-
digital converter for positron emission tomography imaging applications,”
IEEE Journal of Solid-State Circuits, vol. 39, no. 11, pp. 1839–1852, Novem-
ber 2004. 19
[60] V. Kratyuk, P. K. Hanumolu, K. Ok, U.-K. Moon, and K. Mayaram, “A
digital pll with a stochastic time-to-digital converter,” IEEE Transactions
on Circuits and Systems-I: Regular Papers, vol. 56, no. 8, pp. 1612–1621,
August 2009. 20
[61] P. K. Das and B. Amrutur, “An accurate fractional period delay generation
system,” IEEE Transactions on Instrumentation and Measurement, vol. 61,
no. 7, pp. 1924–1932, July 2012. 20
[62] B. Amrutur, P. K. Das, and R. Vasudevamurthy, “0.84ps resolution clock
skew measurement via sub-sampling,” IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, vol. 99, pp. 1–9, November 2010. 20, 32,
44, 45, 50, 74, 78
[63] P. Chen, C.-C. Chen, C.-C. Tsai, and W.-F. Lu, “A time-to-digital converter-
based cmos smart temperature sensor,” IEEE Journal of Solid-State Cir-
cuits, vol. 40, no. 8, pp. 1642–1648, August 2005. 20
[64] D. Fick, N. Liu, Z. Foo, M. Fojtik, J. s. Seo, D. Sylvester, and D. Blaauw, “In
situ delay-slack monitor for high-performance processors using an all-digital
self-calibrating 5ps resolution time-to-digital converter,” in ISSCC Digest of
Technical Papers. IEEE, 2010, pp. 188–189. 20, 24
[65] T. Xia and J.-C. Lo, “Time-to-voltage converter for on-chip jitter measure-
ment,” IEEE Transactions on Instrumentation and Measurement, vol. 52,
no. 6, pp. 1738–1748, December 2003. 21
[66] M. A. Z. Straayer, “Noise shaping techniques for analog and time to digi-
tal converters using voltage controlled oscillators,” Ph.D. dissertation, Mas-
sachusetts Institute of Technology, June 2008. 21, 25, 26, 27, 31
[67] S. Henzler, S. Koeppe, W. Kamp, H. Mulatz, and D. Schmitt-Landsiedel,
“90nm 4.7ps-resolution 0.7-lsb single-shot precision and 19pj-per-shot lo-
cal passive interpolation time-to-digital converter with on-chip characteriza-
tion,” in Digest of Technical Papers. IEEE International Solid-State Circuits
Conference, 2008, pp. 548–549. 23
[68] R. G. Baron, “The vernier time-measuring technique,” Proceedings of the
IRE, pp. 21–30, January 1957. 23
[69] S. Sindia, F. F. Dai, and V. D. Agrawal, “All-digital replica techniques for
managing random mismatch in time-to-digital converters,” in Proceedings of
the 44th IEEE Southeastern Symposium on System Theory, 2012, pp. 1–5.
24
[70] C.-S. Hwang, P. Chen, and H.-W. Tsao, “A high-precision time-to-digital
converter using a two-level conversion scheme,” IEEE Transactions on Nu-
clear Science, vol. 51, no. 4, pp. 1349–1352, August 2004. 24
[71] M. Lee and A. A. Abidi, “A 9 b, 1.25 ps resolution coarse-fine time-to-digital
converter in 90 nm cmos that amplifies a time residue,” IEEE Journal of
Solid-State Circuits, vol. 43, no. 4, pp. 769–777, April 2008. 25
[72] J. Yu, F. F. Dai, and R. C. Jaeger, “A 12-bit vernier ring time-to-digital
converter in 0.13 µm cmos technology,” IEEE Journal of Solid-State Circuits,
vol. 45, no. 4, pp. 830–842, April 2010. 25
[73] R. J. Baker, CMOS Mixed-Signal Circuit Design. IEEE Press, 2002. 25, 54
[74] I. Nissinen, A. Mäntyniemi, and J. Kostamovaara, “A cmos time-to-digital
converter based on a ring oscillator for a laser radar,” in Proceedings of the
29th IEEE European Solid-State Circuits Conference, 2003, pp. 469–472. 26
[75] M. Z. Straayer and M. H. Perrott, “A multi-path gated ring oscillator tdc
with first-order noise shaping,” IEEE Journal of Solid-State Circuits, vol. 44,
no. 4, pp. 1089–1098, April 2009. 27
[76] R. J. V. D. Plassche, “Dynamic element matching for high-accuracy mono-
lithic d/a converters,” IEEE Journal of Solid-State Circuits, vol. 11, no. 6,
pp. 795–800, December 1976. 28
[77] R. Schreier and G. C. Temes, Understanding Delta-Sigma Data Converters.
John Wiley & Sons, 2005. 28
[78] E. G. Friedman, “Clock distribution networks in synchronous digital integrated
circuits,” Proceedings of the IEEE, vol. 89, no. 5, pp. 665–692, May 2001. 29
[79] P. K. Das, B. Amrutur, J. Sridhar, and V. Visvanathan, “On-chip clock
network skew measurement using sub-sampling,” in IEEE ASSCC Digest of
Technical Papers, November 2008, pp. 401–404. 30, 33
[80] T. Hashida and M. Nagata, “An on-chip waveform capturer and application
to diagnosis of power delivery in soc integration,” IEEE Journal of Solid-
State Circuits, vol. 46, no. 4, April 2011. 31
[81] A. S. Morris, Measurement and Instrumentation Principles. Butterworth-
Heinemann, 2001. 37
[82] H. Y. Yang and R. Sarpeshkar, “A time-based energy-efficient analog-to-
digital converter,” IEEE Journal of Solid-State Circuits, vol. 40, no. 8, pp.
1590–1601, August 2005. 37
[83] C. Taillefer, “Analog-to-digital conversion via time-mode signal processing,”
Ph.D. dissertation, McGill University, Montreal, August 2007. 37, 39
[84] S. Song and V. Stojanovic, “A 6.25 gb/s voltage-time conversion based frac-
tionally spaced linear receive equalizer for mesochronous high-speed links,”
IEEE Journal of Solid-State Circuits, vol. 46, no. 5, pp. 1183–1197, May
2011. 37, 39
[85] H. Pekau, A. Yousif, and J. W. Haslett, “A cmos integrated linear voltage-
to-pulse-delay-time converter for time based analog-to-digital converters,” in
ISCAS Digest of Technical Papers. IEEE, 2006, pp. 2373–2376. 37
[86] A. R. Macpherson, K. A. Townsend, and J. W. Haslett, “A 5gs/s voltage-
to-time converter in 90nm cmos,” in 4th European Microwave Integrated
Circuits Conference, 2009, pp. 254–257. 37, 39
[87] X. Inc., Xilinx UG190 Virtex 5 FPGA User Guide. Xilinx, 2006. 63
[88] M. Mansuri and C.-K. K. Yang, “Jitter optimization based on phase-locked
loop design parameters,” IEEE Journal of Solid-State Circuits, vol. 37,
no. 11, pp. 1375–1382, November 2002.
[89] L.-M. Lee, D. Weinlader, and C.-K. K. Yang, “A sub10-ps multiphase
sampling system using redundancy,” IEEE Journal of Solid-State Circuits,
vol. 41, no. 1, pp. 265–273, September 2006. 73
[90] J. A. Bucklew, Introduction to Rare Event Simulation. Springer, 2004. 74
[91] R. G. Lyons, Understanding Digital Signal Processing. Pearson, 2011. 81
[92] S. Dwivedi, “Low power receiver architecture and algorithms for low data
rate wireless personal area networks,” Ph.D. dissertation, Indian Institute of
Science, Bangalore, Karnataka, India, December 2010. 84
[93] K.-J. Lee, J.-J. Chen, and C.-H. Huang, “Using a single input to
support multiple scan chains,” in Proceedings of the 1998 IEEE/ACM
international conference on Computer-aided design, ser. ICCAD ’98.
New York, NY, USA: ACM, 1998, pp. 74–78. [Online]. Available:
http://doi.acm.org/10.1145/288548.288563 92
[94] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic
Processes. Tata McGraw - Hill Education, 2002. 96
[95] R. P. Feynman, Surely You’re Joking Mr. Feynman. Random House, 1992.
97
[96] A. A. Abidi, “Phase noise and jitter in cmos ring oscillators,” IEEE Journal
of Solid-State Circuits, vol. 41, no. 8, pp. 1803–1816, August 2006. 98