
Time-based All-Digital Technique for Analog

Built-in Self Test

A thesis submitted for the degree of

Doctor of Philosophy in the Faculty of Engineering

Submitted by

Rajath Vasudevamurthy

Department of Electrical Communication Engineering

Indian Institute of Science

JULY 2013


http://www.ece.iisc.ernet.in

http://www.iisc.ernet.in

To

The Lotus feet

of the

Goddess of Learning

Acknowledgements

I am grateful for the opportunity of spending the last six years at the wonderful

campus of the Indian Institute of Science, the stay replete with ample opportu-

nities of acquiring knowledge and to all the friends who made the stay enjoyable.

Firstly, I would like to thank my advisor Dr. Bharadwaj Amrutur for being a

constant source of inspiration and support. I would like to thank the chairman of

the ECE department, Prof. P. Vijay Kumar, and all the faculty members for offering

interesting courses and innovative teaching methods. I would like to thank all the

ECE office staff, especially Mr. Srinivas Murthy and Mr. C. T. Nagaraj; and lab

staff Mrs. Subhashini and Mrs. Radhika for handling all the paper work smoothly.

I would like to thank all the staff members of Systems Lab, CeNSE and Micro-Nano

Characterization Facility (MNCF), CeNSE, especially Manikant Singh, Dr. Vi-

jay Mishra, Ms. Ashwini and Dr. Girish Kunte for letting me borrow equipment

enabling me to complete my work on time.

I would like to thank Dr. Rubin Parekhji, Dr. Devanathan, Dr. Shrivaths Ravi

of Texas Instruments, India for giving me the opportunity of learning from prac-

ticing industry experts and also providing me valuable TA (Teaching Assistant)

experience. Special thanks again to Dr. Rubin Parekhji for arranging my pre-

sentation at TI, and to Lakshmanan and Chethan for detailed discussions and

feedback.

I would like to thank all my lab mates - Viveka, Kaushik, Manikandan,

Pushkar, Sagar, Nagaraju, Doney; and members of neighbouring lab - Javed,

Vishal, Immaneul, Zaira and Manas, for the excellent work environment and

timely help when needed. Special thanks to Vishal and Manikandan for or-

ganizing weekly lab talks; and to Vikram, Janaki and Siva for similar efforts

earlier; giving us an occasion to share knowledge and hone presentation skills. I


would like to specially thank BT for patiently resolving Linux issues. I would also

like to thank the alumni of lab - Vikram, Pratap, Janaki, Satyam, Siva, Balaji,

Raghavendra, Syam, Karthik, Nandish. Special thanks to Satyam for removing

my post-submission uncertainty. Special thanks to Syam for help with PCB de-

sign and Yasasvi for soldering help and die micrograph of chapter 7. I would like

to thank the students of our department - Neeraj, Abhinav, Rakshith, Mohan,

Ajay, Harshavardhan, Harsha, Harish, Govind, Haricharan, Aarthi, Prashanth,

Arun, Nischal for valuable, informal and informative discussions. I would like to

thank all the masters students who joined with me - Suraj Sindia, Shashidhar,

Vinay N S, V T Arun, Rakesh Kumar, Virag Shah, Pramod, Girish and many oth-

ers for excellent company and stimulating discussions during coursework. Special

thanks to Suraj Sindia for helping me in writing out chapter 2 and to V T Arun

for the die micrograph of chapter 4.

I would like to thank H. L. Prasad for teaching me Sanskrit, and along with

Santhosh providing me valuable start-up experience. I would like to thank mem-

bers of the Samskr.ta Sangha - G. P. R. Yasasvi, Rudra Murthy, Hari Pavan

Kumar, Siva Rama Krishna, Abhinav, Navin, Suraj K, Chennakeshava, Bhar-

gava, Ankur Raina, Vishvas Acharya, Abhiram B, Vivekanand Mannangi, Hari

Ganesh, Subramanian T R for inspiring Sanskrit learning and organizing vari-

ous programmes. I would like to thank members of Kannada Sangha - Vadi-

raj, Shivaprasad, Baburao Sherikar, Venkatesh, Pradeepa T K, Suryaprakash,

Maruthi, Vinay, Shantanu, Shivanand, Bhaskar, Prasanna, Nagaraj, Deepak

Paramashivan, Shivamogga Rakesh, B. S. Sheshachala, Smt. Nandini, Smt. Udaya-

kumari for organizing various programmes and giving me an opportunity to be

in the organizing committee. Special thanks to Vadiraj for letting me use his

cycle, greatly reducing my commuting time. I would like to thank members of

the Vivekananda Study Circle - Rajasekhar, Prasad, Sushant, Sonal, Abheek,

Goutham, Durga Datta and specially Sri Gokulmuthu, for the wonderful dis-

cussions we had. Thanks are due to members of Telugu Samskr.tika Samiti -

G. P. R. Yasasvi, Hari Pavan Kumar, Sainath B., Sheshadri, Rakesh Kande for

organizing wonderful programmes and being patient enough to clear my doubts

of the language. I would like to thank all the S-block friends - Srinidhi, Bharath,

Shivananju, Naveen, Avinash Achar, Keshav, Sushrutha, Venu, Pradeepa, Prashanth,


Premkumar, Laxman for being with me during tough times. Special thanks to

members of Prasthuta and Praharshini, namely Abhiram Soori, Raghavendra,

Dharmesh, Varun, Krishna for organizing various programmes and stimulating

discussions. Thanks are due to Jaishankar, Souren Misra and Srinath for stimulating

tea-time discussions. I would like to thank the members of Papyrus - BT,

Janaki, Vijayanth, Chetana, Gokul, for giving me an opportunity to help them.

I would like to thank Rohit Vallam, PhD scholar, CSA dept., for coordinating

Prof. Vittal Rao’s lectures and giving me an opportunity to typeset some of the

lectures, honing my LaTeX skills. Thanks to all the Students’ Council chairmen -

Brijesh Bhatt, Sreevalsa, Pramod Kumar Verma and Ganesh for warm company

and kind co-operation.

I would also like to thank Anvesha, Jaidev, Jayanarayan, Naveen, Nandaku-

mar, Siva Rama Krishna all of IIT Bombay for the excellent company at the

26th VLSI Design Conference in Pune. I thank Shivaprasad and Viswanath

for accommodating me for a night each and taking me out in Bombay, and

Prof. Rushikesh Joshi for excellent hospitality and gift of books on our visit to

IIT Bombay for the DIT project review. Special thanks to Pramod for being an

excellent room-mate in Hyderabad while attending the 25th VLSI Design Confer-

ence. Special thanks to BT, Manodipan, Nandish, Vikram for excellent company

at Delhi and trip to National Brain Research Institute (NBRI), Manesar while

attending the 23rd VLSI Design Conference.

I would like to thank Mr. Sravan Kumar Gampa, lawyer at K&S Partners, who

interacted with me and drafted our patent application; and the panel consisting

of Prof. Anurag Kumar, Prof. S. A. Shivashankar and Prof. Navakanta Bhat

for listening to our presentation and approving the patent application. I would

also like to thank Prof. P S Sastry, Prof. Rajesh Sundaresan, Prof. K. J. Vi-

noy and Prof. Navakanta Bhat for quizzing me at my comprehensive exam and

the panel of Prof. Anurag Kumar, Prof. Utpal Mukherji, Prof. P. Vijay Kumar,

Prof. A Chockalingam besides my advisor for quizzing me in the interview for

admission.

Last but not the least, I am deeply indebted to my family for letting me

embark on the uncertain journey of a PhD and fully supporting me throughout.

Abstract

A scheme for Built-in-Self-Test (BIST) of analog signals with minimal area over-

head, for measuring on-chip voltages in an all-digital manner is presented in this

thesis. With technology scaling, the inverter switching times are becoming shorter

thus leading to better resolution of edges in time. This time resolution is observed

to be superior to voltage resolution in the face of reducing supply voltage and

increasing variations as physical dimensions shrink. In this thesis, a new method

of observability of analog signals is proposed, which is digital-friendly and scal-

able to future deep sub-micron (DSM) processes. The low-bandwidth analog test

voltage is captured as the delay between a pair of clock signals. The delay thus

set up is measured digitally in accordance with the desired resolution.

Such an approach lends itself easily to a distributed implementation, where the routing

of analog signals over long paths is minimized. A small piece of circuitry,

called sampling head (SpH) placed near each test voltage, acts as a transducer

converting the test voltage to a delay between a pair of low-frequency clocks. A

probe clock and a sampling clock are routed serially to the sampling heads placed

at the nodes of analog test voltages. This sampling head, present at each test

node, consists of a pair of delay cells and a pair of flip-flops, giving rise to as many

sub-sampled signal pairs as the number of nodes. To measure a certain analog

voltage, the corresponding sub-sampled signal pair is fed to a Delay Measurement

Unit (DMU) to measure the skew between this pair. The concept is validated by

designing a test chip in UMC 130 nm CMOS process. Sub-mV accuracy for static

signals is demonstrated for a measurement time of a few milliseconds and an ENOB of

5.29 is demonstrated for low bandwidth signals in the absence of sample-and-hold

circuitry.

The sampling clock is derived from the probe clock using a PLL and the design


equations are worked out for optimal performance. To validate the concept, the

duty-cycle of the probe clock, whose ON-time is modulated by a sine wave, is

measured by the same DMU. Measurement results from FPGA implementation

confirm 9 bits of resolution.

List of publications from this thesis

Patent

1. Rajath Vasudevamurthy and Bharadwaj Amrutur, “System and Method for

Built-in Self Test (BIST) in an Integrated Circuit,” filed on 28th September

2012, bearing application number 4068/CHE/2012.

Journals

1. Rajath Vasudevamurthy, Pratap Kumar Das and Bharadwaj Amrutur,

“Time-Based All-Digital Technique for Analog Built-in-Self-Test,” IEEE

Transactions on Very Large Scale Integration (VLSI) Systems, in early access.

2. Bharadwaj Amrutur, Pratap Kumar Das and Rajath Vasudevamurthy,

“0.84 ps Resolution Clock Skew Measurement via Subsampling,” IEEE

Transactions on Very Large Scale Integration (VLSI) Systems, vol. 19,

No. 12, Dec. 2011, pp. 2267 - 2275.

Conferences

1. Rajath Vasudevamurthy and Bharadwaj Amrutur, “Multiphase technique

to speed-up delay measurement via sub-sampling,” 26th International Con-

ference on VLSI Design, 5th-10th January 2013, Pune, India. (Nominated

for Best Student Paper award)

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6464607&contentType=Early+Access+Articles&sortType%3Dasc_p_Sequence%26filter%3DAND%28p_IS_Number%3A4359553%29%26rowsPerPage%3D100


2. Rajath Vasudevamurthy, Pratap Kumar Das and Bharadwaj Amrutur, “A

Mostly-Digital Analog Scan-out Chain for Low Bandwidth Voltage Mea-

surement for Analog IP Test,” in proceedings of 44th IEEE International

Symposium on Circuits and Systems (ISCAS), May 2011, pp. 2035 - 2038.

Contents

Acknowledgements

Abstract

List of publications from this thesis

Contents

List of Figures

List of Tables

Acronyms

1 Introduction

1.1 Motivation

1.2 Testing Economics

1.3 Scope of the thesis

1.4 Organization of the thesis

2 State-of-the-Art Analog/RF BIST

2.1 Background

2.2 Increasing observability of analog circuits

2.2.1 Analog Routing

2.2.2 Analog Routing with a Digital Interface

2.2.3 Analog Waveform capturers

2.3 BIST methods for Analog Circuits

2.3.1 Vector based methods

2.3.2 Vectorless methods

2.3.3 BIST in the SoC context

2.3.4 Concurrent test techniques

2.4 Spectral analysis based tests

2.5 Mixed Signal Test

2.5.1 Testing of Data Converters

2.5.2 Clock signal testing

2.6 RF Test

2.6.1 RF Design Considerations

2.6.1.1 Linearity of LNA-mixer

2.6.1.2 Noise Figure of LNA

2.6.2 RF testing approaches

2.6.2.1 Loopback technique

2.6.2.2 Statistical Sampler

2.6.2.3 Noise figure measurement

2.7 Time-based ADC design

2.8 Distributed Architecture

2.9 Conclusions

3 State-of-the-Art Time-to-Digital Converters

3.1 Introduction

3.2 TDC with gate-delay resolution

3.3 TDC with sub-gate-delay resolution

3.4 Oversampling TDC Considerations

3.5 Oscillator-based TDC

3.6 Sub-sampling Approach

3.7 Conclusions

4 Proposed Architecture

4.1 Proposed Solution

4.1.1 Measurement Procedure

4.2 Voltage-to-Delay Conversion

4.2.1 Sample-and-Hold Action

4.2.2 Implementation

4.3 Design Considerations

4.4 Hardware Implementation Details

4.4.1 Sub-sampling Based Delay Measurement Unit (DMU)

4.4.2 Generation and Routing of Clocks

4.5 Measured Results

4.5.1 DC Measurements

4.5.2 AC Measurements

4.5.3 A note on stability of calibration data

4.6 Conclusions

5 Performance Limits and Sampling Clock Generation

5.1 Behavioral model

5.2 Derivation of Design Parameters

5.3 Use of PLL to Generate Sampling Clock

5.4 Experimental Validation

5.4.1 Implementation of duty-cycle measurement unit

5.4.2 Large divide ratios and dithered divide ratio

5.5 Measurement Results

5.6 Conclusions

6 Multiphase technique to speed-up delay measurement via sub-sampling

6.1 Proposed Solution

6.2 Simulation

6.3 Simulation Results

6.3.1 Case of Fixed Input Delay

6.3.2 Case of Slowly Varying Input Delay

6.4 Conclusions

7 Example Application

7.1 Power Scalable Receiver Implementation

7.2 BIST Implementation

7.4 Conclusions

8 Conclusions

8.1 Contributions

8.2 Scope for future work

A Unbiased Delay Estimator

B Noise in Inverter Chain

References

List of Figures

2.1 Problem Definition

2.2 Possible BIST Architecture

2.3 Distributed solution with only digital signals routed over long paths

3.1 Concept of a TDC

3.2 Basic TDC with gate delay resolution

3.3 Basic TDC wrapped back as a ring

3.4 Vernier TDC with sub-gate delay resolution of D − D′ = ∆

3.5 Classical oscillator-based TDC

3.6 Gated Ring Oscillator TDC

3.7 Illustration of a typical Clock Distribution Network with various components contributing to clock skew. Courtesy: Pratap Kumar Das [1]

3.8 Illustration of Sub-sampling approach

3.9 Timing diagram

4.1 Proposed Architecture for Analog BIST

4.2 Schematic Circuit of current-starved Voltage to Delay cell (V2D)

4.3 Block Diagram of Implemented Set-up.

4.4 Illustration of timing diagram and the concept of Sub-sampling.

4.5 Die photo along with snapshot of layout.

4.6 Plots of ‘offset-canceled’ differential delay versus differential voltage for the settings mentioned. Refer Table 4.2 for the settings of #1,2,3.

4.7 DNL and INL plots for the setting of entry 8 in Table 4.3

4.8 Plot of difference delay versus difference voltage after subtracting the delay at zero difference voltage from both curves at two time instants separated by 1.5 hours.

5.1 Behavioral model of voltage quantization employing the DMU

5.2 Plot of OSR versus n, showing the existence of optimal OSR for a given f_in. Effective n = min(n_1, n_2) (5.14). The other parameters of the equations are taken from the settings described in Table 4.3. The dots indicate the results summarized in Table 4.3. The gap between the modeled and measured behavior is because the differential delay generated is a small fraction of the clock time period, and the resolution improves as the ratio of differential delay to time period increases. An explicit way of ensuring it to speed-up measurement and achieved SNR is described in Chapter 6.

5.3 A typical PWM signal - the modulating sine wave is also shown in dotted lines.

5.4 Block diagram of system implemented in Virtex 5 development board

5.5 State Machine of System Implemented in FPGA (Fig. 5.4)

5.6 The sources of probe and sampling clocks for different cases

5.7 Samples of duty-cycle measurement - Quantized values of the input sine wave

5.8 Spectrum of measured duty-cycle samples, showing a clear peak at 10 Hz, the input sine frequency

5.9 Linear curve fitting of SNR versus log(N)

5.10 Gap between theoretically predicted parameters and actual measurement settings. Note that the difference is least at large values of SNR.

5.11 Plot showing theoretical limits on SNR and results (mean SNR) obtained from measurement.

6.1 Block diagram of DMU (Delay Measurement Unit) based on sub-sampling

6.2 Illustration of the sampling clock precessing around the input clock. The circumference represents the time period of input clock while the sector represents the delay to be measured. The asterisk shaped points are the edges of sampling clock.

6.3 Counts corresponding to period and delay. Two-phase and four-phase clocks measure delay twice and four times in a beat period respectively, thus providing more accuracy in the same measurement time.

6.4 Plot of speed-up obtained corresponding to fraction of delay to time-period. Here N is the number of phases of clock available.

6.5 Flowchart for the proposed scheme

6.6 Block diagram implemented in MATLAB Simulink

7.1 Block diagram of power-scalable receiver. Courtesy: Kaushik Ghosal

7.2 Sampling head (SpH)

7.3 Architecture of voltage controlled delay cells

7.4 Block Diagram of the BIST Setup

7.5 Die micrograph of the power scalable receiver implementation with the layout snapshots of BIST blocks inserted


List of Tables

3.1 Comparison of various time measurement architectures

4.1 Comparison of various voltage-to-time conversion techniques

4.2 Summary of Measured Results for DC input

4.3 Summary of Measured Results for Sine wave input

5.1 Example numbers for parameters discussed in (5.27) and (5.26)

5.2 Example numbers for design parameters f_p and N for desired SNR at given frequency f_in

5.3 Summary of Measured Results comparing asynchronous and synchronous cases of sampling clock generation

5.4 Summary of Measured Results comparing asynchronous and dithered (synchronous) cases of sampling clock generation

6.1 Summary of Measured Results for fixed input delay

6.2 Summary of Measured Results for slowly varying delay

Acronyms

AC Alternating Current

ADC Analog-to-Digital Converter

AGC Automatic Gain Control

ATE Automatic Test Equipment

BIST Built-In Self Test

CAD Computer Aided Design

CDN Clock Distribution Network

CDR Clock and Data Recovery

CMFB Common Mode Feed-Back

CMOS Complementary Metal Oxide Semiconductor

DAC Digital-to-Analog Converter

DC Direct Current

DfT Design for Testability

DLL Delay Locked Loop

DMU Delay Measurement Unit

DNL Differential Non-Linearity

DPPM Defective Parts Per Million

DSM Deep Sub-Micron

DSP Digital Signal Processing

DUT Device Under Test


ENOB Effective Number Of Bits

FPGA Field Programmable Gate Array

GPS Global Positioning System

GRO Gated Ring Oscillator

IEEE Institute of Electrical and Electronics Engineers

IF Intermediate Frequency

INL Integral Non-Linearity

IP Intellectual Property

LFSR Linear Feedback Shift Register

LNA Low Noise Amplifier

LO Local Oscillator

LSB Least Significant Bit

NF Noise Figure

OSR Over-Sampling Ratio

PET Positron Emission Tomography

PLL Phase-Locked Loop

PSD Power Spectral Density

PVT Process, Voltage and Temperature (variations)

RISI Received Interference Strength Indicator

RSSI Received Signal Strength Indicator

SAR Successive Approximation Register

SNR Signal to Noise Ratio


SoC System on-chip

SpH Sampling Head

SRAM Static Random Access Memory

TDC Time-to-Digital Converter

UMC United Microelectronics Corporation

V2D Voltage to Delay

VCO Voltage Controlled Oscillator

VGA Variable Gain Amplifier

Chapter 1

Introduction

1.1 Motivation

System on-chip (SoC) designs are becoming increasingly popular owing to the

tremendous integration capability enabled by CMOS technology scaling [2]. With

increasing integration, designers fabricate analog, digital and mixed-signal cir-

cuits on the same chip to reduce packaging and assembly costs. With technology

scaling, the majority of the required signal processing (especially nonlinear) is implemented

digitally and minimal analog circuitry is used, mainly to interface with

the external world [3]. But with shrinking physical dimensions, increasing process

variability poses a design challenge. While CAD (Computer Aided Design)

tools are deployed increasingly for digital designs, analog and mixed-signal cir-

cuits are getting tougher to design due to reduction of available voltage headroom

and increasing variability.

With shrinking physical dimensions, automated production testing of chips

also becomes essential. A lot of work has been done to modify digital designs

in a way that eases testing, motivated by the need for observability

and controllability of critical internal nodes on-chip. Commonly used techniques

include

• insertion of observe and control flops at critical internal nodes and at input

and output ports of memory,

• stitching all the flops into a scan chain,


• testing of memory by writing in various patterns and reading out.

Such automated techniques are well developed for digital designs since the

fault classes are well defined, such as

• stuck-at faults,

• path delay or transition faults,

• coupling faults, etc.

whereas, defining such fault classes for analog circuits is not straight-forward [4].

As a result, one has to test the analog circuit to check if it meets the specifi-

cations desired, leaving little room for automation. Moreover, analog designers

themselves put margins in their designs to ensure robustness. For these reasons,

automated analog and mixed-signal testing has taken a back seat thus far,

but can no longer be so with the increasing popularity of SoC designs and deep

sub-micron (DSM) processes.

1.2 Testing Economics

While a high-quality test procedure is needed, it is also essential to reduce

the cost of testing. With technology scaling, the cost of testing a transistor is

already about a third of the manufacturing cost and is on the increase. The testers

interfacing to the devices are themselves very costly, and every millisecond spent

by the chip on the tester adds to the cost. The research opportunity is to ensure test quality

with minimal impact on test time. More details of test economics are described

in [3].

The total testing cost can be split up into two parts: time spent on the tester

and extra area in silicon to enable testing. Let $C_{Si}$ be the cost of silicon per unit

area and $C_T$ be the cost per unit time spent on the tester. The extra area taken

up on silicon to enable testing must be compensated by the reduction in testing

time and the cost incurred thereof.

Suppose $T_1$ is the time spent on the tester without extra on-chip built-in self-test

(BIST), and that this reduces to $T_2$ as a result of adding on-chip BIST taking up

an area $A_{Si}$. Then [5]

$$T_1 C_T \geq A_{Si} C_{Si} + T_2 C_T \qquad (1.1)$$

In other words, the maximum area of the on-chip BIST feasible without increasing

the total testing cost is given by

$$A_{Si} \leq \frac{C_T}{C_{Si}} (T_1 - T_2) \qquad (1.2)$$
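
As a rough illustration of the bound in (1.2), the following Python sketch computes the largest BIST area that breaks even on test cost. The function name, the units and all numerical values are hypothetical and chosen only to make the arithmetic visible; they are not taken from the thesis or from [5].

```python
def max_bist_area(cost_per_tester_second, cost_per_mm2,
                  time_without_bist_s, time_with_bist_s):
    """Largest BIST area (mm^2) that does not raise total test cost, per (1.2).

    All arguments are illustrative assumptions: C_T in cost units per second,
    C_Si in cost units per mm^2, T_1 and T_2 in seconds.
    """
    saved_time_s = time_without_bist_s - time_with_bist_s
    if saved_time_s <= 0:
        # BIST that does not shorten tester time can never pay for its area.
        return 0.0
    return (cost_per_tester_second / cost_per_mm2) * saved_time_s


# Hypothetical example: tester time drops from 2.0 s to 0.5 s per chip.
area_mm2 = max_bist_area(cost_per_tester_second=0.05, cost_per_mm2=0.10,
                         time_without_bist_s=2.0, time_with_bist_s=0.5)
print(f"Break-even BIST area: {area_mm2:.2f} mm^2")
```

Any BIST block smaller than the returned area lowers the total cost under these assumptions; a larger block costs more in silicon than it saves in tester time.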

1.3 Scope of the thesis

The cost of designing analog and mixed-signal circuits is mainly limited by the

cost of testing [6]. Hence, design of low cost test strategies is critical to the

manufacturing of analog and mixed-signal circuits.

A technique of digitizing on-chip analog test voltages by way of time-based

processing is presented in this thesis. The technique is well tuned for observing

analog voltages internal to the chip, either for production testing or in-use mon-

itoring. This technique allows a distributed architecture, wherein a small piece

of circuitry called sampling head (SpH) is placed at each test node while the

measurement unit common to all is located centrally. Such an approach avoids

analog routing over long paths along with shielding, thereby saving area and mak-

ing the design digital-friendly. Also, since the test voltage is always connected

to the sampling head, this approach avoids insertion of switches into the signal

path which can potentially degrade system performance. The said sampling head

(SpH) locally converts test voltages into time delay on a pair of low frequency

clock signals, which is then routed to the central measurement unit. Since it is

the digital signals which are routed over long paths, the test power estimates can

be calculated easily.

The mentioned approach needs a pair of clocks - a probe clock to carry the

delay information and a sampling clock to ease the measurement of the delay

thus set up. Measured results from a 130 nm test chip implemented in the UMC

CMOS process confirm the ability to resolve voltages of less than a millivolt

and yield an ENOB (effective number of bits) of 5.29 bits in the dynamic range

of 100 mV.
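
The role of the two clocks can be illustrated with a small numerical sketch. It is an idealized model under stated assumptions, not the delay measurement unit implemented in this thesis: the clocks are jitter-free with 50% duty cycle, time is counted in integer picoseconds, the delay is less than half a period, and the sampling period exceeds the probe period by a fixed small step. The function name and default values are hypothetical.

```python
def measure_delay_ps(delay_ps, period_ps=10_000, beat_ratio=10_000):
    """Estimate the skew between a clock and its delayed copy by sub-sampling.

    Assumptions (illustrative only): ideal 50% duty-cycle clocks, integer
    picosecond time base, 0 <= delay_ps < period_ps / 2, and period_ps
    divisible by beat_ratio.
    """
    step_ps = period_ps // beat_ratio        # phase advance per sample
    sample_period_ps = period_ps + step_ps   # slightly slower sampling clock
    n_samples = beat_ratio                   # samples in one full beat period
    half_period_ps = period_ps // 2

    lead_only = 0
    for k in range(n_samples):
        t = k * sample_period_ps
        ref_high = (t % period_ps) < half_period_ps
        delayed_high = ((t - delay_ps) % period_ps) < half_period_ps
        # Samples where only the leading clock is high fall inside the skew.
        if ref_high and not delayed_high:
            lead_only += 1

    # The fraction of such samples approximates delay / period.
    return lead_only * period_ps / n_samples


print(measure_delay_ps(137))                   # 1 ps phase step
print(measure_delay_ps(137, beat_ratio=1000))  # 10 ps phase step
```

In this model the achievable resolution is the phase step, period_ps / beat_ratio, so a finer step costs proportionally more samples and hence more measurement time.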

Performance limits based on the frequencies of probe and sampling clocks

are derived. An FPGA implementation of a time based ADC (Analog-to-Digital

Converter) is described using the said method, wherein the said sampling clock


is derived from the probe clock, with an intention of minimizing the number of

pins needed to talk to the tester. Measured results from the FPGA implementation

confirm an achievable SNR of 55 dB. A technique of speeding up the measurement

using multiple phases of the probe clock is described.
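
The ENOB and SNR figures quoted in this section can be related through the standard relation for an ideal quantizer driven by a full-scale sine wave, SNR ≈ 6.02·ENOB + 1.76 dB. The sketch below applies it in both directions. It assumes this conventional definition holds for the numbers quoted here, which this section does not state explicitly; the function names are hypothetical.

```python
def enob_from_snr_db(snr_db):
    """Effective number of bits for a given SNR (dB), ideal-quantizer relation."""
    return (snr_db - 1.76) / 6.02


def snr_db_from_enob(enob):
    """SNR (dB) corresponding to a given effective number of bits."""
    return 6.02 * enob + 1.76


# Figures quoted in the text: 5.29 bits (test chip) and 55 dB (FPGA).
print(f"ENOB 5.29 bits -> SNR {snr_db_from_enob(5.29):.1f} dB")
print(f"SNR 55 dB -> ENOB {enob_from_snr_db(55.0):.2f} bits")
```

Under this assumption, 55 dB corresponds to a little under 9 bits, in line with the roughly 9 bits of resolution reported for the FPGA implementation in the abstract.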

1.4 Organization of the thesis

A review of the state-of-the-art BIST techniques is presented in Chapter 2. A

review of the state-of-the-art time measurement techniques is presented in Chapter 3.

An overview of the proposed architecture is presented in Chapter 4. Analysis

of the system for performance limits and the FPGA implementation of the sampling

clock generation system are presented in Chapter 5. Chapter 6 describes a technique

to overcome the limitation of dynamic range, Chapter 7 presents an example

of the proposed technique used in a test chip manufactured in the UMC 130 nm

process and Chapter 8 concludes.

Appendix A provides a proof that the delay estimator used yields an unbiased

estimate and Appendix B shows that the SNR of the voltage information prop-

agating in the form of delay through an inverter chain decreases with increasing

chain length.

Chapter 2

State-of-the-Art Analog/RF

BIST

2.1 Background

When the design of analog circuits using discrete components was in vogue,

the problem of analog testing or diagnosis was one of fault localization, i.e., of

identifying the fault site and taking appropriate corrective measures. Such faults

are said to be either

parametric, where the component values are different from nominal values, or

catastrophic, such as short, open, stuck-at or coupling faults leading to failure.

But with the recent pervasiveness of integrated circuits and the popularity of

SoC designs enabled by the tremendous integration, analog testing is increasingly

becoming a necessity for manufacturing high quality devices and reducing

time-to-market [6]. Such analog testing may be broadly categorized into

structural test: fault model based testing

functional test: specification based testing

alternate test: signature/checksum based testing [7]

Historically, analog circuits have been tested functionally against their

specifications, owing to their small number of primary inputs and outputs. Although

fault models are well understood in the context of digital designs, they are still in their

nascent stage in analog circuits. Robust test criteria for analog testing

are derived in [8] using tools from machine learning and feature extraction.

A number of fault models for different analog components are presented

in [9] for structural testing, the most important models being sensitivity (of out-

put parameters to circuit elements) based test [10] and transfer function based

test [11].

Both structural and functional tests are mainly used for production testing

and are concerned predominantly with the design of automatic test equipment

(ATE). But the electronics in an ATE test head lags behind the performance

capacity of the device under test (DUT), thus compromising the information

available at the ATE. In order to circumvent this limitation, test engineers are

increasingly opting for design for testability (DfT) and BIST methodologies [6].

BIST techniques can be used both for production testing, making the required

testers (ATE) simpler and therefore cheaper; and also to monitor circuits in-field

during normal operation. This two-fold application is increasingly making BIST

techniques the most preferred choice for high volume and low test cost strategies.

A representative set of these techniques is reviewed next.

2.2 Increasing observability of analog circuits

Analog designers are increasingly operating transistor circuits in the saturation

region, since the prevalent channel length modulation provides a linear variation

of drain current versus the drain-to-source voltage, simply modeled as a current

source in parallel with a resistor [12]. But this mode of design imposes

the additional requirement of testing for DC biasing faults to ensure that

the transistor operates in the intended mode [3, 5]. With the popularity of IP-

based designs, where certain IP cores are embedded into SoCs, the accessibility to

individual IPs, especially analog circuits, is reduced. As a result, an architecture

that can observe a few test nodes, possibly distributed all over the chip as shown

in Fig. 2.1, is sought.


Figure 2.1: Problem Definition

2.2.1 Analog Routing

A technique of analog routing, wherein voltages and/or currents to be measured

in some internal circuitry are literally “scanned” out to test pins, has been

proposed [13], but it is suitable only for low frequency signals. An approach of using

an op-amp in one of two modes is presented in [14]; where the op-amp operates

as a voltage follower in ‘test’ mode and as an amplifier in the ‘functional’ mode.

It is argued that such an approach avoids the performance degradation caused

by the insertion of many switches. The IEEE 1149.4 standard, which is an extended

version of the boundary scan standard, defines an analog test bus architecture

where several test points inside the system are addressable. Two extra pins - a

primary input and a primary output - are needed for excitation of analog test sig-

nals and measurement of outputs respectively. These signals are routed through

the system by an analog bus of two wires, and all other interconnection points

are connected by analog switches.

But in all these techniques, analog circuits are used to route analog

voltages/currents, and these circuits can themselves distort the signal as it propagates

along the signal path, possibly through coupling and parasitic loads. It is hence

desirable to have testing circuitry that is simpler than the circuitry being tested.

2.2.2 Analog Routing with a Digital Interface

Analog routing with digital interface has also been proposed [15], where an ana-

log voltage is digitized and the bits are scanned out serially through a single

pin. Similarly, one can also scan in digital bits and excite circuits with ana-

log voltages using a digital-to-analog converter (DAC). But with reducing power

supply voltages in the deep sub-micron technology nodes, leading to a reduction of

available voltage headroom, designing conventional ADC architectures for such

applications is becoming increasingly difficult.

A technique of using comparators in either static mode for DC signals or

clocked mode for dynamic signals is presented in [16], which enables the ‘verifi-

cation’ of voltage levels in analog circuits. But the variable reference needed for

the comparator is said to come from bias voltages of other IPs, which may not

scale well to deep sub-micron (DSM) processes due to the increased variability of bias voltages.

2.2.3 Analog Waveform capturers

The authors of [17, 18] suggest displaying analog signal waveforms

by means of sub-sampling. Such a technique works well for periodic

signals; otherwise, periodicity has to be introduced artificially, as done in [18].

Although the method is well-suited for viewing waveforms in a laboratory, it

cannot be used directly for automated testing.
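The sub-sampling principle can be illustrated with a small numerical sketch. The code below is not the implementation of [17, 18]; it assumes a periodic test signal with a period of 100 time units and a sampling period of 101 units (both values, and the names `subsample` and `sine`, are hypothetical). Under these assumptions each successive sample lands one unit later within the waveform, so 100 slow samples trace out one full period.

```python
import math

def subsample(waveform, signal_period, sample_period, n_samples):
    """Sample a periodic waveform(t) once every sample_period time units.

    Integer time units are used so that the phase arithmetic is exact.
    """
    samples = []
    for n in range(n_samples):
        # Only the phase within the signal period matters for a periodic input.
        phase = (n * sample_period) % signal_period
        samples.append(waveform(phase))
    return samples

def sine(t):
    """Hypothetical test signal with a period of 100 time units."""
    return math.sin(2 * math.pi * t / 100)

# Sampling slightly slower than the signal period walks through the waveform.
captured = subsample(sine, signal_period=100, sample_period=101, n_samples=100)
```

The captured record is a time-stretched copy of one period, which is why the approach relies on the signal being periodic.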

2.3 BIST methods for Analog Circuits

The methods proposed to address the need for built-in self-test of circuits can be

classified based on the need for application of test vectors.

2.3.1 Vector based methods

A technique of signature-based (check-sum) fault checking is introduced in [19]

which uncovers faults that affect a circuit’s DC transfer function. While use of

sinusoidal signals with frequency scan enables the characterization of a linear


system, application of a large number of sinusoidal components is a slow process,

especially if the system dynamics are slow [20]. An optimal choice of sinusoidal

stimuli can be made based on the sensitivity analysis presented in [21]. A

technique of applying pseudo-random noise, in which a sequence generated by

an LFSR (Linear Feedback Shift Register) is converted by the existing DAC and

the response is captured by an ADC, is presented in [22]. One can also make use of

signature analysis of the data captured by ADC as reported in [15]. Techniques

of on-chip ramp generation with precisely controlled slopes are presented in [23]

with an intention of testing ADCs.
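As an illustration of the LFSR-based stimulus mentioned above, the following sketch models a small Fibonacci LFSR in software. It is not taken from [22]: the 4-bit width, the seed and the feedback polynomial x^4 + x^3 + 1 are assumptions chosen only because they give a maximal-length sequence of 15 states, each of which could serve as a DAC input code.

```python
def lfsr_sequence(seed=0b1001, taps=(4, 3), width=4):
    """Return one full period of states of a Fibonacci LFSR.

    taps are 1-indexed bit positions (assumed polynomial: x^4 + x^3 + 1).
    """
    if seed == 0:
        raise ValueError("an all-zero seed locks the LFSR")
    state, states = seed, []
    while True:
        states.append(state)
        # XOR the tapped bits to form the feedback bit.
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (tap - 1)) & 1
        # Shift left, insert the feedback at bit 0, keep `width` bits.
        state = ((state << 1) | feedback) & ((1 << width) - 1)
        if state == seed:
            return states

dac_codes = lfsr_sequence()
```

With these assumed parameters the register visits every non-zero 4-bit code exactly once per period, which is the property that makes the sequence noise-like.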

2.3.2 Vectorless methods

A multi-tone testing technique is presented in [24], where the DC outputs of

multiple IPs modulate different tones which are added together and analyzed

by a single digitizer. The oscillation-based technique described in [25] allows

the testing of amplifiers and filters without needing an external stimulus, as the

circuit under test is converted to an oscillator in the test mode. The frequency

and amplitude characteristics of the oscillator deviate in the presence of faults.

But this method necessitates the insertion of switches for reconfigurability, which

could lead to performance degradation.

A method of analyzing supply current by use of current sensors, exploiting

the cross-correlation between supply current and output dynamics is described

in [26]; but calibration is needed for each specific circuit as the consumption is

technology dependent.

Authors in [5] have proposed an architecture, where the DC voltages (of BIST

sensors) of test nodes are all tied together to a common bus and digitized centrally

through a 12-bit ADC, shown in Fig. 2.2. Even AC signals are converted to DC

through an envelope detector circuit, but calibration is required in this case to

map the digitized values to analog amplitudes. In such a case, one has to know

the number of sensor nodes in advance or design for a worst-case scenario to

ensure that the value on the bus settles within a specified time. Also, for the

AC case, calibration puts a lower bound on the testing time required. It would

be beneficial to have an approach where the design of the bus driver can


Figure 2.2: Possible BIST Architecture

be made independent of the number of test nodes, and also eliminate the need

for calibration in testing AC signals, which would reduce the time required for

testing.

2.3.3 BIST in the SoC context

The additional hardware overhead for waveform generators and analyzers for

BIST can be minimized if the test circuitry is used by other components in the

system, as is likely in an SoC context. A scheme that generates arbitrary stimulus

waveforms by passing a high-speed bit stream through an analog low-pass filter,

and digitizes the response using a voltage comparator with a variable reference, is

reported in [27]. Although this is an elegant solution for BIST, the filters needed for the

signal generator and digitizer, and the requirement of high-precision synchronization,

are the overheads of such a scheme.


2.3.4 Concurrent test techniques

Some of the techniques discussed earlier modify the circuit topology during

test [25] or require the input signal to be controlled by the test scheme [27]. But for

deep sub-micron processes, which are very sensitive to noise and radiation effects,

development of test strategies that evaluate the circuit during normal operation,

referred to as on-line or concurrent test, is of interest [20]. A strategy to enable

such a concurrent testing by use of duplicates of the circuit under test is pre-

sented in [28], where a comparison mechanism verifies the similarity between the

programmable reference block and the block under test. But the programmable

reference block may be difficult to obtain for a variety of analog circuits.

A technique of digital replication is proposed in [29], where a filter learns a

model of the fault-free circuit and compares it with the actual circuit under test.

The area overhead of such a scheme is large, since at least two ADCs are needed

to sample the inputs and outputs of the circuit being modeled, and the computational

effort needed to learn the model is also large. A method of concurrent error detection for

linear analog circuits using continuous checksums is proposed in [30], where the

specifying parameters of the circuit under test change due to presence of faults.

Clearly, such a solution is dependent on circuit topology and needs digitizers to

evaluate checksums.

2.4 Spectral analysis based tests

A number of DSP-based techniques of testing Analog and Mixed Signal circuits

are described in [31], especially the Fourier Voltmeter (FVM) where the magni-

tude and phase of any arbitrary spectral component of a periodic waveform can

be measured by a pair of quadrature correlators.
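The quadrature-correlator idea behind the Fourier Voltmeter can be sketched in a few lines. This is a generic illustration rather than the scheme of [31]: it assumes that the sample record covers exactly one period of the waveform, and the function name `fourier_voltmeter` is hypothetical.

```python
import math

def fourier_voltmeter(samples, k):
    """Estimate amplitude and phase of the k-th harmonic (0 < k < n/2).

    `samples` is assumed to cover exactly one period of the waveform,
    so harmonic k completes k cycles within the record.
    """
    n = len(samples)
    # In-phase and quadrature correlations against reference tones.
    i_sum = sum(x * math.cos(2 * math.pi * k * m / n) for m, x in enumerate(samples))
    q_sum = sum(x * math.sin(2 * math.pi * k * m / n) for m, x in enumerate(samples))
    amplitude = 2.0 * math.hypot(i_sum, q_sum) / n
    phase = math.atan2(-q_sum, i_sum)  # phase of A*cos(wt + phase)
    return amplitude, phase

# Hypothetical waveform: third harmonic of interest plus an unrelated fifth harmonic.
record = [0.5 * math.cos(2 * math.pi * 3 * m / 64 + 0.4)
          + 0.2 * math.cos(2 * math.pi * 5 * m / 64) for m in range(64)]
amp3, phase3 = fourier_voltmeter(record, 3)
```

Because the reference tones are orthogonal to the other harmonics over a full period, the fifth harmonic does not disturb the estimate of the third.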

A method in which the power spectral density (PSD) of the output of a system

excited with white noise is used as a signature of the test is described in [32],

and a distance measure between two PSDs is used to decide whether the circuit

under test is faulty or not. A similar technique is used in [20] to measure the

deviation of the quality factor of a biquad filter when it is excited with a low amplitude

stimulus and its output is captured by a low resolution ADC.


2.5 Mixed Signal Test

2.5.1 Testing of Data Converters

A technique of full-speed testing of ADCs based on the histogram of the output in

response to sinusoidal excitation is presented in [33], where gain and measures of

non-linearity can be derived from the said histogram. A technique of precision

DAC testing is described in [34] using low resolution ADCs with dithering. Static

linearity test of better than 1 LSB for a 14-bit DAC is demonstrated using a 6-bit

ADC.

2.5.2 Clock signal testing

A technique of characterizing jitter of clocks is described in [6], where the test

clock is mixed with a reference clock and the jitter or phase noise of the clock is

determined from the statistics of the error signal at the output of the mixer. But

generation of an accurate reference clock is the bottleneck in this scheme. How-

ever, such a method will be useful in characterizing data dependent deterministic

jitter, periodic jitter, bounded and uncorrelated random jitter [35].

2.6 RF Test

A typical RF link is composed of a receiver and a transmitter section. A block

diagram of a power scalable receiver implemented in a 130 nm CMOS UMC

process is shown in Fig. 7.1, the important blocks of which are LNA (Low Noise

Amplifier), mixer, ADC, VGA (Variable Gain Amplifier) and filter along with

a PLL (phase-locked loop) for local oscillator (LO) as needed [36]. Similarly, a

transmitter consists of DAC, mixer, filter, power amplifier and a PLL for LO.

2.6.1 RF Design Considerations

The parameters a designer needs to keep in mind while designing circuits to work

at radio frequencies are explained as a design hexagon in [37]. A few of those

parameters which are critical from a testing perspective are presented here.


2.6.1.1 Linearity of LNA-mixer

Linearity is an important issue in the receiver front-end because strong interfer-

ences which may be present at the antenna can potentially ‘drown’ the signal

of interest. Third order non-linear distortion is particularly important, because

intermodulation products may fall in the desired signal band. A performance pa-

rameter called “third order intercept point” (IP3) is defined to characterize this

specific behaviour [37].

2.6.1.2 Noise Figure of LNA

The SNR of the signal degrades continuously as it passes through the receiver

chain. A parameter called “noise figure” is defined to characterize this degrada-

tion in SNR. According to Friis’ formula [38], the noise figure of the first stage

in the receiver contributes the most to the overall noise figure, and hence the

first stage of a receiver chain is typically a ‘low-noise amplifier’, with the critical

specification being its noise figure (NF).
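The dominance of the first stage follows directly from Friis' formula, F = F1 + (F2 - 1)/G1 + (F3 - 1)/(G1 G2) + ..., with noise factors and gains in linear units. The sketch below evaluates it for a two-stage chain; the stage values (an LNA with 2 dB noise figure and 20 dB gain, a mixer with 10 dB noise figure and 0 dB gain) are hypothetical and chosen only for illustration.

```python
import math

def cascaded_noise_figure_db(stages):
    """Friis' formula for a chain of (noise_figure_dB, gain_dB) stages."""
    total_factor = 0.0   # accumulated noise factor (linear)
    gain_so_far = 1.0    # product of the linear gains of the preceding stages
    for index, (nf_db, gain_db) in enumerate(stages):
        factor = 10 ** (nf_db / 10)
        # The first stage contributes F1; later stages contribute (Fi - 1) / gain.
        total_factor += factor if index == 0 else (factor - 1) / gain_so_far
        gain_so_far *= 10 ** (gain_db / 10)
    return 10 * math.log10(total_factor)

lna = (2.0, 20.0)    # hypothetical LNA: NF = 2 dB, gain = 20 dB
mixer = (10.0, 0.0)  # hypothetical mixer: NF = 10 dB, gain = 0 dB
nf_lna_first = cascaded_noise_figure_db([lna, mixer])
nf_mixer_first = cascaded_noise_figure_db([mixer, lna])
```

With these assumed numbers the chain with the LNA first stays close to the LNA's own noise figure, whereas placing the mixer first yields an overall noise figure above 10 dB.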

2.6.2 RF testing approaches

2.6.2.1 Loopback technique

One of the most important strategies used traditionally is the loopback technique,

which routes the signal from the transmitter back to the receiver without using

a wireless link. As mentioned in Section 1.1, designers try to minimize analog circuitry

in the system and do the signal processing digitally, invariably using ADCs and

DACs to interface to the external world. Such ADCs and DACs, which will

be present on-chip, can be made use of for controlling and observing analog

signals [3, 40].

While it incurs less test overhead, since the blocks are not tested separately,

an issue is the possibility of specific faults getting masked, since the entire RF

path is tested. A loopback strategy suitable for transceivers is proposed in [41]

where a certain spectral signature is fed to the transmitter and the output of the

receiver chain is captured and analyzed for faults. In [42], optimized periodic

bit streams modulated at baseband are used in a loopback configuration and


functional parameters such as gain and IIP3 of the transmitter and receiver are

estimated from the captured receiver response.

2.6.2.2 Statistical Sampler

A technique of evaluating the spectrum at specific test points in the signal path

is described in [43], where the test node is compared with a noise signal using

a single-bit comparator and the single-bit output is processed digitally. This

technique is also demonstrated to compute the IP3 of a mixer using a two-tone

stimulus. Although the sampler itself has low area overhead and can be replicated

at multiple test nodes, implementation of a well-controlled noise generator can be

an issue.

2.6.2.3 Noise figure measurement

As mentioned in Section 2.6.1.2, noise figure is the degradation in SNR between the input

and output of a block. Hence, it can be measured as the difference between input

and output SNR of the DUT. An alternate definition of noise figure is given

by [37], as

$$\text{Noise figure (dB)} = 10 \log_{10}\left(\frac{\text{Total Output Noise Power}}{\text{Output Noise Power due to Input Signal Only}}\right)$$

With a view to measure noise figure, [44] presents a technique of measuring noise

power by comparing it with a low amplitude periodic reference signal using a

single bit comparator, exploiting the phenomenon described in [45]. The range

of amplitudes of the reference signal for acceptable error is also described, in

accordance with [45].

2.7 Time-based ADC design

Quite a few testing approaches that reuse ADCs which may be present in the system

were described in this chapter. The design of ADCs in deep sub-micron processes

is becoming increasingly difficult due to shrinking voltage headroom and

increasing process variations. However, in the case of time-based architectures, time


resolution has improved since the transition time of digital signals has reduced

with technology scaling [46]. The all-digital nature of time-based approaches

lends itself to scaling and suits stringent area and power specifications. A lot of

recent research activity has focused on designing ADCs based on this method-

ology of time based architectures. Although use of such time-based ADCs for

testing applications is not explicitly described in literature, such ADCs can po-

tentially be made use of for testing applications too. Hence, a brief survey of

such time-based ADCs is presented next.

The two main parts of such solutions are (a) a ‘transducer’ that converts voltages

into time pulses or delays, and (b) a circuit that measures these times/delays. A 9.4 ENOB SAR

ADC is demonstrated in [47] where the input and reference voltages are trans-

formed into time pulses and their duration is compared. Authors in [48] have

extended this technique and demonstrated a 10-bit ADC working at a low supply

voltage of 0.6 V, whereas conventional ADC architectures can go up to only 9

bits of resolution for a comparator noise of standard deviation of half LSB.

Authors in [49] have explained the classical voltage-to-time-to-digital and

voltage-to-delay-to-digital architectures, and presented an implementation pro-

viding 4 bit resolution with power consumption of less than 2.4 mW. Authors

in [50] have also presented a similar idea. By implementing moving average fil-

tering, a resolution of 12 µV per LSB at a sampling rate of 10 kHz is achieved.

Another digital approach which is gaining popularity is the VCO (Voltage

Controlled Oscillator) based approach. In this approach, the voltage to be quan-

tized controls the frequency of the VCO, and the count of edges of the VCO out-

put in a certain measurement time is the quantization of the analog voltage [51].

Authors in [52] propose a sigma-delta ADC with the VCO as the quantizer to

overcome the non-linearity of the voltage to frequency transfer curve.
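The edge-counting principle of a VCO-based quantizer can be sketched as follows. This is an idealized model, not the circuit of [51] or [52]: the VCO is assumed to be perfectly linear, and the centre frequency (100 MHz), gain (50 kHz/mV) and measurement window (1 µs) are hypothetical values. Integer units are used so that the arithmetic is exact.

```python
def vco_adc_code(v_in_mv, f0_hz=100_000_000, kvco_hz_per_mv=50_000, t_meas_ns=1_000):
    """Count VCO edges in one measurement window (idealized linear VCO)."""
    freq_hz = f0_hz + kvco_hz_per_mv * v_in_mv
    # Number of complete VCO periods that fit in the measurement window.
    return (freq_hz * t_meas_ns) // 1_000_000_000

code_0mv = vco_adc_code(0)
code_10mv = vco_adc_code(10)
code_100mv = vco_adc_code(100)
```

Under these assumptions one count corresponds to 20 mV, so a 10 mV input is indistinguishable from 0 mV; doubling the measurement window halves the step to 10 mV, which illustrates the trade-off between resolution and conversion time in this approach.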

Authors in [53] propose a ring oscillator ADC where a differential transistor

pair drives two identical ring oscillators as a matched load. The voltage difference

is digitized by the difference between the counters which capture the frequencies

of the two oscillators. They report a bin size of 16 mV over an 80 mV range, with

a current consumption of 37 µA. However, while implementing two ring oscillators,

there is a possibility of injection locking and adequate care has to be taken to

avoid it.


2.8 Distributed Architecture

With an intention of solving the problem of digitizing multiple analog voltages

distributed over the chip as shown in Fig. 2.1, techniques where the front end

of the digitizer is separated and placed at each test node with the remaining

circuitry shared across multiple nodes have been proposed.

For example, a technique of distributed SAR ADC is described in [54], where

the one-bit comparator is located at each test node while the capacitive DAC is

located centrally, to be shared by multiple test nodes. While this eliminates the

need for multiple accurate DACs, the DAC voltage (which is an analog voltage)

needs to be routed to all the test nodes. A technique wherein the routing of

analog signals is minimized and replaced by routing of digital clock signals is

shown in Fig. 2.3 [55].

Referring to Fig. 2.3, a pair of clock signals (forked from a single source) is

daisy-chained through a series of sampling heads placed at each test node, leading

to a virtual “scan-out” architecture [55]. The job of the sampling head is to act as

a transducer, converting the test voltage to a delay difference between the clock

pair passing through it. To accomplish this transduction, the sampling

head present at each test node consists of a pair of voltage-controlled delay

(V2D) cells. The delay of one V2D cell is controlled by the test voltage VAi, while

that of the other is controlled by a fixed voltage Vref. Thus, the voltage difference (VAi − Vref)

is converted into a delay difference between the clock pair. This clock pair is then

centrally processed to extract the delay.
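A behavioural sketch of one sampling head is given below. It assumes, purely for illustration, a linear voltage-to-delay law with a hypothetical offset of 200 ps and a gain of 100 ps/V, and that delay increases with control voltage; the actual V2D characteristic is not specified in this section. The function names are likewise hypothetical.

```python
def v2d_delay_ps(v_ctrl, d0_ps=200.0, k_ps_per_v=100.0):
    """Delay of one V2D cell under an assumed linear voltage-to-delay law."""
    return d0_ps + k_ps_per_v * v_ctrl

def sampling_head_delay_difference_ps(v_test, v_ref):
    """Delay difference imparted on the clock pair by one sampling head."""
    return v2d_delay_ps(v_test) - v2d_delay_ps(v_ref)

def flop_decision(v_test, v_ref):
    """One-bit readout: 1 if the clock through the test-voltage cell arrives later."""
    return 1 if sampling_head_delay_difference_ps(v_test, v_ref) > 0 else 0

difference = sampling_head_delay_difference_ps(0.7, 0.5)
```

In this model the common offset delay cancels, so the delay difference depends only on (VAi − Vref), and its sign alone is enough for a comparator-style one-bit decision.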

It is to be noted that the analog test voltages are not intentionally perturbed

by the measurement process in contrast to the digital scan chain scenario where

the bits at each node change as per the input serially on the rising edges of a

clock. In this case, the design of the delay cells does not depend on the number

of test nodes. The central digital processing to extract the delay can be done in

different ways to suit the application. It could just be a flop used as a comparator

to get one bit information (similar to SAR architecture, where the reference

voltage Vref is set by a DAC based on the flop decision), or a time-to-digital

converter (TDC) implementation to get up to 10 bits at a rate of a few hundred

kHz [56], suited for measuring AC signals, or a statistical converter based on

sub-sampling (described in Chapter 4) to suit low bandwidth signal measurement.

Figure 2.3: Distributed solution with only digital signals routed over long paths

As can be observed from the architecture of Fig. 2.3, the sampling heads

corresponding to test voltages which are not being selected also contribute to the

delay difference between the clock pair, which is not desirable. Such contribution

only adds to the noise without changing the intelligible information. Analysis to

show that the SNR of such information only degrades as the daisy-chain length

increases is presented in Appendix B. To overcome this limitation, it is better

to route the output clock pair of each sampling head directly to the central

measurement unit, instead of daisy-chaining through the other sampling heads.

The details of this modification are presented in Chapter 4.
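The trend can be illustrated with a deliberately simplified model, which is not the analysis of Appendix B: assume that only the selected sampling head contributes the signal delay, while every head in the chain adds independent delay noise of equal RMS value. The function name and the numbers used are hypothetical.

```python
import math

def daisy_chain_snr_db(signal_delay_ps, jitter_rms_ps, n_heads):
    """SNR of the delay signal after n_heads sampling heads.

    Assumes independent, equal-variance delay noise per head, so the
    noise powers of the heads add while the signal power stays fixed.
    """
    noise_power = n_heads * jitter_rms_ps ** 2
    return 10 * math.log10(signal_delay_ps ** 2 / noise_power)

snr_one_head = daisy_chain_snr_db(10.0, 1.0, 1)
snr_four_heads = daisy_chain_snr_db(10.0, 1.0, 4)
```

Under these assumptions the SNR falls by 10 log10(N) dB for a chain of N heads, about 6 dB for four heads, which is consistent with the motivation for routing each clock pair directly to the central unit.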


2.9 Conclusions

This chapter presents a brief overview of the state-of-the-art in Analog BIST.

As was discussed, time-based designs stand to gain from technology scaling,

which leads to faster inverter switching and shorter rise/fall times, thereby

increasing the resolution of time measurements. A brief description of the use of

time-based designs for ADCs is also provided. A similar time-based

technique is adopted for the BIST application, which is described in detail in

Chapter 4.

Chapter 3

State-of-the-Art Time-to-Digital

Converters

3.1 Introduction

Time measurement has played a crucial role in the understanding of nature and

the development of science from the earliest times. Techniques have evolved from analog

clocks based on solar motion (sun-dials), sand flow (hourglass) and water flow

(ghaṭika-yantra) to the recent use of precise cesium resonators (especially in

GPS satellites).

As a subset of time-keeping technology, time-to-digital converters (TDCs) allow

precise digital measurement of the time between two events. Measurement of

short time intervals with good resolution and accuracy had many applications

in experimental physics even prior to its present popular use in integrated circuits,

the important ones being mean lifetime measurements

of excited nuclear states, time-of-flight measurements, particle identification [57],

laser ranging [58] and positron emission tomography (PET) based medical

imaging [59]. The first direct predecessor of a TDC was invented in 1942 for the

measurement of muon¹ lifetimes; it was actually designed as a time-to-voltage converter,

constantly charging a capacitor during the measured time interval.

With advanced CMOS processes beginning to offer extremely compact and

¹ From Greek µ; the muon is an elementary particle similar to the electron, classified as a lepton.

flexible processing power, many applications have begun to replace traditional

analog signal processing blocks with digital signal processing. Such a shift in

architecture places an increased burden on the mixed-signal interface. The TDC

is fast becoming a fundamental element of such an interface, capable of bridging

the gap between continuous-time analog domain and the discrete-time digital

domain (as ADCs [49, 50]), especially in systems that require precise control of

timing signals such as PLLs [60], delay locked loops (DLL) [61] and circuits [62].

In particular, an implementation of temperature sensor using TDC is described

in [63] and a minimally invasive delay slack monitor is presented in [64] that

directly measures the timing margins on critical timing signals, allowing timing

margins due to PVT (process, voltage and temperature) and global variations to

be removed. A technique of measuring skews between leaf nodes of a clock tree by

way of sub-sampling is presented in [62]. In essence, accurate delay measurement

is becoming important for the implementation of important mixed-signal and

sensor blocks in deep-sub-micron processes.

TDC design has an extensive history, and in spite of the

tremendous change in technology from vacuum tubes and ferrite pot-core

transformers to present-day advanced CMOS processes, the fundamental concepts and

techniques for dividing time into measurable intervals have, remarkably, remained

more or less the same. Given this context, it is instructive to think of TDC de-

signs conceptually rather than merely in terms of implementation details. This

helps us shape future efforts in TDC developments, in addition to understanding

current practice, considering the simplicity and technology-independence of these

powerful ideas.

Fig. 3.1 shows the basic conceptual idea of a TDC. An estimate of the time

interval $T_{in}[k] = T_{stop}[k] - T_{start}[k]$ is obtained by counting the number of

intermediate reference pulses/events as $T_{out}[k] = \mathrm{Out}[k] \times T_q$, and an error occurs at

both the beginning and the end of the measurement, given by

$$T_{error}[k] = T_{in}[k] - T_{out}[k] \qquad (3.1)$$


Figure 3.1: Concept of a TDC (reference pulses of period Tq; Start and Stop signals; measured interval Tin[k] and its estimate Tout[k])

or, equivalently, the TDC digital output can be represented as

$$\mathrm{Out}[k] = \frac{T_{in}[k] - T_{error}[k]}{T_q} = \left\lfloor \frac{T_{in}[k]}{T_q} \right\rfloor. \qquad (3.2)$$
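Equations (3.1) and (3.2) can be checked with a short numerical sketch. The function name `tdc_quantize` and the example values (a 137 ps interval measured with Tq = 20 ps) are hypothetical; times are kept as integer picoseconds so that the floor operation is exact.

```python
def tdc_quantize(t_in_ps, t_q_ps):
    """Ideal TDC: return (Out, T_out, T_error) for an input interval."""
    out = t_in_ps // t_q_ps      # Out[k] = floor(T_in[k] / T_q), Eq. (3.2)
    t_out = out * t_q_ps         # T_out[k] = Out[k] * T_q
    t_error = t_in_ps - t_out    # T_error[k] = T_in[k] - T_out[k], Eq. (3.1)
    return out, t_out, t_error

out, t_out, t_error = tdc_quantize(137, 20)
```

For this example the output code is 6, the estimate is 120 ps and the quantization error is 17 ps, which lies between zero and Tq as expected for a floor-type quantizer.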

Since the raw TDC resolution is limited by Tq, a great deal of effort has been

made over the years to reduce it directly through technology advancement and

effectively by use of intelligent design techniques. A technique of precise time

measurement to measure on-chip jitter is presented in [65] where time is first

converted to voltage by a charge pump before digitization. Although this might

be an excellent solution for a particular technology, the architecture is analog-

intensive, not power-efficient and does not leverage the benefits of technology

scaling of modern CMOS which enables fine resolution of digital edges.

In contrast, TDCs designed in digital CMOS processes have benefited greatly

from process scaling, since shrinking gate delays bring an improvement in

resolution and also lead to compact and fully-integrated solutions. While intrin-

sic delay has continued to decrease, the accuracy of delay also needs to improve

for the traditional TDC architectures to benefit from scaling. But with future

CMOS scaling, transistor and parasitic mismatch leading to increasing delay mis-

match is proving to be the bottleneck for many TDC architectures [66]. This

has necessitated the exploration of different architectures, namely oversampling,

oscillator-based and sub-sampling approaches which are described in the rest of


this chapter.

3.2 TDC with gate-delay resolution

A classic TDC architecture comprising a chain of delay elements is shown in

Fig. 3.2 [49]. It works by counting the number of sequential inverter delays

that occur between the rising edges of start and stop, yielding a thermometric

code that is captured into a register at the rising edge of the stop signal; summing

this code yields the digital output. Although this simple architecture offers moderate

performance by using digital gates, increasing the dynamic range leads to a linear

increase in the number of delay elements, thereby increasing power consumption

and decreasing the maximum sampling rate.
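The behaviour of the delay-chain TDC can be mimicked with a simple model. It assumes ideal, perfectly matched delay elements; the gate delay of 20 ps and the chain length of 8 stages are hypothetical values, as is the function name.

```python
def delay_line_tdc(t_in_ps, gate_delay_ps=20, n_stages=8):
    """Return (thermometer_code, output) of an ideal delay-chain TDC."""
    # Tap i (1-indexed) goes high i * gate_delay after the start edge;
    # the stop edge latches every tap into its flip-flop.
    thermometer = [1 if i * gate_delay_ps <= t_in_ps else 0
                   for i in range(1, n_stages + 1)]
    return thermometer, sum(thermometer)

code, value = delay_line_tdc(70)
saturated = delay_line_tdc(500)[1]
```

An interval of 70 ps sets the first three taps, giving an output of 3, while any interval beyond 160 ps saturates the 8-stage chain. Extending the range therefore needs proportionally more stages, which is the linear growth noted above.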

A simple improvement to overcome the limitation of dynamic range is to wrap

the end of the chain back to the beginning through a multiplexer as shown in

Figure 3.2: Basic TDC with gate delay resolution

Figure 3.3: Basic TDC wrapped back as a ring

Fig. 3.3. With larger range, the core of this cyclic TDC does not scale up at all

while the counter size grows logarithmically. Asymmetry in the delay chain due

to the multiplexer degrades the differential non-linearity (DNL) while the integral

non-linearity improves due to the periodic reuse of the elements. Techniques to

match the multiplexer delay with that of the delay element are explored in [67].
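The division of labour between the ring and the counter can be sketched as below. The model is idealized: the multiplexer delay is ignored and all delay elements are assumed identical, and the gate delay (20 ps) and ring length (8 stages) are hypothetical.

```python
def cyclic_tdc(t_in_ps, gate_delay_ps=20, n_stages=8):
    """Return (laps, position, output) of an ideal ring-type TDC."""
    total_steps = t_in_ps // gate_delay_ps
    # The counter records complete trips around the ring; the register
    # records how far the edge has travelled in the current trip.
    laps, position = divmod(total_steps, n_stages)
    return laps, position, laps * n_stages + position

laps, position, output = cyclic_tdc(500)
```

A 500 ps interval corresponds to 25 gate delays, recorded as 3 laps plus 1 stage. Doubling the range adds only one bit to the lap counter while the ring itself is unchanged.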

While the simple cyclic TDC improves the range, the resolution of an inverter

delay is limited by the process and although technology scaling improves the

intrinsic inverter delay, it will only worsen the mismatch of delay elements. As a

result, it is important to explore architectures which enable the inverter delay to

be divided into smaller measurable intervals.

3.3 TDC with sub-gate-delay resolution

Use of the Vernier delay technique [68] for improving the resolution of digital

CMOS TDC is well understood. As shown in Fig. 3.4, the idea is to delay

both the start and stop signals differently with delay chains; one chain with a

delay of D and the other with D′ = D − ∆ per element; so that the effective

resolution becomes Tq = ∆. But the problems of range limitation and sensitivity

to mismatch persist, and elaborate calibration techniques have been

proposed to reduce them. A technique of self-calibration to mitigate local process variations

of a 30-bit vernier chain, which generates delays in steps of 5 ps, is presented

in [64] for monitoring delay slack in-situ for high-performance processors, with an

intention to remove design margins due to PVT variations. An all-digital replica

technique to reduce non-linearity due to process variations is proposed in [69],

where measures of central tendency of multiple identical delay chain outputs are

shown to yield improved accuracy.
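The Vernier principle can be illustrated with an idealized model. It assumes that the start edge travels through the slower chain (delay D per stage) and the stop edge through the faster chain (delay D′ = D − ∆), with perfectly matched elements; the values D = 25 ps, D′ = 20 ps and 16 stages are hypothetical.

```python
def vernier_tdc(t_in_ps, d_ps=25, d_prime_ps=20, n_stages=16):
    """Return the output code of an ideal Vernier delay-line TDC."""
    code = []
    for k in range(1, n_stages + 1):
        start_arrival = k * d_ps                 # start edge, slower chain
        stop_arrival = t_in_ps + k * d_prime_ps  # stop edge, faster chain
        # Flip-flop k captures 1 while the start edge still leads the stop edge.
        code.append(1 if start_arrival <= stop_arrival else 0)
    return sum(code)

result = vernier_tdc(17)
```

The stop edge gains ∆ = 5 ps on the start edge at every stage, so a 17 ps interval yields a code of 3, a resolution four times finer than the 20 ps gate delay. The same model also shows the range limitation: 16 stages cover only 80 ps.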

To reduce the size of a practical Vernier TDC, various two-step architectures

based on a coarse-fine approach have been proposed. One such architecture is

to have a simple delay chain TDC (as in Fig. 3.2) followed by a higher resolution

Vernier TDC [70]. Another two-step technique is to use the metastability

property of digital gates to amplify the time error; an improvement of up to a factor of

Figure 3.4: Vernier TDC with sub-gate delay resolution of D − D′ = ∆


20 is reported [71], but the delay-amplifier needs to be calibrated to accurately

determine its gain. A cyclic architecture of vernier chains similar to the one shown

in Fig. 3.3 is presented in [72], wherein a dynamic range of 12 bits is reported

with a resolution of 8 ps. Although it leads to an increase in the dynamic range,

it comes at the cost of complicated decoding logic and calibration.

Another technique to improve TDC resolution below that of a gate delay is

to interpolate between the input and output signals of a digital gate. This in-

terpolation may be done in an analog manner using a resistive divider; or in

a digital manner by having output signals driven by more than one delay ele-

ment, where the delay element inputs are staggered in time. The operation of

averaging/interpolation creates a new intermediate signal with a transition that

effectively divides the gate delay into two smaller intervals. All of the new signals

must be registered appropriately, which increases the size of the TDC [66].

For each of the TDC architectures described thus far, which are designed to
operate at the Nyquist rate, significant effort is required to reduce the TDC resolution
to less than a gate delay, at the expense of increased complexity, area and/or

mismatch. Although calibration generally improves resolution in the presence

of mismatch, its added complexity increases area and power consumption and

cannot always remove differential non-linearity errors [66].

3.4 Oversampling TDC Considerations

It is well known in the field of data converters that averaging the digital output

improves the SNR, provided the following conditions are satisfied:

• the input signal must be band-limited,

• the input has to be over-sampled corresponding to the number of samples

taken for averaging,

• the quantizer should be linear up to its resolution, and

• the input signal must be busy (and not DC) i.e., it must span at least an

LSB.

The last condition also leads to randomization of the quantization noise, making

it less dependent on the input, which can then be reduced by averaging [73].


From the discussion in the previous section, it is clear that we seek TDC with

not only improved resolution but also robustness to mismatch, and proceed to

examine if oversampling can improve TDC performance. As described above, for

oversampling to improve performance, the quantization error should be indepen-

dent of the input signal and also be uniformly distributed over the quantization

step. In a closed-loop system, there are certain conditions in which the system

itself may provide such a scrambling of the TDC as in a fractional-N ∆Σ PLL.

However, there are many applications which do not provide such a dithering;

leading to a situation similar to the classic dead-zone in analog phase detector

known to cause erratic limit-cycle behavior in integer-N PLL. One solution is to
intentionally modulate the TDC input with a noisy signal in order to randomize
the quantization error, and to subtract that signal from the output later.

Assuming then, that the quantization error is uniformly distributed at all

frequencies, over-sampling a signal of bandwidth WB at an increased sampling

rate of fs makes the effective quantization noise power

$$\sigma^2 = T_q^2 \, \frac{W_B}{f_s}, \tag{3.3}$$

a reduction by a factor of fs/WB ≫ 1. While this fact is impressive, the error

due to mismatch which was earlier negligible, now becomes the bottleneck for

further improvements. A simple oscillator-based TDC is presented next, which

inherently scrambles the quantization error and mitigates mismatch due to reuse

of delay elements; making it well-suited for oversampling applications [66].
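As a rough numerical illustration of (3.3), the sketch below evaluates the in-band quantization noise power and the corresponding SNR gain from plain oversampling. The function names are hypothetical, and the gain is taken relative to sampling at fs = WB, which is an assumption made here only to have a reference point.

```python
import math


def quantization_noise_power(t_q, w_b, f_s):
    """In-band quantization noise power following (3.3): T_q^2 * W_B / f_s."""
    return t_q ** 2 * w_b / f_s


def snr_gain_db(osr):
    """SNR gain in dB for an oversampling ratio osr = f_s / W_B.

    The noise power in (3.3) falls as 1/osr, so the gain is 10*log10(osr).
    """
    return 10.0 * math.log10(osr)


if __name__ == "__main__":
    for osr in (2, 4, 16, 64):
        print(osr, round(snr_gain_db(osr), 2), "dB")
```

Each doubling of the sampling rate buys about 3 dB, so halving the effective Tq (a 6 dB gain) requires four times the sampling rate.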

3.5 Oscillator-based TDC

Fig. 3.5 illustrates the classical ring oscillator-based TDC composed of a ring of

delay elements [74], which shares many similarities with the cyclic TDC. The

oscillator transitions for both topologies are counted for a time window of Tm,

designated by the Enable signal. The key difference between the two is that in

the oscillator-based architecture, the starting phase of the oscillator is random,

which leads to the quantization error being uniformly distributed over the interval

[0, Tq]; whereas the starting phase in the cyclic TDC is always fixed at 0.


Figure 3.5: Classical oscillator-based TDC

In the oscillator-based TDC, since the quantization error is uncorrelated with

Tm, with oversampling both resolution and mismatch will be improved. Mismatch

is also improved in this case since the delay elements which transition during Tm

are chosen uniformly randomly (since the starting phase of the oscillator is

random), and thereby mitigated by oversampling and averaging [66].

Although oversampling with oscillator-based TDC offers improved resolution

and mismatch, it comes at the cost of increased bandwidth and power. To effectively reduce Tq by a factor of 2, the oversampling rate must increase by a factor of 4, since plain oversampling improves the SNR by only 3 dB per doubling of the sampling rate. Moreover, if the measurement time Tm (when Enable is held high) is a small fraction of the oscillator period, then the transitions happening continuously in the oscillator lead to wasted power.

Fig. 3.6 illustrates the concept of a gated ring oscillator (GRO) TDC, which is

similar to cyclic and oscillator-based TDC in the sense that the number of delay

element transitions during a measurement interval Tm are counted. In this GRO-

TDC, the ring oscillator is also gated in addition to the counters with the Enable

signal, thereby preserving the state of the oscillator between measurements [75].

By preserving the oscillator state at the end of the measurement interval Tm[k−1], the quantization error Terror[k−1] from that measurement is also preserved. As a result, the previous quantization error Terror[k − 1] is carried over as the starting phase of the oscillator. This results in first-order noise shaping of the quantization error.

Figure 3.6: Gated Ring Oscillator TDC

In the GRO-TDC, the delay mismatch is also first-order shaped in addition

to the quantization error since the switching delay elements shift over successive

measurement intervals. This is similar to the barrel-shift algorithm for dynamic

element matching, which is well known to reduce DNL in data converters [76]. As

a result, the SNR of GRO-TDC improves by 9 dB for a doubling of the sampling

rate (as is well known in first order noise shaping data converters [77]), which is

a significant improvement compared to the 3 dB from an oscillator-based TDC.
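The carry-over of the quantization residue in a GRO-TDC can be illustrated with a small behavioral sketch. It is not a circuit model: time is kept in integer units, the delay-element resolution `t_q` and the helper names are hypothetical, and mismatch and noise are ignored. The point is only that the residue left at the end of one interval becomes the starting phase of the next, so the accumulated error stays below one Tq no matter how many measurements are averaged.

```python
def gro_counts(intervals, t_q, residue=0):
    """Count delay-element transitions for each gated measurement interval.

    The oscillator phase (residue) is frozen between measurements and
    carried into the next one, as in a gated ring oscillator.
    """
    counts = []
    for t_m in intervals:
        total = residue + t_m          # phase left over plus new enable time
        n = total // t_q               # whole transitions seen by the counter
        residue = total - n * t_q      # preserved quantization error
        counts.append(n)
    return counts, residue


def fixed_phase_counts(intervals, t_q):
    """Reference case: the phase is reset to 0 before every measurement."""
    return [t_m // t_q for t_m in intervals]


if __name__ == "__main__":
    intervals = [23] * 10              # ten measurements of Tm = 2.3 * Tq
    gated, left = gro_counts(intervals, t_q=10)
    print(gated, sum(gated), left)
    print(sum(fixed_phase_counts(intervals, t_q=10)))
```

With the residue carried over, the ten counts sum to 23, matching the total measured time of 230 units exactly, whereas resetting the phase each time loses 3 units per measurement.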

Each of the TDC architectures discussed so far is aimed at obtaining increased resolution, either by calibration or by oversampling coupled with noise shaping. Although they offer impressive solutions for stand-alone TDCs, simpler architectures are desired when the number of test nodes scales to a very large number, such as the number of leaf nodes in a clock distribution network, which is described next.

3.6 Sub-sampling Approach

Fig. 3.7 illustrates a typical clock distribution network (CDN), wherein a clock

signal from a source is routed to many points called leaf nodes, typically flip-flops

or storage elements, through a buffer structure most commonly connected in a

tree fashion. Various buffers are inserted in the distribution path to ensure signal

integrity at the leaf nodes. Technology scaling in accordance with Moore’s law

and innovations in manufacturing have led to smaller and faster transistors on one

hand, but have also increased variability between transistors on the other. Designs

employing faster clocks warrant tighter timing budgets but increased variability

translates as clock skew at the leaf nodes, and is eating into the already tightened

timing margins [1, 78].

As a result, a technique to measure the relative skew between a pair of leaf nodes in-situ will be of great value in studying and characterizing skews, as well as potentially enabling a closed loop design to reduce the skew. Since the quantity of interest is the skew between the clock signals at a pair of leaf nodes, the technique of sub-sampling can be employed, because the information of interest (skew/delay) is on a periodic signal. Such a sub-sampling technique also greatly simplifies the implementation of the delay measurement unit (DMU) that follows, since the components needed (and thereby the area occupied) are independent of the resolution to be achieved [79].

Figure 3.7: Illustration of a typical Clock Distribution Network with various components contributing to clock skew. Courtesy: Pratap Kumar Das [1]

Figure 3.8: Illustration of Sub-sampling approach (flip-flops DFF1 and DFF2 sample clocks C1 and C2 with Samp clk to produce S1 and S2)

Figure 3.9: Timing diagram (signals: Samp clk, C1, C2, S1, S2)

In this approach of sub-sampling, the sampling rate is lower than the Nyquist rate by a factor of about two or more, which means that the full signal cannot be reconstructed. But, if the parameter of interest can be made periodic with a known frequency, this approach can still be used to reconstruct the parameter of interest. For example, an on-chip analog oscilloscope is presented in [17] where a high frequency periodic analog signal is sub-sampled and the samples are digitized. The sampling clock has a frequency which is slightly less than that of the input signal to be displayed, so that the original signal becomes time expanded, thereby significantly reducing the required bandwidth of the ADC that follows. The authors in [18] use the sub-sampling technique to display the bit-lines of SRAM cells by artificially introducing periodicity. An on-chip waveform capturer with 8.8 bits accuracy and 15 ps time accuracy is demonstrated in [80], in which offsets and slopes of the voltage of a digital-to-analog converter are linearly translated to generate and extract timing information. The sub-sampling approach to measure skew/delay between a pair of nodes is briefly described next.

Consider the clock signals C1 and C2 at a pair of leaf nodes in Fig. 3.7, both

of period T and a skew/delay of d between them. Let both the clocks be sampled

by another clock of period T + ∆T as shown in Fig. 3.8. It is important to

note that the skew in sampling clock reaching DFF1 and DFF2 and mismatch

between the two flops directly contribute to errors in the measurement. However,

since such a mismatch is basically a static offset, it can be mitigated by single-point calibration. Fig. 3.9 shows the timing diagram, which clearly demonstrates ‘amplification’ of time period T and skew d by a factor of T/∆T. Furthermore, the amplified delay, by virtue of being synchronous to the sampling clock, can be measured using just an up/down counter, which counts

up when S1 = 1 and S2 = 0, and
down when S1 = 0 and S2 = 1.

Such an estimator is shown to yield an unbiased estimate of the delay in Appendix A. A small caveat is that the falling edge skew must be eliminated in order for the simple up/down counter to yield correct results. This state machine is implemented on an FPGA, as described in Section 4.4.1.
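The up/down counting rule above can be tried out with a small behavioral sketch. This is an idealized model under stated assumptions, not the FPGA state machine: there is no jitter or meta-stability, times are integers (for example picoseconds), the function name is hypothetical, and the falling edges of C1 and C2 are taken as already aligned, which stands in for the falling edge skew elimination mentioned in the text.

```python
def estimate_skew(period, delta_t, skew, n_samples, duty=0.5):
    """Estimate the rising-edge skew of C2 relative to C1 by sub-sampling.

    C1 and C2 have the given period; the sampling clock has period
    (period + delta_t). Assumes |skew| is smaller than the high time.
    """
    high = int(period * duty)                      # C1 is high for [0, high)
    count = 0
    for k in range(n_samples):
        phase = (k * (period + delta_t)) % period  # sampling instant within one period
        s1 = phase < high
        if skew >= 0:
            s2 = skew <= phase < high              # C2 rises late, falls with C1
        else:
            s2 = phase < high or phase >= period + skew  # C2 rises early
        if s1 and not s2:
            count += 1                             # count up
        elif s2 and not s1:
            count -= 1                             # count down
    # count / n_samples estimates skew / period
    return count * period / n_samples


if __name__ == "__main__":
    print(estimate_skew(period=1000, delta_t=1, skew=37, n_samples=1000))
    print(estimate_skew(period=1000, delta_t=1, skew=-20, n_samples=1000))
```

With a 1000 ps period and a 1 ps difference in periods, 1000 samples sweep every phase once, so the counter recovers the 37 ps and −20 ps skews to within the 1 ps step.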

3.7 Conclusions

The various techniques of time measurement discussed in this chapter are sum-

marized in Table 3.1 at a conceptual level, while the exact numbers of earlier

reported works are presented in [66]. The application of time measurement tech-

niques discussed in this chapter, especially the sub-sampling approach, to the

problem of measuring analog voltages for BIST is described in the next chapter.


Table 3.1: Comparison of various time measurement architectures

| Specification | Basic TDC [49] | Vernier TDC [49] | Vernier Ring TDC [49] | VCO-based TDC [52] | DMU based on sub-sampling [62] |
|---|---|---|---|---|---|
| Resolution | Inverter delay, D | Inverter difference delay, ∆ (Fig. 3.4) | Inverter difference delay, ∆ | Inverter delay, D | Limited by measurement time |
| Measurement time (for b bits) | 2^b inverter delays ∼ 2^b D | 2^b D | > 2^b D · 2 (due to circling) | at least 2^b D | 2^b (T + ∆T) |
| Dynamic range available (ratio of maximum to minimum measurable delays) | nD/D = n | n∆/∆ = n | No upper limit on measurable delay | No upper limit on measurable delay | < T/σ, where σ is the standard deviation of the smallest measured delay |
| Mismatch | LSB directly affected | LSB directly affected | Reduced due to reuse of inverter stages | Reduced due to reuse of inverter stages | Appears as offset which can be calibrated out |
| Quantity output for an input delay d | ⌊d/D⌋ | ⌊d/∆⌋ | ⌊d/∆⌋ | ⌊d/D⌋ | d/T |
| Information needed to obtain absolute delay (in ps) | D, average inverter delay | ∆, average inverter difference delay | ∆, inverter difference delay | D, average inverter delay, or time period of VCO | Time period T |
| Components required (for b bits) | 2^b inverter stages and 2^b flops | 2^(b+1) inverter stages and 2^b flops | 2 inverter stages, a flop and a counter at frequency of 1/(2D) | 2^b inverters and a counter at VCO frequency | 2 flops and a b-bit up/down counter |
| Nature of measurement | Single-shot | Single-shot | Single-shot | Single-shot | Periodic (requires test delay to be present on a periodic signal) |


Chapter 4

Proposed Architecture

A solution to the problem defined in Chapter 2 is discussed here. The nov-

elty of the solution is to convert voltage information into time delay information

and measure it all-digitally, which is well suited for a distributed architecture

amenable to multiple test nodes.

4.1 Proposed Solution

The solution proposed here is a modified and extended version of the one shown

in Fig. 2.3. As shown in Fig. 4.1, sampling heads are placed at each test node,
as before, in order to minimize the routing of analog signals over long paths.
However, each sampling head now consists of a pair of flip-flops (DFF) in

addition to a pair of identical delay cells (V2D), as shown in Fig. 4.1(b). A clock

signal is routed serially to all the sampling heads, which is fed to both the delay

cells in the sampling head. The delay of one element of the pair is controlled

by the analog voltage VAi, and that of the other by a reference voltage Vref.

Thus, a voltage difference between the node voltage and reference shows up as

a delay difference in the clocks at the output of the delay cell pair. This pair

of clocks is sampled by a slightly slower sampling clock, giving rise to a pair

of beat frequency signals. We call them the sub-sampled signals and the delay

between them is ‘amplified’ by this process of ‘sub-sampling’ [79]. Hence, there

will be as many pairs of sub-sampled signals as there are test nodes. To measure

a certain test node, the corresponding sub-sampled signal pair has to be fed to the DMU with the appropriate select signal to the multiplexer. Also, the design of a sampling head does not depend upon the number of test nodes desired, giving the advantage of scalability with respect to the number of test nodes.

Figure 4.1: Proposed Architecture for Analog BIST. (a) Proposed Scheme for Analog BIST. Connections are indicated with dots; wires that merely cross are not to be treated as connections (n = log2(N)). (b) Sampling head (SpH).

As can be seen from Fig. 4.1, both the input clock (clk) and sampling clock

(samp clk) are ‘picked-up’ from a single point for each sampling head. Hence,

cross-talk and coupling noise which may affect the clocks do not contribute to ad-

ditional noise in the sampling head circuitry. Also, the output sub-sampled signal

pair of the sampling head are low-frequency signals and the delay between them

is already amplified by the ‘sub-sampling’ process, which makes the sub-sampled

signal pair also immune to cross-talk and coupling noise. This technique of ‘sub-

sampling’ provides bandwidth-resolution trade-off, i.e., measurements requiring

coarser resolutions can be done faster whereas finer resolution measurements need

more time.

It is not mandatory that Vref of Fig. 4.1(a) be the same amongst all IPs. If it is

of interest to measure voltage difference between two voltages in the same IP, Vref

can be replaced by that voltage. Such a situation arises when a programmable

current source is employed to achieve current matching in the presence of vari-

ations. Otherwise, Vref can just be grounded. Ground bounce is not a concern

as the technique presented performs averaging over the measurement time deter-

mined by the settings.

4.1.1 Measurement Procedure

(a) Calibration

Because of the possible non-linearity of the delay cells, they will need to be calibrated a priori. The delay cell pair of sampling head SpHi, corresponding to VAi, is calibrated as follows. MUXcal (Fig. 4.1) is set high so that the calibration voltage is fed to one of the delay cells (instead of the local node voltage), while the other delay cell gets the reference voltage. MUXsel is set to a value so that the multiplexer selects the sub-sampled pair corresponding to SpHi and feeds it to the DMU.

Suppose gi1(·) and gi2(·) are the voltage to delay functions of the two delay

cells of sampling head SpHi respectively, then the delay difference out of this


sampling head, ∆Di is given by

∆Di = gi1(Vcal)− gi2(Vref)

But, we are more interested in ∆Vi = Vcal − Vref. So, we define a function

fi(·) mapping ∆Vi to ∆Di as

∆Di = fi(Vcal − Vref) = fi(∆Vi) (4.1)

The calibration step measures this (potentially non-linear) function fi(·) at a few points, which is used later to correct for non-linearity and bias. Calibration also helps in mitigating mismatches, if any, between the delay cell pairs.

The delay at the input of the DMU is given as:

∆DDMU = ∆Di + ∆Dresidual (4.2)

where ∆Dresidual is the delay difference in the clock pair, accrued in the rest

of the path. This number will be independent of the voltage at node i and

hence can be easily calibrated out. The stability of this calibration over time

is described in 4.8.

(b) Measurement

During the measurement process, MUXcal (Fig. 4.1) is set low. To measure

VAi, the corresponding sub-sampled signal pair is selected by the multiplexer.

Thus, the delay cell pair of sampling head SpHi will create a delay differential

given as:

∆Di = gi1(VAi) − gi2(Vref) (4.3)

:= fi(VAi − Vref) (4.4)

The input delay difference at the DMU is as given in (4.2). From the cali-

bration data, VAi can be inferred directly or by interpolation.
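The last step, inferring VAi from the calibration data, can be sketched as a table look-up with linear interpolation. The sketch assumes that the calibration points are stored as (voltage, measured DMU delay) pairs with delays increasing monotonically, so that the residual delay of (4.2) is already contained in the table; the function name and the sample values are hypothetical.

```python
from bisect import bisect_right


def voltage_from_delay(delay, cal_voltages, cal_delays):
    """Infer a voltage from a measured delay using calibration points.

    cal_delays must be strictly increasing and paired index-by-index with
    cal_voltages. Values between calibration points are linearly interpolated.
    """
    if not cal_delays[0] <= delay <= cal_delays[-1]:
        raise ValueError("delay lies outside the calibrated range")
    # Index of the calibration segment that contains the measured delay
    i = min(bisect_right(cal_delays, delay) - 1, len(cal_delays) - 2)
    d0, d1 = cal_delays[i], cal_delays[i + 1]
    v0, v1 = cal_voltages[i], cal_voltages[i + 1]
    return v0 + (delay - d0) * (v1 - v0) / (d1 - d0)


if __name__ == "__main__":
    volts_mv = [0, 50, 100]        # hypothetical calibration voltages (mV)
    delays_ps = [0, 400, 1000]     # hypothetical measured delays (ps)
    print(voltage_from_delay(700, volts_mv, delays_ps))
```

A denser calibration table reduces the interpolation error wherever fi(·) is strongly non-linear.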


4.2 Voltage-to-Delay Conversion

Voltage-to-delay conversion is the process of sampling an analog voltage and

converting it into an analog time-difference (on a clock), as shown in Fig. 4.1(b).

A simple technique of voltage-to-time conversion is used in a digital voltmeter,

where the time taken by a negative-ramp signal to reach zero from an unknown

input voltage is monitored by a counter, which produces a digital display accord-

ing to the level of the input voltage signal [81]. Use of alternating voltage-to-time

and time-to-voltage conversions in the design of ADCs is shown to provide natu-

ral error correction due to comparator offset and delay, 1/f noise and switching

charge-injection [82].

The various strategies for voltage-to-delay conversion are as follows:

• Direct Voltage-controlled

• Direct Current-controlled

• Current-starved inverter-based

A comparison between direct voltage-control and current-starved inverter-

based strategies is provided in [83], from which it is evident that while direct

voltage-control strategy can yield large sensitivity of delay to voltage and better

linearity over a wider voltage-range, it occupies more area and consumes higher

power and can go up only to medium frequencies. In [84], a differential design

of a voltage-to-time converter is presented, where the charging time of a pair of

capacitors to (differential) control voltage manifests as the output delay. Such

a voltage-to-time converter is applied in receiver equalization to mitigate inter-

symbol interference (ISI) in mesochronous links, and an overall linearity of 4.3

bits (5.5 bits linearity in a dynamic range of 600mV in simulation) is reported.

Authors in [85] have presented a novel linearization scheme for a voltage-to-pulse-

delay-time converter, suitable for analog-to-digital converters, based on current

starved inverters. The linearity error is demonstrated to be less than 2% over

a dynamic range of 200 mV in simulation. A design of programmable voltage-

to-time converter based on current-starved inverters is described in [86], where

programmability of delay is by way of controlling the bias of a MOS capacitor. A linearity of 3.7 bits is demonstrated, with an estimated power consumption of 3.6 mW in the STMicroelectronics 90 nm CMOS process. The architecture used is such that although each voltage-to-time converter delays only the falling edge, a pair of such converters delays both edges so as to keep the duty cycle unchanged at the output. But no particular advantage of delaying both edges is pointed out, and hence dropping the second voltage-to-time converter saves area. A summary of the listed techniques is provided in Table 4.1.

4.2.1 Sample-and-Hold Action

Sample-and-hold circuits are needed in ADCs when the input voltage is a rel-

atively high-frequency signal with respect to the ADC conversion time. But in

V2D converters, the edge of probe clock propagating through the V2D gets de-

layed by an amount dictated by the control voltage. As a result, the control

voltage is sampled at each rising edge of the probe clock. Hence, such a V2D

may act as a sampler provided its conversion rate is sufficiently greater than the

input frequency. The condition when a sample-and-hold circuit may be omitted

is derived as follows. Let the input sinusoid be

$$V_{in} = A \sin(2\pi f_{in} t) \tag{4.5}$$

where A is the amplitude and fin is the signal frequency. The maximum slope of this signal, which occurs at its zero crossing, is given by

$$\left.\frac{dV_{in}}{dt}\right|_{max} = 2\pi A f_{in} \tag{4.6}$$

Let TC be the conversion time of the technique. Then, the input signal should not change by more than one LSB in time TC to avoid errors. This imposes an upper limit on the maximum slope of the signal and hence on the frequency, so that

$$\frac{dV_{in}}{dt} = 2\pi A f_{in} \le \frac{V_{LSB}}{T_C} \tag{4.7}$$

The LSB voltage of an n-bit ADC is given by

$$V_{LSB} = \frac{2A}{2^n - 1} \tag{4.8}$$



Table 4.1: Comparison of various voltage-to-time conversion techniques

Columns, in order: (1) Direct Voltage-Control [83]; (2) Differential Voltage-Control [84]; (3) Current-Starved Inverter [83]; (4) Current-Starved Inverter [86]; (5) Current-Starved Inverter, Fig. 4.2; (6) Pair of Current-Starved Inverters, Fig. 7.3b; (7) Pair of Current-Starved Inverters, Fig. 7.3a.

Voltage-to-Time Conversion Factor (six entries): 7.2 ns/V | −320 ps/V | 0.5 ns/V | 15.3 ns/V | 4.65 ns/V | 5.95 ns/V
Linear Input Voltage Range (six entries): 800 mV (0.6 V → 1.4 V) | 400 mV (0.8 V → 1.2 V) | 100 mV | 100 mV | 600 mV (0.4 V → 1.0 V) | 600 mV (0 → 0.6 V)
Linearity Error: ±0.1% | 4 bits | ±0.15% | 3.7 bits | 5.29 bits | 3.91 bits | 4.23 bits
Maximum Sampling Frequency: 100 MHz | 6.25 GHz | 1.1 GHz | 5 GHz | 500 MHz | 200 MHz | 200 MHz
Power Consumption (four entries): 3.3 mW | 7.5 mW | 136 µW | 3.6 mW
Area (six entries): 480 µm² | 14.5 µm² | 6800 µm² | 69.72 µm² | 294 µm² | 260 µm²
Technology Node: 180 nm | 90 nm | 180 nm | 90 nm | 130 nm | 130 nm | 130 nm
VDD: 1.8 V | 1.2 V | 1.8 V | 1.2 V | 1.2 V | 1.2 V | 1.2 V



Combining and rearranging (4.7) and (4.8), the limit on input sine frequency for

error-free conversion is given by

$$f_{in} \le \frac{1}{\pi (2^n - 1) T_C} \tag{4.9}$$

The inequality (4.9) dictates whether a sample-and-hold is needed or not for the

desired resolution n, conversion time TC and input signal bandwidth fin.
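A quick way to apply (4.9) is to evaluate the frequency limit for a given resolution and conversion time. The sketch below does just that; the function names and the 8-bit, 1 µs example are illustrative assumptions, not values taken from the implemented system.

```python
import math


def max_input_frequency(n_bits, t_c):
    """Upper limit on the input sine frequency from (4.9), in Hz for t_c in s."""
    return 1.0 / (math.pi * (2 ** n_bits - 1) * t_c)


def needs_sample_and_hold(f_in, n_bits, t_c):
    """True if the input is too fast to be converted without a sample-and-hold."""
    return f_in > max_input_frequency(n_bits, t_c)


if __name__ == "__main__":
    # Hypothetical example: 8-bit resolution with a 1 us conversion time
    print(round(max_input_frequency(8, 1e-6), 2), "Hz")
    print(needs_sample_and_hold(10e3, 8, 1e-6))
```

For these assumed numbers the limit is roughly 1.25 kHz, so a 10 kHz input would need a sample-and-hold while a 100 Hz input would not.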

4.2.2 Implementation

Targeting a measurement range of 0 to 100 mV, PMOS controlled current starved

inverters are used, as shown in Fig. 4.2. However, alternative delay cell architec-

tures could be used for other applications as the specifications desire. The area of

the pair of delay cells chosen for this application and taped out is 8.2 µm× 8.4 µm.

As is evident from the circuit, the voltage influences only the delay of the

rising edge, while the delay of the falling edge is uncontrolled. Hence, having

chosen an input clock period of T and duty ratio D, the range of the system is

the value of Vin which gives an absolute delay of Dr, where Dr = D × T , the

maximum rising edge delay possible. For instance, for a clock period of 125 ns,

duty ratio of 0.5, suppose 120 mV gives a delay of 62.5 ns; then the range of the system is 120 mV¹. This range can therefore be increased by increasing the duty ratio or input clock period or both.

4.3 Design Considerations

With the sizing of the delay cell circuitry, the capacitance between the analog

voltage and the input clock is about 2 fF. With a decoupling capacitor of 4 pF,

the kickback will be less than 0.6 mV. For smaller kickback, either the decoupling capacitor has to be increased or cascoding has to be implemented. With a transistor of gain 20 in cascode, the decoupling capacitor can be as small as 0.1 pF.
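The 0.6 mV figure is consistent with a simple capacitive-divider estimate, sketched below. The divider model and the 1.2 V clock swing are assumptions made here for illustration (the swing is taken equal to the 1.2 V supply); the text itself states only the two capacitances and the resulting bound.

```python
def kickback_voltage(v_swing, c_couple, c_decouple):
    """Kickback on the analog node, modeled as a capacitive divider.

    v_swing:    clock swing coupled through c_couple (assumed 1.2 V below)
    c_couple:   capacitance between the analog voltage and the input clock
    c_decouple: decoupling capacitor on the analog node
    """
    return v_swing * c_couple / (c_couple + c_decouple)


if __name__ == "__main__":
    kick = kickback_voltage(v_swing=1.2, c_couple=2e-15, c_decouple=4e-12)
    print(round(kick * 1e3, 4), "mV")
```

Under this model the kickback scales inversely with the decoupling capacitance, which is why a larger capacitor or added isolation from a cascode reduces it.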

Since the delay of every cell is sensitive to supply voltage, variations in supply voltage directly impact the voltage measurement. The power supply will have a distribution profile across the chip. This profile will get calibrated out provided the power supply does not change too much with time.

¹ The actual dynamic range will be slightly less due to some margin for the falling edge skew eliminator algorithm.

Figure 4.2: Schematic Circuit of current-starved Voltage to Delay cell (V2D). (Labels: clkin, clkout, Vin; transistor sizes w = 2 µm, 1 µm and 180 nm, all with l = 1 µm.)


To combat time varying supply voltage, a solution is to make use of delay

cells with a good power supply rejection ratio or to use regulated power supply.

Placing a transistor in cascode also helps to mitigate the effect of power supply

noise on delay.

The measurement of bias voltages is heavily dependent on Vcal for the cali-

bration and interpolation. Hence, the generation of Vcal will have to be accurate

and has to be shielded well so that noise coupled onto Vcal will not impact the

measurement.

The noisy currents of the voltage-to-delay converter contribute to jitter on the clocks. If N delay cells are used in cascade to generate the delay difference, assuming the jitter added by each to be independent of one another, the jitter grows as √N, while the total delay grows as N. Hence, from a noise perspective, it is advantageous to employ more delay stages. But this leads to bandwidth limitation¹ and increased kick-back to the analog test node.

¹ Bandwidth is reduced since the test voltage should not change too much so long as the clocks are propagating through the delay stages.


For a given measurement time, the resolution of measurement of delay dif-

ference that can be obtained is say 6σ (where σ is the standard deviation of

the measured delay values). Then, based on the desired voltage resolution, the

required voltage-to-delay ratio is calculated. Based on the total delay required,

one can choose the number of delay stages needed.

For example, suppose that a resolution of 10 ps can be achieved in a given

measurement time and that the voltage resolution desired is 1 mV. Then the

voltage-to-delay converter has to give a differential delay of 10 ps/mV. Suppose

a single delay cell is designed to provide this delay. On the other hand, if 0.1 mV

resolution is desired with the same set-up and measurement time, ten such delay

cells can be used. But, as stated before, the bandwidth of the signal that can be measured then reduces by a factor of ten, and the kick-back to the test voltage increases.
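The sizing argument in this section can be written down as a short calculation. The sketch below uses hypothetical function names, assumes every cell contributes the same voltage-to-delay sensitivity, and assumes independent jitter per cell as stated earlier, so that delay grows as N while jitter grows as √N.

```python
import math


def stages_needed(delay_resolution_ps, voltage_resolution_mv, cell_sensitivity_ps_per_mv):
    """Number of cascaded delay cells needed for a target voltage resolution."""
    required = delay_resolution_ps / voltage_resolution_mv   # ps/mV needed overall
    # Small tolerance guards against floating-point round-up at exact ratios
    return max(1, math.ceil(required / cell_sensitivity_ps_per_mv - 1e-9))


def delay_to_jitter_improvement(n_stages):
    """Delay grows as N and jitter as sqrt(N), so the ratio improves as sqrt(N)."""
    return math.sqrt(n_stages)


if __name__ == "__main__":
    print(stages_needed(10, 1.0, 10))    # 1 mV target with a 10 ps/mV cell
    print(stages_needed(10, 0.1, 10))    # 0.1 mV target with the same cell
    print(delay_to_jitter_improvement(10))
```

This reproduces the example in the text: one cell suffices for 1 mV, and ten cells are needed for 0.1 mV, at the cost of bandwidth and kick-back.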

4.4 Hardware Implementation Details

Fig. 4.3 shows the overall block diagram of the implemented system to evaluate

the concept. It consists of two components, the voltage-to-delay circuitry and

sampling flops (making up a sampling head) on-chip and DMU implemented in

a Virtex II development board. The setup is geared to measure a single analog

voltage, shown as the pin named ‘Vin’ in the figure. It is used first for calibration

(by feeding known voltages) and then to measure test voltages. Likewise, only the DMU part of the control unit of Fig. 4.1(a) is implemented on the FPGA.

A provision is made to select the output of either a single voltage-to-delay cell or a series of 13 such cells, giving a handle on the voltage-to-delay sensitivity.

4.4.1 Sub-sampling Based Delay Measurement Unit (DMU)

The V2D cells set up a delay between the clocks at nodes Dai and Dbi, which

has to be measured. Since we want to measure this delay digitally, we sample the

clock pair by another clock. One possibility is to use a sampling clock frequency which is much larger (say about 100×) than that of the clock pair. Although it sounds reasonable theoretically, practical measurements showed that the standard deviation of the measured delay, and therefore of voltage, was too high. A sampling frequency which is slightly less or slightly more than that of the clock pair is used here as per the sub-sampling approach, which was introduced in Section 3.6.

Figure 4.3: Block Diagram of Implemented Set-up. (On-chip front-end: input clock of period T, V2D chains with MUXsel, Vin and Vref inputs, and sampling flip-flops clocked at period T+∆T. Delay Measurement Unit on FPGA: falling edge skew eliminator, up/down counter, frequency divider, and look-up table with interpolator.)


Referring to Fig. 4.3, if the frequencies of sampling clock and probe clock pair

are rationally related (which happens when one of them is derived from the other),

then the resolution of measurement is lower bounded by a non-zero quantity, de-

termined by the parameters T and ∆T. This means that, in spite of increasing the

measurement time, the accuracy (standard deviation) of measurement does not

improve beyond a certain value. In such cases, adding some additional jitter onto

the clocks, by way of frequency modulation for instance, improves the resolution.

This phenomenon is also reported in a similar setup [62].
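One way to see why rationally related clocks impose a floor on resolution is to enumerate the phases at which the sampling clock can ever land within a probe clock period. The sketch below assumes T and ∆T are expressed as integers in a common time unit (which is what a rational relation allows) and uses hypothetical function names; it ignores jitter entirely.

```python
from math import gcd


def distinct_sampling_phases(period, delta_t, n_samples):
    """Sorted set of sampling phases, modulo the probe period, over n_samples."""
    return sorted({(k * (period + delta_t)) % period for k in range(n_samples)})


def resolution_floor(period, delta_t):
    """Spacing of the reachable phases: gcd(T, dT) in the chosen time unit."""
    return gcd(period, delta_t)


if __name__ == "__main__":
    phases = distinct_sampling_phases(period=1000, delta_t=8, n_samples=10000)
    print(len(phases), phases[:4], resolution_floor(1000, 8))
```

With T = 1000 and ∆T = 8, only 125 phases spaced 8 units apart are ever visited, however long the measurement runs. Added jitter, or an irrational frequency ratio, spreads the sampling instants between these points.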

On the other hand, when the frequencies of sampling and probe clock pair

are irrationally related, there is no such fundamental limit on resolution. But, it

comes with the rider that this sampling clock has to be generated from a separate

crystal. For this particular application, the tester can be made to feed this second

clock signal. In this setup, the jitter of the clocks comes in as a hindrance and is

mitigated by averaging.

Referring to Fig. 4.4, the input clock pair Dai and Dbi (of period T) is sampled

by an asynchronous1 sampling clock of period2 T+∆T. As a result, the two

outputs will be beat clocks with period given as Tb = (T+∆T)×T/∆T (period

of ideal Qa and Qb in Fig. 4.4). In other words, there is a time “amplification”

by a factor T/∆T. Hence, the skew δ between these two clocks which we intend

to measure will also be amplified by the same amount Tsk = [(T+∆T)×δ/∆T],

shown as skew between ideal Qa and ideal Qb. Due to jitter on the clocks and

meta-stability issues of the samplers, the sampled outputs will be bouncy as

shown in the waveforms for Qa and Qb. Hence, the skew has to be estimated by

averaging the delay difference between the two rising edges of Qa and Qb across

many instances. Since Qa and Qb are synchronous to sampling clock, their delay

difference will always be some multiple of sampling clock period and hence a

simple up/down counter suffices to estimate this difference. Further, the same

counter can be used for averaging across multiple periods of Qa and Qb. Such

an averaging yields an unbiased estimate of the skew as a fraction of the clock

1 Asynchronous here means of different frequency and preferably different phase.

2 Sampling clocks of period T±n∆T (n, integer) can also be used.


Figure 4.4: Illustration of timing diagram and the concept of Sub-sampling. Waveforms shown: Sam Clk, Dai, Dbi (skewed by δ), ideal Qa, ideal Qb, Qa, Qb, CQa and CQb, with T1 = (T+∆T)×T/∆T and Tsk = δ×(T+∆T)/∆T.

period T [62], the proof of which is described in Appendix A for completeness.

Finally, we would like to ignore the falling edge related bounces and hence, we

generate the clean clocks CQa and CQb which are fed to the up/down counter as

shown in Fig. 4.4. The DMU has an equivalent gate count of 414 NAND2 gates.
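The beat-period and skew-amplification relations above can be checked numerically. The following is a minimal sketch, not the DMU implementation: the function names are hypothetical, the clocks are ideal (no jitter or meta-stability), and the example frequencies are taken from entry 6 of Table 4.2.

```python
def beat_parameters(f_probe_hz, f_sample_hz):
    """Return (delta_t, beat_period, gain) for an ideal clock pair.

    T      = 1 / f_probe   : probe clock period
    T + dT = 1 / f_sample  : sampling clock period (slightly longer)
    Tb     = (T + dT) * T / dT : beat period of the sub-sampled outputs
    gain   = T / dT            : time "amplification" factor
    """
    t = 1.0 / f_probe_hz
    delta_t = 1.0 / f_sample_hz - t
    if delta_t <= 0:
        raise ValueError("sampling clock must be slower than the probe clock")
    beat_period = (t + delta_t) * t / delta_t
    gain = t / delta_t
    return delta_t, beat_period, gain


def amplified_skew(skew_s, f_probe_hz, f_sample_hz):
    """Tsk = (T + dT) * delta / dT, the skew seen between ideal Qa and Qb."""
    t = 1.0 / f_probe_hz
    delta_t, _, _ = beat_parameters(f_probe_hz, f_sample_hz)
    return (t + delta_t) * skew_s / delta_t


if __name__ == "__main__":
    dt, tb, gain = beat_parameters(37e6, 36.927e6)
    print(f"dT = {dt * 1e12:.2f} ps, Tb = {tb * 1e6:.2f} us, T/dT = {gain:.1f}")
    print(f"1 ns skew appears as {amplified_skew(1e-9, 37e6, 36.927e6) * 1e9:.1f} ns")
```

Note that Tb reduces to 1/(fp − fs), so the beat frequency is simply the difference of the two clock frequencies.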

4.4.2 Generation and Routing of Clocks

As described in 4.4.1, this approach needs two clocks – a probe clock and a

sampling clock, of slightly different frequencies. Generation and routing of such

close frequency clocks can be challenging due to the phenomenon of injection

locking. In the lab, we used a pair of signal generators independently to provide

these two clocks. In a real-world BIST scenario, the tester can provide one of the

clocks while the other can be generated on-chip. As described in 4.4.1, using a

sub-harmonic of the sampling clock makes practically no difference to this

sub-sampling setup, and hence its use eliminates the issue of injection locking.

Measured results from a system where the sampling clock is generated from a PLL

are described in Chapter 5.

4.5 Measured Results

The die photo of the chip, fabricated in the UMC 130 nm process node, and a snapshot of

the layout are shown in Fig. 4.5 (area of 70 µm × 153 µm).

Figure 4.5: Die photo along with snapshot of layout.

4.5.1 DC Measurements

Due to the jitter and meta-stability issues pointed out in the previous sections, a set

of digital code words corresponds to a single voltage. A set of 32 measurements

is taken to compute the mean and standard deviation of the delay count for

each Vin value. The error-bars shown in the plots of Fig. 4.6 represent a value

of ±σ on each side of the mean. The accuracy of voltage measurement is then

defined as the range of voltage values corresponding to ±3σ. This is obtained

by dividing the standard deviation of delay by the local slope at each point in

Fig. 4.6.
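The accuracy computation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the sample standard deviation is used (the text does not say whether the sample or population form was taken), and the returned value is the ± half-width, so the full ±3σ range is twice that.

```python
import statistics


def voltage_accuracy(delays_ps, local_slope_ps_per_mv, k_sigma=3.0):
    """Map the spread of repeated delay readings to a voltage uncertainty.

    delays_ps             : repeated delay measurements at one Vin (ps)
    local_slope_ps_per_mv : slope of the delay-vs-voltage curve at that Vin
    k_sigma               : number of standard deviations (3 in the text)
    Returns (mean delay, sigma of delay, +/- voltage half-width in mV).
    """
    mean_ps = statistics.fmean(delays_ps)
    sigma_ps = statistics.stdev(delays_ps)  # assumption: sample std. dev.
    half_width_mv = k_sigma * sigma_ps / local_slope_ps_per_mv
    return mean_ps, sigma_ps, half_width_mv


if __name__ == "__main__":
    # Synthetic readings for illustration only; these are not measured data.
    readings = [100.0, 102.0, 98.0, 100.0]
    print(voltage_accuracy(readings, local_slope_ps_per_mv=2.0))
```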


Table 4.2: Summary of Measured Results for DC input

Sl. No.  VDD (V)  fp (MHz)  fs (MHz)  MT (s)  DR (ns)  σmax (ps)  Bin-size (mV)
1        0.75     8.0       7.6       4.42    6.77     32.1       1.50
2        1.2      10.0      9.8       3.42    1.53     10.1       2.05
3*       1.2      8.0       7.9       4.25    11.25    30.5       0.82
4        0.75     8.0       7.99      4.20    7.80     29.0       1.60
5        0.75     8.0       7.99      16.80   7.30     15.0       0.85
6†       1.0      37.0      36.927    0.056   2.97     9.77       2.10
7†       1.1      37.0      36.927    0.056   2.13     9.92       1.03
8†       1.2      37.0      36.927    0.88    1.20     3.87       1.25

* 13 V2D cells
† Measurement time strictly integer number of beat periods
Description: fp - Probe Clock Frequency, fs - Sampling Clock Frequency, MT - Measurement Time, DR - Dynamic Range of delay, σmax - Maximum standard deviation of delay values, Bin-size - Accuracy of measurement

Table 4.2 and Fig. 4.6 present a summary of measured results, which shows

that delay increases with reducing supply and increasing number of V2D cells.

The delay versus voltage plots of Fig. 4.6 are for the settings presented in the

rows 1, 2 and 3 of Table 4.2. An accuracy of about 1 mV can be obtained by

proper choice of parameters and measurement time. One can easily obtain desired

accuracies by suitably altering the measurement time.

Except entry 3, other entries of Table 4.2 correspond to the single V2D case.

Entries 4 and 5 show that the accuracy improves with measurement time. For a

four-fold increase in measurement time, the accuracy improves by a factor of two,

which goes well with the theory. Entries 6 to 8 are taken for a measurement time

strictly integer number of beat periods. The bin-size obtained in lesser duration

is comparable to other entries acquired over a larger measurement time.

The differences in the delay ranges for similar settings are present because

they were measured on different days, and hence conditions like supply voltage

and temperature could be different. However, measurements taken with the

same setting a couple of hours apart (after offset-cancellation1) are stable

enough, as discussed in 4.5.3.

1 Zero difference delay corresponds to zero differential voltage.

Figure 4.6: Plots of ‘offset-canceled’ differential delay ∆D (ns) versus differential voltage ∆V (mV) for the settings mentioned: 1.2 V supply, 13 V2Ds (#3); 1.2 V supply, 1 V2D (#2); 0.75 V supply, 1 V2D (#1). Refer Table 4.2 for the settings of #1, 2, 3.

4.5.2 AC Measurements

A sine wave of frequency 30 Hz is applied to the system directly without sample-

and-hold circuitry. A set of 16,384 data points are collected for the SNR measure-

ments. SNR measurements are determined without calibrating the data points.

The summary of results obtained is shown in Table 4.3.

It is observed that the SNR degrades for both low and high OSR, in accordance

with Fig. 5.2. Entries 7 and 8 of Table 4.3 confirm that the SNR is higher for

a lower bandwidth signal with the rest of the settings being similar. Fig. 4.7 shows

the plot of DNL and INL for setting 8 of Table 4.3, obtained by code density

test [33]. The maximum values of DNL and INL are found to be less than 1 LSB.
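The OSR and ENOB columns of Table 4.3 can be cross-checked with a short sketch. It assumes OSR = fb/(2 fin) with fb = fp − fs (the definition used in Chapter 5), halves the conversion rate for the starred entries that take one measurement over two beat periods, and uses the standard relation ENOB = (SNR − 1.76)/6.02. The function names are hypothetical.

```python
def oversampling_ratio(f_probe_hz, f_sample_hz, f_in_hz, beats_per_measurement=1):
    """OSR = fb / (2 * fin), with fb reduced when several beats form one sample."""
    f_beat = f_probe_hz - f_sample_hz
    conversion_rate = f_beat / beats_per_measurement
    return conversion_rate / (2.0 * f_in_hz)


def enob_from_snr(snr_db):
    """Effective number of bits for a sinusoidal input: (SNR - 1.76) / 6.02."""
    return (snr_db - 1.76) / 6.02


if __name__ == "__main__":
    # Entry 1 of Table 4.3: fin = 30 Hz, fp = 37 MHz, fs = 36.999 MHz
    print(round(oversampling_ratio(37e6, 36.999e6, 30), 2), round(enob_from_snr(9.61), 2))
    # Entry 8 (starred): fin = 10 Hz, fs = 36.927 MHz, two beat periods per measurement
    print(round(oversampling_ratio(37e6, 36.927e6, 10, 2), 2), round(enob_from_snr(33.61), 2))
```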

The theoretical analysis presented in 5.1 shows that a resolution of 12 bits

is possible in this approach (Fig. 5.2) for the used settings, but the measured

results show a maximum of 5.29 bits. The limitation in the resolution is because


Table 4.3: Summary of Measured Results for Sine wave input

Sl. No.  fin (Hz)  fp (MHz)  fs (MHz)  OSR     SNR (dB)  ENOB (bits)
1        30        37        36.999    16.67   9.61      1.30
2        30        37        36.9975   41.67   17.57     2.63
3*       30        37        36.995    41.67   17.37     2.59
4*       30        37        36.9925   62.50   19.54     2.95
5*       30        37        36.99     83.33   16.51     2.45
6*       30        37        36.98     166.67  9.01      1.20
7        10        37        36.999    50.0    17.76     2.63
8*       10        37        36.927    1825    33.61     5.29

* One measurement over two beat periods.
Description: fin - Frequency of input sine wave, fp - Probe clock Frequency, fs - Sampling Clock Frequency, OSR - Over-sampling Ratio, SNR - Signal to Noise Ratio, ENOB - Effective Number Of Bits.

Figure 4.7: DNL and INL plots (in LSB, versus digital codeword) for the setting of entry 8 in Table 4.3

the delay in this case is a small fraction of the time period of probe clock, i.e., the

dynamic range is limited. So, in order to get a better resolution, a probe clock of

higher frequency has to be used. But the poor rise time of the presently designed

V2D converter limits the frequency of the clock that can be used. In such cases,

the technique described in Chapter 6 can be used to overcome this issue of limited


dynamic range. Also, a better design of voltage-to-delay converter is used for an

example application, which is described in Chapter 7.

Another point to note is that since the oversampling rate is closely related

to the difference of the probe and sampling clock frequencies, both these clock

frequencies need to be accurately controlled. This may be achieved by deriving

this sampling clock from the probe clock and mimicking asynchrony by artificially

introducing frequency modulation [62]. Another approach could be to use a PLL

structure to precisely control the sampling frequency, the details of which are

presented in Chapter 5.

4.5.3 A note on stability of calibration data

The delay of the V2D is a function of temperature and voltage, and as a result is

vulnerable to change. Hence, the calibration data collected may not be accurate

a few seconds later, leading to large errors. But, a simple way of avoiding this

is to do a one-point calibration, setting Vin = Vref. In short, it can be referred

to as ‘offset-cancellation’, also pointed out in Fig. 4.6. Fig. 4.8 shows the re-

sults obtained for the same setup, at two different times separated by 1.5 hours.

Although the actual values of differential delays for the two measurements were

different, the curves almost coincided once the offset was subtracted.

4.6 Conclusions

We propose a scheme for analog BIST which is well-suited to measure voltages

distributed all over a chip by locally converting the test voltage into a delay

between a pair of sub-sampled signals. This is achieved by a sampling head

placed at each test node; each sampling head consisting of a pair of voltage

controlled delay cells and a pair of flip-flops. This approach reduces the routing

of analog signals over long paths to the measurement unit, hence saving chip area

due to absence of shielding lines. Instead, a probe clock and a sampling clock of

slightly different frequency need to be routed serially to each sampling head, and

a pair of low frequency sub-sampled signals from each sampling head needs to be

routed to the central DMU. This chapter gave a comparison of different voltage-

to-delay converters. The implementation details along with measured results were


Figure 4.8: Plot of difference delay ∆D (ns) versus difference voltage ∆V (mV) after subtracting the delay at zero difference voltage from both curves (‘Earlier’ and ‘Later’) at two time instants separated by 1.5 hours.

presented. A simple single point calibration is also suggested, which supplements

an extensive one-time calibration, to overcome the variability of delay with time.

Generation of sampling clock from the probe clock along with performance limits

and measured results are presented in the next chapter.

Chapter 5

Performance Limits and

Sampling Clock Generation

5.1 Behavioral model

The difference delay set up on the sub-sampled signals by a sampling head is

measured by the DMU, and is mapped to a voltage based on the calibration data of

difference voltage versus difference delay, as shown in Fig. 4.6.

The choice of a measurement time of an integer number of beat periods1 leads

to estimates of reduced variance. This is because one time period of probe clock is

‘amplified’ to one beat period. Hence, across beat periods, the distribution of skew

in a beat period looks identical nominally, except for jitter in the clocks causing

minor differences. This is confirmed by measurement results also in Table 4.2.

Hence, the DMU can be modeled as a quantizer with a conversion time of one

beat period. The resolution of the aforesaid quantizer depends on the duration

of the beat period. If the desired measurement time is a multiple, say M, of the beat

period, the averaging can be modeled as a moving-average (MA) filter or an ‘integrate

and dump’ filter following the quantizer. More compactly, it can be modeled as

an MA filter followed by a decimator. If moving average filtering alone is desired,

then the decimator down-samples at a rate of 1 (no down-sampling), whereas if

‘integrate and dump’ is desired, the decimator down-samples at a rate of M. This

1Beat period is the time-period of sub-sampled signal.


is shown in Fig. 5.1, where an Anti-Aliasing filter (AAF) is shown to limit the

input bandwidth, a switch is shown to model the sampling at the conversion rate,

and quantization is modeled as an additive noise (εq) followed by averaging. The

anti-aliasing filter is not explicitly implemented in this work, but shown in the

model for the purpose of analysis.

Figure 5.1: Behavioral model of voltage quantization employing the DMU

Let T and T+∆T be the time periods of probe clock and sampling clock re-

spectively. Then, the duration of the beat period is given by Tb = (T+∆T)×T/∆T,

which will consist of T/∆T sampling clock periods. Hence, the resolution of the

quantizer in measuring time, in principle, is ∆T/T. Therefore, the maximum

number of bits of this quantizer, bmax and conversion time Tb are given by:

$$ b_{max} = \log_2\left(\frac{T}{\Delta T}\right), \qquad T_b = (T + \Delta T) \times \frac{T}{\Delta T} \tag{5.1} $$

Practically, if dmax is the maximum delay given by the voltage-to-delay converter

corresponding to the maximum voltage input, then the number of bits b available

is given by:

$$ b = \log_2\left(\frac{d_{max}}{\Delta T}\right) := \log_2\left(\frac{cT}{\Delta T}\right) \tag{5.2} $$

where c is the maximum delay as a fraction of the time period T.

If the total measurement time is M beat periods, the variance reduces by

a factor of M, and hence the SNR increases by about 3 log2(M) dB (using

10 log10(x) ≈ 3 log2(x)). Hence, doubling the measurement time improves the

SNR by about 3 dB. The bandwidth of measurement is 1/(2Tb). Hence, clearly

there is a bandwidth-accuracy trade-off in the choice of parameters T and ∆T.

The overall SNR of the complete system for a sinusoidal input of amplitude

Vinmax is given by:

SNR = 10 log10

(A2

2· 1

12· c2 (∆T )2

T 2·M)

(5.3)

where A=kVinmax with k being the voltage to delay gain (assuming voltage to

delay conversion being linear). But, this assumes that sample-and-hold circuitry

is available which can hold the sampled values for a period of the measurement

time (tm), given by

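$$ t_m = M \cdot (T + \Delta T) \times \frac{T}{\Delta T} \tag{5.4} $$

Let fp and fs be the frequencies of the probe and sampling clocks respectively.

Then, the frequency of the beat signal is given by fb = fp − fs. The above

equations can now be re-written as:

$$ \frac{T}{\Delta T} = \frac{f_s}{f_b} \tag{5.5} $$

$$ \Delta T = \frac{f_b}{f_p f_s} \tag{5.6} $$

$$ b = \log_2\left(\frac{c f_s}{f_b}\right) \tag{5.7} $$

$$ \text{Conversion rate} = f_b \tag{5.8} $$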

In order to simplify the design in this case of distributed voltage measurement,

the use of a sample-and-hold at the voltage nodes is avoided. This renders data

conversion at Nyquist rate impossible, but the methodology of over-sample–and–

average can be made use of, similar to a delta modulator [56, 73]. Basically,

the sampling rate should be high enough so that the signal of interest does not

change beyond an LSB within the conversion time. Assuming an input sinusoid

Vin = A sin(2π fin t), an LSB a and conversion rate fb, the maximum change in

Vin within an interval of 1/fb should not exceed a, as discussed in 4.2.1, i.e.,

$$ \frac{2\pi A f_{in}}{f_b} \le a \tag{5.9} $$

Define OSR = fb/(2fin) and 2A/a = 2^{n1} for n1 effective bits of conversion. Then,

OSR sets a limit on the number of effective bits (n1) that can be obtained, as

given by

$$ n_1 \le \log_2\left(\frac{\mathrm{OSR}}{\pi}\right) + 1 \tag{5.10} $$

$$ \mathrm{SNR}_1 \le 6.02 \log_2\left(\frac{\mathrm{OSR}}{\pi}\right) + 7.78 $$

$$ \mathrm{SNR}_1 \le 20 \log_{10}\left(\frac{\sqrt{6}\,\mathrm{OSR}}{\pi}\right) \tag{5.11} $$

(5.11) is the limit due to oversampling, which says that the SNR that

can be achieved increases by about 6 dB for every doubling of OSR (20 dB per

decade/6 dB per octave).

A second limit on the number of effective bits (n2) available is set by the

number of bits available at each conversion, which reduces with increased over-

sampling. Using (5.7),

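$$ n_2 \le b + \frac{1}{2}\log_2(\mathrm{OSR}) = \log_2\left(\frac{c f_s}{2 f_{in}\sqrt{\mathrm{OSR}}}\right) \tag{5.12} $$

$$ \mathrm{SNR}_2 \le 6.02 \log_2\left(\frac{c f_s}{2 f_{in}\sqrt{\mathrm{OSR}}}\right) + 1.76 \le 20 \log_{10}\left(\frac{\sqrt{3}\, c f_s}{\sqrt{8\,\mathrm{OSR}}\, f_{in}}\right) \tag{5.13} $$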

(5.13) gives the limit due to quantization, which says that achievable SNR

decreases by about 3 dB for every doubling of OSR (-10 dB per decade/-3 dB

per octave). The actual number of bits (n) that one can obtain is given by

n = min(n1, n2) (5.14)

SNR = min(SNR1, SNR2) (5.15)

Fig. 5.2 shows the plot of (5.10) and (5.12), which indicates the existence of

an optimal OSR. The plot also shows that the optimal OSR reduces for a higher

input bandwidth and also yields fewer effective bits. It is important

to note that (5.10) and (5.12) are quite loose bounds since the non-idealities of


jitter are not modeled to keep the analysis simple. The actual measured results

from Table 4.3 are also shown in Fig. 5.2.
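The two bounds (5.10) and (5.12) and their crossing point can be explored with the sketch below. It is illustrative only: the function names are hypothetical, the closed form for the crossing OSR is obtained here by setting n1 = n2 (giving OSR^(3/2) = π c fs / (4 fin)), and the value of c used in the example is an assumption rather than a measured setting.

```python
import math


def n1_limit(osr):
    """Oversampling (slope) limit on effective bits, as in (5.10)."""
    return math.log2(osr / math.pi) + 1.0


def n2_limit(osr, c, fs_over_fin):
    """Quantization limit on effective bits, as in (5.12)."""
    return math.log2(c * fs_over_fin / (2.0 * math.sqrt(osr)))


def effective_bits(osr, c, fs_over_fin):
    """n = min(n1, n2), as in (5.14)."""
    return min(n1_limit(osr), n2_limit(osr, c, fs_over_fin))


def optimal_osr(c, fs_over_fin):
    """OSR at which n1 = n2, i.e. OSR**1.5 = pi * c * (fs/fin) / 4."""
    return (math.pi * c * fs_over_fin / 4.0) ** (2.0 / 3.0)


if __name__ == "__main__":
    c = 0.1                 # hypothetical delay range as a fraction of T
    ratio = 37e6 / 10.0     # fs/fin for a 37 MHz clock and a 10 Hz input
    osr = optimal_osr(c, ratio)
    print(f"optimal OSR ~ {osr:.0f}, n ~ {effective_bits(osr, c, ratio):.2f} bits")
```

Below the crossing the slope limit n1 dominates (rising 1 bit per octave of OSR), and above it the quantization limit n2 dominates (falling 0.5 bit per octave).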

5.2 Derivation of Design Parameters

There are two independent parameters, namely the pair of T and ∆T , or alter-

natively fp and fs - the probe and sampling clock frequencies. The parameters

need to be chosen so that the SNR obtainable is maximized.

As mentioned earlier, the minimum measurement time is taken to be one beat

period Tb. Hence, the over-sampling ratio (OSR) for a sine wave of time-period

Figure 5.2: Plot of OSR versus n (curves n1 and n2 against log2(OSR) for fs/fin = 10^7 and 10^6, with measured points for fin = 10 Hz and fin = 30 Hz), showing the existence of optimal OSR for a given fin. Effective n = min(n1, n2) (5.14). The other parameters of the equations are taken from the settings described in Table 4.3. The dots indicate the results summarized in Table 4.3. The gap between the modeled and measured behavior is because the differential delay generated is a small fraction of the clock time period, and the resolution improves as the ratio of differential delay to time period increases. An explicit way of ensuring this, to speed up measurement and improve the achieved SNR, is described in Chapter 6.


Tin is given by

$$ T_b = \frac{T (T + \Delta T)}{\Delta T} $$

$$ \mathrm{OSR} = \frac{T_{in}}{2 T_b} = \frac{T_{in}\,\Delta T}{2 T (T + \Delta T)} \tag{5.16} $$

Replacing OSR in (5.11) and (5.13) using (5.16), we have

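$$ \mathrm{SNR}_1 = 20 \log_{10}\left(\frac{\sqrt{6}\,\mathrm{OSR}}{\pi}\right) = 20 \log_{10}\left(\frac{\sqrt{6}\,\Delta T\, T_{in}}{2\pi\, T (T + \Delta T)}\right) \tag{5.17} $$

$$ \mathrm{SNR}_2 = 10 \log_{10}\left(\frac{3 c^2 T_{in} T}{16\, \Delta T (T + \Delta T)}\right) \tag{5.18} $$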

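For optimal SNR, equating (5.17) and (5.18), we have

$$ \frac{6 (\Delta T)^2 T_{in}^2}{4\pi^2 T^2 (T + \Delta T)^2} = \frac{3 c^2 T_{in} T}{16\, \Delta T (T + \Delta T)} $$

$$ \left(\frac{\Delta T}{T}\right)^3 = \frac{c^2 \pi^2 (T + \Delta T)}{8 T_{in}} \tag{5.19} $$

$$ T + \Delta T = \frac{8 T_{in}}{c^2 \pi^2} \left(\frac{\Delta T}{T}\right)^3 \tag{5.20} $$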

From (5.20) and (5.18),

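$$ \mathrm{SNR} = 10 \log_{10}\left(\frac{3 c^4 \pi^2 T^4}{128\, (\Delta T)^4}\right) \tag{5.21} $$

$$ \frac{\Delta T}{T} = \left(\frac{3 c^4 \pi^2}{128}\right)^{1/4} 10^{-\mathrm{SNR}/40} \tag{5.22} $$

$$ T + \Delta T = \frac{8\, c\, T_{in}}{\pi^2} \left(\frac{3\pi^2}{128}\right)^{3/4} 10^{-3\,\mathrm{SNR}/40} \tag{5.23} $$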

With a bit of algebra (taking MATLAB’s help), the parameters T and ∆T for

optimal SNR are given by:

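$$ T = T_{in} \frac{c}{\sqrt{\pi}} \left(\frac{3}{8}\right)^{3/4} 10^{-3\,\mathrm{SNR}/40} \left(1 + \left(\frac{3 c^4 \pi^2}{128}\right)^{1/4} 10^{-\mathrm{SNR}/40}\right)^{-1} \tag{5.24} $$

$$ \Delta T = \frac{3 c^2 T_{in}\, 10^{-3\,\mathrm{SNR}/40}}{16 \left(10^{\mathrm{SNR}/40} + \left(\frac{3\pi^2 c^4}{128}\right)^{1/4}\right)} \tag{5.25} $$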

In terms of frequencies, the parameters fp and fs for optimal SNR are given by:

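$$ f_s = f_{in} \frac{\sqrt{\pi}}{c} \left(\frac{8}{3}\right)^{3/4} 10^{3\,\mathrm{SNR}/40} \tag{5.26} $$

$$ f_p = f_s \left(1 + \left(\frac{3 c^4 \pi^2}{128}\right)^{1/4} 10^{-\mathrm{SNR}/40}\right) \tag{5.27} $$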

(5.27) and (5.26) give the frequencies of probe and sampling clocks for optimal

SNR. Some example numbers for fp and fs in accordance with (5.27) and (5.26)

are provided in Table 5.1. As expected, the values of fp and fs are small for

smaller SNR, smaller input bandwidth (fin) and larger dynamic range (c).
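As a quick check of (5.26) and (5.27), the sketch below evaluates the two closed forms directly; the function name is hypothetical and SNR is the target value in dB. With c = 0.5, fin = 10 Hz and SNR = 20 dB it gives roughly 2.60×10^3 Hz and 2.34×10^3 Hz, in line with the first row of Table 5.1.

```python
import math


def optimal_clock_frequencies(f_in_hz, snr_db, c):
    """Return (fp, fs) for optimal SNR, evaluating (5.27) and (5.26)."""
    f_s = (f_in_hz * (math.sqrt(math.pi) / c)
           * (8.0 / 3.0) ** 0.75 * 10.0 ** (3.0 * snr_db / 40.0))    # (5.26)
    k = (3.0 * c ** 4 * math.pi ** 2 / 128.0) ** 0.25
    f_p = f_s * (1.0 + k * 10.0 ** (-snr_db / 40.0))                 # (5.27)
    return f_p, f_s


if __name__ == "__main__":
    for c in (0.5, 0.8):
        for f_in in (10.0, 1e3, 1e6):
            for snr in (20.0, 40.0, 60.0):
                f_p, f_s = optimal_clock_frequencies(f_in, snr, c)
                print(f"c={c} fin={f_in:g} SNR={snr:g} dB: fp={f_p:.3g} Hz, fs={f_s:.3g} Hz")
```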

5.3 Use of PLL to Generate Sampling Clock

As mentioned in 4.4.1, the methodology of delay measurement via sub-sampling

needs a probe clock (of frequency fp) which captures the analog (time) informa-

tion on it and is sampled by a sampling clock (of frequency fs) whose frequency

is slightly less than that of the probe clock. Having separate crystals for the

two clocks is one solution, but increases the cost of the system due to the extra

crystal. Hence, the technique of employing a PLL to generate the sampling clock

from the probe clock is explored in this section.

From (5.7), maximum SNR is obtainable when the difference between fp and

fs is small (i.e., the beat frequency fb is small). Hence, it is better to employ a

PLL to generate the sampling clock from the probe clock as given by:

$$ f_s = \frac{N - 1}{N} f_p \tag{5.28} $$

where N is an integer. In such a case where a PLL is used, the sampling clock

will not be asynchronous from the probe clock (since the phases of the two are

related).

From (5.28) and (5.16), the clock periods and over-sampling ratio (OSR) are


Table 5.1: Example numbers for parameters discussed in (5.27) and (5.26)

c     fin     SNR (dB)  fp (Hz)      fs (Hz)
0.50  10 Hz   20.00     2.60×10^3    2.34×10^3
0.50  10 Hz   40.00     7.65×10^4    7.40×10^4
0.50  10 Hz   60.00     2.36×10^6    2.34×10^6
0.50  1 kHz   20.00     2.60×10^5    2.34×10^5
0.50  1 kHz   40.00     7.65×10^6    7.40×10^6
0.50  1 kHz   60.00     2.36×10^8    2.34×10^8
0.50  1 MHz   20.00     2.60×10^8    2.34×10^8
0.50  1 MHz   40.00     7.65×10^9    7.40×10^9
0.50  1 MHz   60.00     2.36×10^11   2.34×10^11
0.80  10 Hz   20.00     1.72×10^3    1.46×10^3
0.80  10 Hz   40.00     4.88×10^4    4.62×10^4
0.80  10 Hz   60.00     1.49×10^6    1.46×10^6
0.80  1 kHz   20.00     1.72×10^5    1.46×10^5
0.80  1 kHz   40.00     4.88×10^6    4.62×10^6
0.80  1 kHz   60.00     1.49×10^8    1.46×10^8
0.80  1 MHz   20.00     1.72×10^8    1.46×10^8
0.80  1 MHz   40.00     4.88×10^9    4.62×10^9
0.80  1 MHz   60.00     1.49×10^11   1.46×10^11

given by

$$ T = \frac{N - 1}{N} (T + \Delta T) \tag{5.29} $$

$$ T_b = N T \tag{5.30} $$

$$ \mathrm{OSR} = \frac{T_{in}}{2 T_b} = \frac{T_{in}}{2 N T} \tag{5.31} $$


From (5.11) and (5.31), the limit on SNR due to over-sampling is given by

$$ \mathrm{SNR}_1 \le 20 \log_{10}\left(\frac{\sqrt{6}\, T_{in}}{2\pi N T}\right) \tag{5.32} $$

The dynamic range is given by

dmax = cT (0 < c < 1) (5.33)

As mentioned in 4.4.1, ∆T is the basic quantization step and better resolution is

obtained by averaging. The limit on SNR due to quantization is derived below.

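$$ \text{Noise variance, } \sigma^2 = \frac{(\Delta T)^2}{12} \cdot \frac{1}{\mathrm{OSR}} = \frac{N T^3}{6 (N - 1)^2 T_{in}} $$

$$ \text{Signal power, } P = d_{max}^2 = c^2 T^2 \tag{5.34} $$

$$ \mathrm{SNR}_2 = 10 \log_{10}\left(\frac{P}{\sigma^2}\right) = 10 \log_{10}\left(\frac{6 c^2 (N - 1)^2 T_{in}}{N T}\right) \tag{5.35} $$

$$ = 20 \log_{10}\left(c (N - 1) \sqrt{\frac{6 T_{in}}{N T}}\right) \tag{5.36} $$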

For optimum SNR, the desired N to be chosen is given by

$$ \frac{\sqrt{6}\, T_{in}}{2\pi N T} = c (N - 1) \sqrt{\frac{6 T_{in}}{N T}} \tag{5.37} $$

$$ N (N - 1)^2 = \frac{1}{4 c^2 \pi^2} \frac{T_{in}}{T} \tag{5.38} $$
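The two limits (5.32) and (5.35) are easy to evaluate for a given PLL setting. The sketch below is illustrative (the function name is hypothetical); with c = 0.8, fp = 5 MHz and fin = 10 Hz, the minimum of the two limits comes out at about 69.03 dB for N = 6 and 74.31 dB for N = 16, which agrees with the "Max. Limit" column of Table 5.3.

```python
import math


def pll_snr_limits(n, f_probe_hz, f_in_hz, c=0.8):
    """Return (SNR1, SNR2, min) in dB for a PLL ratio fs = (N-1)/N * fp.

    SNR1 is the oversampling limit (5.32); SNR2 is the quantization
    limit (5.35). Tin/T equals fp/fin.
    """
    ratio = f_probe_hz / f_in_hz
    snr1 = 20.0 * math.log10(math.sqrt(6.0) * ratio / (2.0 * math.pi * n))
    snr2 = 10.0 * math.log10(6.0 * c ** 2 * (n - 1) ** 2 * ratio / n)
    return snr1, snr2, min(snr1, snr2)


if __name__ == "__main__":
    for n in (6, 11, 16, 31, 41, 42, 61, 121):
        snr1, snr2, limit = pll_snr_limits(n, 5e6, 10.0)
        print(f"N={n:3d}: SNR1={snr1:6.2f} dB, SNR2={snr2:6.2f} dB, limit={limit:6.2f} dB")
```

For small N the quantization limit is the smaller of the two, while for large N the oversampling limit takes over, which is why an optimum N exists.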

The exact analytical solution for N (with the help of MATLAB) is given by:

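$$ N = \alpha + \frac{1}{9\alpha} + \frac{2}{3}, \quad \text{where} \tag{5.39} $$

$$ \alpha = \left(\sqrt{\left(\frac{T_{in}}{8 c^2 \pi^2 T} - \frac{1}{27}\right)^2 - \frac{1}{729}} + \frac{T_{in}}{8 c^2 \pi^2 T} - \frac{1}{27}\right)^{1/3}, \tag{5.40} $$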

ignoring the other two complex roots. Since it is not a simple expression, the


lower and upper bounds are calculated as follows:

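$$ (N - 1)^3 < \frac{1}{4 c^2 \pi^2} \frac{T_{in}}{T} < N^3 $$

$$ \left(\frac{T_{in}}{4 c^2 \pi^2 T}\right)^{1/3} < N < \left(\frac{T_{in}}{4 c^2 \pi^2 T}\right)^{1/3} + 1 \tag{5.41} $$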

We will use a value of c = 0.8 for the rest of the discussion below as it matches

with our experimental setup discussed in the next section.

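$$ 0.34 \left(\frac{f_p}{f_{in}}\right)^{1/3} < N_{opt} < 0.34 \left(\frac{f_p}{f_{in}}\right)^{1/3} + 1 \tag{5.42} $$

Choosing Nopt ≈ 0.34 (fp/fin)^{1/3}, the optimal SNR is approximately given by:

$$ \mathrm{SNR}_{opt} \approx 10 \log_{10}\left(6 c^2 \frac{T_{in}}{T} \cdot \left(\frac{T_{in}}{4 c^2 \pi^2 T}\right)^{1/3}\right) \tag{5.43} $$

$$ = \frac{40}{3} \log_{10}\left(\frac{T_{in}}{T}\right) + 1.17 \tag{5.44} $$

$$ = \frac{40}{3} \log_{10}\left(\frac{f_p}{f_{in}}\right) + 1.17 \tag{5.45} $$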

For a desired SNR at a given frequency fin, the design parameters are given by

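$$ f_p = f_{in} \cdot 10^{3 (\mathrm{SNR} - 1.17)/40} \tag{5.46} $$

$$ N = 0.34 \cdot 10^{(\mathrm{SNR} - 1.17)/40} \tag{5.47} $$

$$ T = \frac{1}{f_p} \tag{5.48} $$

$$ \Delta T = \frac{T}{N - 1} \tag{5.49} $$

Example numbers for the design parameters for a desired SNR at a given

frequency fin are given in Table 5.2.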


Table 5.2: Example numbers for design parameters fp and N for desired SNR at given frequency fin

                    Approximate           Exact
fin      SNR (dB)   fp (Hz)      N        N       SNR (dB)
10 Hz    20.00      2.58×10^2    1.01     1.76    15.14
10 Hz    40.00      8.17×10^3    3.18     3.89    38.27
10 Hz    60.00      2.58×10^5    10.05    10.75   59.43
10 Hz    80.00      8.17×10^6    31.79    32.53   79.82
100 Hz   20.00      2.58×10^3    1.01     1.76    15.14
100 Hz   40.00      8.17×10^4    3.18     3.89    38.27
100 Hz   60.00      2.58×10^6    10.05    10.75   59.43
100 Hz   80.00      8.17×10^7    31.79    32.53   79.82
1 kHz    20.00      2.58×10^4    1.01     1.76    15.14
1 kHz    40.00      8.17×10^5    3.18     3.89    38.27
1 kHz    60.00      2.58×10^7    10.05    10.75   59.43
1 kHz    80.00      8.17×10^8    31.79    32.53   79.82
1 MHz    20.00      2.58×10^7    1.01     1.76    15.14
1 MHz    40.00      8.17×10^8    3.18     3.89    38.27
1 MHz    60.00      2.58×10^10   10.05    10.75   59.43
1 MHz    80.00      8.17×10^11   31.79    32.53   79.82
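The entries of Table 5.2 follow from (5.46) and (5.47) for the approximate columns, and from the cubic (5.38) solved via (5.39) and (5.40), with the result substituted into (5.35), for the exact columns. The sketch below reproduces that calculation; the function name is hypothetical, and c is fixed at 0.8 because the constants 0.34 and 1.17 were derived for that value.

```python
import math

C = 0.8  # dynamic-range fraction assumed in (5.42)-(5.47)


def design_parameters(f_in_hz, snr_db):
    """Return (fp, N approximate, N exact, SNR at exact N) for a target SNR."""
    f_p = f_in_hz * 10.0 ** (3.0 * (snr_db - 1.17) / 40.0)    # (5.46)
    n_approx = 0.34 * 10.0 ** ((snr_db - 1.17) / 40.0)        # (5.47)

    # Exact root of N (N - 1)^2 = Tin / (4 c^2 pi^2 T), using (5.39)-(5.40).
    u = f_p / (8.0 * C ** 2 * math.pi ** 2 * f_in_hz) - 1.0 / 27.0
    alpha = (math.sqrt(u * u - 1.0 / 729.0) + u) ** (1.0 / 3.0)
    n_exact = alpha + 1.0 / (9.0 * alpha) + 2.0 / 3.0

    # Quantization-limited SNR (5.35) evaluated at the exact N.
    snr_exact = 10.0 * math.log10(
        6.0 * C ** 2 * (n_exact - 1.0) ** 2 * f_p / (n_exact * f_in_hz))
    return f_p, n_approx, n_exact, snr_exact


if __name__ == "__main__":
    for snr in (20.0, 40.0, 60.0, 80.0):
        f_p, n_a, n_e, s_e = design_parameters(10.0, snr)
        print(f"SNR={snr:g}: fp={f_p:.3g} Hz, N~{n_a:.2f}, N={n_e:.2f}, SNR={s_e:.2f} dB")
```

Since N must be an integer in a real PLL, the exact value would be rounded in practice; the sketch leaves it unrounded to match the table.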

5.4 Experimental Validation

The PLL on the Virtex-5 development board provides an output clock of

frequency fo from an input clock of frequency fi related by

$$ f_o = \frac{p}{q} f_i \tag{5.50} $$

with the limits on the parameters as [87]

$$ p \le 64, \quad q \le 99, \quad f_o \le 600\ \text{MHz}, \quad f_i \ge 20\ \text{MHz} \tag{5.51} $$

Figure 5.3: A typical PWM signal - the modulating sine wave is also shown in dotted lines.

Demonstration of time measurement via sub-sampling with a sampling clock

generated from the probe clock itself using a PLL is described in this section. The

analog information is encoded in the duty cycle of the clock, i.e., the ON-time

of the probe clock in each period represents the analog information. Such a way

of information representation is popularly referred to as pulse-width modulation

(PWM). The pulse-width modulated signal is easy to generate using a comparator

(with acceptable input offset and linearity) with one of the inputs being the test

analog voltage and the other input being a periodic ramp or saw-tooth wave1.

For the purpose of this work, a PWM signal generated by a function generator

(part number 33220A) is used. Refer Fig. 5.3 for a typical pulse-width modulated wave, with the test

analog voltage being shown in dotted lines. The duty cycle of the probe clock

is varied between 10% and 90% as determined by the sine wave. A sine wave

is used here so that SNR (signal-to-noise ratio) can be a reasonable metric to

evaluate performance. The duty cycle is measured as the ratio of the ON-time

to the time-period of the sub-sampled signal.

1A saw-tooth wave can be generated by integrating a square wave.


5.4.1 Implementation of duty-cycle measurement unit

The block diagram of the system implemented in the Virtex-5 development board

is shown in Fig. 5.4. The PWM input signal is given from the function generator

(part number 33220A). The pin marked X is tied to PWM or another crystal

for different settings. The state machine block shown is implemented as shown in

Fig. 5.5. The vector (·, ·, ·) shown on the edges of the figure is the tuple (S, eN, eD),

where S is the sub-sampled signal output of the flop; eN is the enable for the ON-time

counter and eD is the enable for the period counter. S is the input to the state

machine while eN and eD are the outputs. The debounce logic is not shown in

Fig. 5.5 to keep it simple, but similar logic explained in 4.4.1 can be used.
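The principle of the duty-cycle measurement can be illustrated with an idealized, jitter-free model. The sketch below is not the FPGA state machine of Fig. 5.5: it simply samples a normalized PWM clock with a sampling clock of frequency (N−1)/N times the probe frequency, and takes the ratio of an ON-time count to a period count over whole beat periods. The function name is hypothetical, and exact rational arithmetic is used so that no floating-point rounding affects the sampled phases.

```python
from fractions import Fraction


def subsampled_duty_cycle(duty, n, beat_periods=1):
    """Estimate the duty cycle of a PWM clock by ideal sub-sampling.

    The probe clock has period T = 1 and is high for the first `duty`
    fraction of each period. The sampling clock has period N/(N-1), so one
    beat period contains N-1 samples that sweep the probe phase in steps
    of 1/(N-1).
    """
    if n < 2:
        raise ValueError("N must be at least 2")
    on_count = 0       # increments while the sub-sampled signal S is 1
    period_count = 0   # increments on every sample of the beat period
    for k in range(beat_periods * (n - 1)):
        phase = Fraction(k * n, n - 1) % 1   # probe phase at the k-th sample
        on_count += 1 if phase < duty else 0
        period_count += 1
    return Fraction(on_count, period_count)


if __name__ == "__main__":
    print(subsampled_duty_cycle(Fraction(1, 4), 17))    # 16 phase steps per beat
    print(subsampled_duty_cycle(Fraction(3, 10), 17))
```

In this ideal model the estimate is quantized in steps of 1/(N−1), which corresponds to the basic quantization step ∆T = T/(N−1).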

The system implemented on the Virtex-5 development board consists of

sub-sampling flops and duty-cycle measurement along with the PLL (with a frequency

scaling factor of (N−1)/N). A PWM signal described above is input to the system. For

a comparative study on the choice of input clock feeding the PLL, in one case a

Figure 5.4: Block diagram of system implemented in Virtex 5 development board. Blocks shown: DFF (D, Q) with the PWM input, PLL with divider (÷) and pin X, State Machine (input S, outputs eN and eD), Counter1 and Counter2 (ON time and Period outputs), all within the FPGA.

Figure 5.5: State Machine of System Implemented in FPGA (Fig. 5.4). States: q0 (start), q1, q2; edge labels: (0,0,0), (1,1,1), (1,1,1), (0,0,1), (0,0,1), (1,1,1). Tuple: (S, eN, eD). Input: S. Outputs: eN, eD.


different clock source is input to the PLL, while in the other case, the pulse-width

modulated probe clock itself is directly fed to the PLL.

Figure 5.6: The sources of probe and sampling clocks for different cases: (a) Asynchronous case, (b) Synchronous case, (c) Dithered case. S: Sub-sampled signal

In summary, there are three methods of generating the sampling clock, as

shown in Fig. 5.6:

• A separate clock source/crystal (asynchronous case)

• Sampling clock derived from probe clock with frequency scaling by a factor of (N−1)/N (synchronous case)

• Sampling clock derived from probe clock with frequency scaling by a factor of (N−1)/N, but with dithered division ratio (dithered case)


5.4.2 Large divide ratios and dithered divide ratio

Due to limitation of frequency scaling in the PLL, a frequency scaling factor of

120/121 cannot be readily implemented with the available PLL. In such a case,

the PLL is used to multiply the input clock frequency by a factor of 30, and the

PLL output is subsequently divided using a frequency divider by factors of {30,

30, 30, 31} in succession (using the delta-sigma modulation of representing 30.25

by the integer sequence {30, 30, 30, 31} so that the average is 30.25), in effect

making the frequency scaling factor 30/30.25 = 120/121.
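One simple way to produce such an integer divisor sequence is a first-order accumulator, sketched below. This is an assumption for illustration: the text states that delta-sigma modulation is used but does not give the modulator structure, and the function name is hypothetical.

```python
from fractions import Fraction


def dithered_divisors(target, count):
    """Integer divide values whose running average approaches `target`.

    A first-order accumulator carries the fractional part of the target
    divide ratio; each time it overflows, the divider uses base + 1 for one
    cycle instead of base.
    """
    base = target.numerator // target.denominator
    frac = target - base
    acc = Fraction(0)
    sequence = []
    for _ in range(count):
        acc += frac
        if acc >= 1:
            sequence.append(base + 1)
            acc -= 1
        else:
            sequence.append(base)
    return sequence


if __name__ == "__main__":
    seq = dithered_divisors(Fraction(121, 4), 4)      # target ratio 30.25
    print(seq, Fraction(30) / Fraction(sum(seq), len(seq)))
    print(dithered_divisors(Fraction(61, 2), 4))      # target ratio 30.5
```

With the PLL multiplying by 30, an average division by 30.25 gives an overall scaling of 120/121, and an average division by 30.5 gives 60/61.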

5.5 Measurement Results

The measured duty-cycle values for the PLL setting of N=16 and an

input sine wave of frequency 10 Hz are shown in Fig. 5.7, and their FFT is shown

in Fig. 5.8. Fig. 5.7 shows that there is an offset, as the ideal range of duty cycle

should have been between 0.1 and 0.9.

The measured results are summarized in Table 5.3 and Table 5.4. Fig. 5.11

shows the theoretical limits on SNR due to oversampling and quantization; and

also the points obtained by measurement. At each value of N , for a given setting

of fin and fp, the mean and standard deviation of SNR from 16 measurements

are reported in Table 5.3 and Table 5.4.

The measured results confirm the existence of an optimal N value for a spec-

ified fin and chosen fp, as predicted by theory. But, influence of jitter on the

quantization levels was not modeled for the analysis which shows up as the gap

between maximum attainable SNR and actually obtained numbers from measure-

ments. Another reason for the gap from oversampling limit is because this system

is employing simple averaging as against noise-shaping (as in sigma-delta ADCs).

However, the trend in the SNR numbers matches closely with that predicted by

theory.

The fourth column (under heading maximum limit) of Tables 5.3 and 5.4

contains the minimum of the oversampling and quantization limits and shows a

difference of 3 dB between settings of first and second row at small N values in

Table 5.3, while the said difference becomes 6 dB at large N in Table 5.4. It


Table 5.3: Summary of Measured Results comparing asynchronous and synchronous cases of sampling clock generation

Table shows mean value of SNR in dB with standard deviation in parenthesis

fp = 5 MHz, fin = 10 Hz

N    Async SNR (dB)  Sync SNR (dB)  Max. Limit (dB)
6    12.36 (1.62)    18.59 (5.61)   69.03
11   37.89 (1.84)    35.59 (0.00)   72.42
16   55.96 (1.92)    37.41 (3.07)   74.31
31   48.99 (0.63)    47.76 (2.76)   75.97
41   46.87 (0.50)    -              73.54
42   -               46.84 (0.11)   73.33

Table 5.4: Summary of Measured Results comparing asynchronous and dithered (synchronous) cases of sampling clock generation

Table shows mean value of SNR in dB with standard deviation in parenthesis

fp = 5 MHz, fin = 10 Hz

N    Async SNR (dB)  Dithered SNR (dB)  Max. Limit (dB)
61   37.79 (0.31)    44.08 (0.44)       70.09
121  24.20 (0.12)    35.53 (1.00)       64.14

is easy to see that the difference between the settings for measurement and the

parameters obtained from design equations is least at largest values of SNR, as

shown in Fig. 5.10. As mentioned previously, this gap exists as the effects of jitter

on linearity and quantization are not modeled.

A least squares linear curve fit of maximum and measured SNR against

the logarithm of PLL division parameter N is shown in Fig. 5.9. The entries of

Async case of both Tables 5.3 and 5.4 are chosen for the fit. The SNR values


corresponding to small and large values of N are off, leading to slopes different

from the expected values; whereas the local slope of mean values of SNR between

N values of 31 and 41 is -5.26, which matches well with the theoretical value.

As explained earlier, the PLL cannot readily implement large divide ratios of

more than 60 for the synchronous case. As a result to implement a divide ratio

of 61, or equivalently a frequency scaling factor of 60/61, the divider count of

the divider in Fig. 5.6(c) needs to alternately switch between 30 and 31 (so that

30/30.5 = 60/61).

From Table 5.3, there is not much to choose from between the asynchronous

and synchronous cases for relatively low values of N , except for the outlier at

N = 16. But in the case of large values of N , Table 5.4 shows that the dithered

case performs better than the asynchronous case. However, since the SNR values

attained in the dithered case do not fall outside the range of SNR values

obtained by the synchronous case at lower values of N , the synchronous case itself can

be made use of at appropriate choices of N .

5.6 Conclusions

A method of generating sampling clock from probe clock for the purpose of time

measurement on a Virtex-5 development (FPGA) board is discussed in this chap-

ter. Analog information in the form of a sine wave is used to modulate the ON-

time of probe clock, yielding a PWM signal. Measurement of duty cycle of the

PWM signal by the sub-sampling approach of delay (time) measurement gives

a quantized version of the modulating sine wave. Measurement results of the

system with different ways of generating the sampling clock are reported, with

the maximum attained SNR being 55.16 dB. The variation of SNR with N (PLL

divide ratio) is investigated theoretically and the measured results confirm the

existence of an optimal N which yields maximum SNR. There is a gap between

the settings of actual measurement and the parameters computed theoretically,

however the gap is lesser for larger values of SNR, obtained at moderate values

of N .

In conclusion, the synchronous case of clock generation described here performs as well as the asynchronous case for low values of N, and achieves higher SNR values than those attained by the dithered case. It is therefore the system of choice, also owing to its simple and low cost implementation: it avoids the additional crystal needed by the asynchronous case and the implementation of alternating divide ratios needed for the dithered case.


Figure 5.7: Samples of duty-cycle measurement - Quantized values of the input sine wave. (Axes: No. of Samples, ×10^4, versus Duty cycle.)

Figure 5.8: Spectrum of measured duty-cycle samples, showing a clear peak at 10 Hz, the input sine frequency. (Axes: Frequency (Hz) versus Magnitude (dB).)


Figure 5.9: Linear curve fitting of SNR versus log(N). (Axes: N versus SNR (dB). Legend: Measured Data, Linear fit, Maximum limit. Annotated lines: −6.02 log2(N) + 105.80; −11.02 log2(N) + 102.60; 3.77 log2(N) + 59.31; 30.66 log2(N) − 67.24.)

Figure 5.10: Gap between theoretically predicted parameters and actual measurement settings. Note that the difference is least at large values of SNR. (Two panels: fp (Hz) versus SNR (dB), and N versus SNR (dB). Legend: Theory, Measured.)


Figure 5.11: Plot showing theoretical limits on SNR and results (mean SNR) obtained from measurement. (Three panels of SNR (dB) versus N: fin = 10 Hz, fp = 5 MHz; fin = 10 Hz, fp = 2.5 MHz; fin = 20 Hz, fp = 5 MHz. Legend: OSR Limit, Quantization Limit, Async, Sync, Dithered.)

Chapter 6

Multiphase technique to

speed-up delay measurement via

sub-sampling

Consider the problem of delay measurement by the sub-sampling approach introduced in Section 3.6. A delay d is present between a pair of probe clock signals of period T. This clock pair is sampled by another clock of period T + ∆T. Since this sampling frequency is lower than the Nyquist rate, the original signal cannot be fully reconstructed, but there is “amplification” of the delay and the time period in the resulting sampled signal. Fig. 4.4 shows the timing diagram of the various signals. Hence, the delay d between the clock pair Da and Db is also amplified to become d(T + ∆T)/∆T between Qa and Qb.
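As a quick numerical illustration of this amplification, the sketch below scales a time interval by (T + ∆T)/∆T. The function name `amplified` is hypothetical; the numbers are those of the toy example of Section 6.1 (T = 16, ∆T = 1, d = 2, in arbitrary units).

```python
def amplified(value, T, dT):
    """Scale a time interval by the sub-sampling factor (T + dT) / dT."""
    return value * (T + dT) / dT

T, dT, d = 16, 1, 2  # toy values from Section 6.1, arbitrary units

print(amplified(T, T, dT))  # beat period: 272.0
print(amplified(d, T, dT))  # amplified delay: 34.0
```

Both the probe period and the delay are stretched by the same factor, which is why the ratio of the two counts still gives d/T.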

Figure 6.1: Block diagram of DMU (Delay Measurement Unit) based on sub-sampling

It is well known that when the probe and sampling clocks are not generated from the same crystal, the result is random sampling, which has been demonstrated in evaluating ADC performance [33] and in calibrating the delay between two clock phases [89]. To understand this process of random sampling, let the probe clock be modeled as a circle, since it is periodic with period T (the circumference corresponding to T). Then, the delay between the clock pair can be modeled as a sector of this circle. Since the time period of the sampling clock is greater than that of the input clocks, the sampling clock edges precess around the circle, as illustrated in Fig. 6.2. The problem is to estimate the size d of the sector, and an estimate is given by:

\[ d = \frac{\text{No. of points in the sector corresponding to delay}}{\text{Total no. of points corresponding to time-period}} \times T \tag{6.1} \]

Appendix A shows that this yields an unbiased estimate of the delay, and it is easy to see that the estimate has the least variance if the total number of points spans an integer multiple of the circumference [62]. To calculate (6.1) in practice, two counters are needed: one each for the delay and time-period counts.
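The point-counting estimate of (6.1) can be sketched as below. This is an idealized, jitter-free model: the function name `estimate_delay` is hypothetical, times are integers in picoseconds, and the clock periods (T = 100 ns, T + ∆T = 104.72 ns) are borrowed from the settings listed under Table 6.1. The sector is assumed to start at phase zero.

```python
def estimate_delay(T, Ts, d, n_samples):
    """Estimate the delay d via (6.1); all times are integers in one unit."""
    delay_count = 0
    for k in range(n_samples):
        phase = (k * Ts) % T   # position of the k-th sampling edge on the circle
        if phase < d:          # edge lands inside the delay sector [0, d)
            delay_count += 1
    # n_samples plays the role of the time-period count
    return delay_count * T / n_samples

# 1250 samples make the sampling edge go around the circle exactly 59 times.
print(estimate_delay(100_000, 104_720, 20_000, 1250))  # 20000.0 ps
print(estimate_delay(100_000, 104_720, 500, 1250))     # 560.0 ps
```

Because the sample count covers an integer number of revolutions, the points are evenly spread over the circle; the second call shows that a delay well below ∆T is still resolved, though only to the granularity set by the number of samples.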

If the delay d is a small fraction of the time period T, then the majority of the sampling points do not contribute to the numerator of (6.1), which limits the usefulness of such data points. Also, in such cases the obtained SNR will be lower than the maximum possible, as is evident from Fig. 5.2. This problem is analogous to the situation of rare events in a Monte-Carlo simulation [90] and to the need for automatic gain control (AGC) in ADCs to prevent their under-utilization. A straightforward solution is to use high frequency clocks (low T), but this makes the circuit consume more power and prevents it from working at low supply voltages. Hence, it is interesting to apply techniques similar to rare-event simulation and AGC for quicker delay estimation. Such a technique will also improve the SNR achieved.


Figure 6.2: Illustration of the sampling clock precessing around the input clock. The circumference represents the time period of input clock while the sector represents the delay to be measured. The asterisk shaped points are the edges of sampling clock. (Labels: Point in first round, Point in second round, Sector of interest.)

6.1 Proposed Solution

Consider a toy example where the periods of the input clock and sampling clock are 16 and 17 units respectively, with the test delay being 2 units. The duration of a beat period is 16 × 17 = 272 units and the delay is amplified to 2 × 17 = 34 units. As a result, the period counter counts up to 16 (= 272/17) and the delay counter up to 2 (= 34/17), which means that 14 samples out of 16 contribute little to the delay measurement. If a two-phase clock were available, then on detecting that the delay counter is no longer changing, the other phase could be fed to the counter. The delay counter would then count up to 4, but this count has to be divided by 2 since two phases were used. Similarly, use of a 4-phase clock will yield a delay count of 8, which needs to be divided by 4 to get the actual delay. Fig. 6.3 illustrates this example.
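The toy example can be reproduced with a small behavioural sketch. It assumes ideal, equally spaced phases and no jitter, and it advances the phase by ⌈N·d/T⌉ steps once the sampling edge has left the sector, which is only one plausible reading of the phase-advance rule. The function `multiphase_counts` and its arguments are hypothetical names, not part of the original simulation.

```python
import math

def multiphase_counts(T=16, dT=1, d=2, n_phases=1):
    """Return (delay_count, phases_used, period_count) for one beat period.

    The sampling edge advances by dT per sample around a circle of
    circumference T. The delay sector starts at [0, d). Once the edge has
    left the sector, the probe clock is advanced by the smallest number of
    phases (each T / n_phases wide) that puts the sector at or ahead of the
    edge, for as long as unused phases remain.
    """
    spacing = T / n_phases
    step = math.ceil(n_phases * d / T)  # phases skipped per advance
    samples = T // dT                   # period count for one beat period
    start, k, used, delay_count = 0.0, 0, 1, 0
    for i in range(samples):
        x = i * dT                      # position of the sampling edge
        if start <= x < start + d:
            delay_count += 1
        elif x >= start + d and k + step < n_phases:
            k += step                   # delay count saturated: advance phase
            start = k * spacing
            used += 1
            if start <= x < start + d:
                delay_count += 1
    return delay_count, used, samples

for n in (1, 2, 4, 8):
    count, used, period = multiphase_counts(n_phases=n)
    # Delay estimate: (delay count / phases used) / period count * T
    print(n, count, count / used / period * 16)
```

With these settings the delay counts come out as 2, 4 and 8 for one, two and four phases, as in the text, and every estimate equals the true delay of 2 units.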

Figure 6.3: Counts corresponding to period and delay. Two-phase and four-phase clocks measure delay twice and four times in a beat period respectively, thus providing more accuracy in the same measurement time. (Four stacked panels against Time (No. of sampling periods): Period count, and Delay count for single phase, 2 phases and 4 phases.)

In general, suppose an N-phase clock is available and we are interested in measuring the delay of a DUT (Device Under Test). At the end of the first beat period, the counts for the delay and time-period will be available. In each of the subsequent beat periods, once the delay count saturates, the next appropriate phase is calculated and fed to the DUT. As a result, the sector of interest is scanned multiple times in a beat period, leading to reduced measurement time for the same accuracy or improved accuracy for the same measurement time. The flowchart describing this process is shown in Fig. 6.5.

An issue, though, is that the phase spacings of the N phases may not be equal. Let $\phi_1, \ldots, \phi_N$ be the phases and let $d^i_1, \ldots, d^i_n$ and $p^i_1, \ldots, p^i_n$ be the counts of delay and period respectively for each phase in the ith beat period (here $n \le N$, since not all of the available phases may be used). We need an estimate for the delay d based on $d^i_1, \ldots, d^i_n$ and $p^i_1, \ldots, p^i_n$. One possibility is to use

\[ d = \left( \frac{d^i_1}{p^i_1} + \cdots + \frac{d^i_n}{p^i_n} \right) \times \frac{T}{n} \tag{6.2} \]

Figure 6.4: Plot of speed-up obtained corresponding to fraction of delay to time-period. Here N is the number of phases of clock available. (Axes: Delay as a fraction of time-period (d/T) versus Speed-up. Curves: N = 2, 4, 8, 16, 32 and the plot of y = 1/x.)

This estimate is not accurate since the counts $p^i_1, \ldots, p^i_n$ can be different due to unequal phase-spacings. A better estimate is given by:

\[ d = \frac{d^i_1 + \cdots + d^i_n}{p^i_1 + \cdots + p^i_n} \times \frac{T}{n} = \frac{d^i}{p^i} \times \frac{T}{n} \tag{6.3} \]

where $d^i = d^i_1 + \cdots + d^i_n$ and $p^i = p^i_1 + \cdots + p^i_n$. The estimate of (6.3) is better since the sum of all phase-spacings is very close to an integer multiple of the time period modulo the jitter. The speed-up of this scheme over the single phase case is given by:

\[ \text{Speed-up},\ n = \min\left( N, \left\lfloor \frac{T}{d} \right\rfloor \right) \tag{6.4} \]

The plot of speed-up n versus the delay (d as a fraction of time-period T) is shown in Fig. 6.4. As can be seen from the plot, the speed-up goes as the inverse of d/T and saturates at N, the number of phases available. Hence, the improvement achieved by this scheme is larger for small delays and smaller for large delays. Since the DMU-based scheme inherently provides a resolution-bandwidth trade-off, the speed-up obtained can be used either to reduce measurement time or to increase accuracy over the single phase scheme.
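The behaviour described above follows directly from (6.4), as the short sketch below shows. The function name `speedup` and the sample values of T and d are chosen purely for illustration.

```python
import math

def speedup(n_phases, T, d):
    """Speed-up of (6.4): min(N, floor(T / d))."""
    return min(n_phases, math.floor(T / d))

T = 100  # time period, arbitrary units
for n_phases in (2, 4, 8, 16, 32):
    # Delays of 5%, 10%, 25% and 50% of the time period
    print(n_phases, [speedup(n_phases, T, d) for d in (5, 10, 25, 50)])
```

For small d the result is capped by the number of phases, while for large d it is capped by ⌊T/d⌋, which is the saturation visible in Fig. 6.4.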

There will be an error in the measurement if a certain phase does not span the delay completely. For instance, with an 8-phase clock, suppose the delay to be measured corresponds to one-fourth of the time-period, and the rising edge of the 7th phase is chosen. Since only 1/8th of the time-period is left, only half the delay will be counted, leading to an error in the measurement. Such cases can be avoided by using a conservative algorithm, wherein the delay count is updated if and only if it saturates, which cannot happen unless the chosen phase fully covers the delay. If a certain phase fails to span the delay completely, the count corresponding to that phase is discarded. Being conservative, this approach also takes care of phase mismatch, but might lose out slightly on speed-up.

6.2 Simulation

To verify the proposed idea, the system described above, with the algorithm explained in the flowchart of Fig. 6.5, is implemented in the MATLAB Simulink environment; its block diagram is shown in Fig. 6.6. An option is provided to select a single phase, 4-phase or 8-phase clock input to the DUT, which gives out a pair of clock signals between which a delay is set up. This clock pair is sampled by a sampling clock of slightly lower frequency than the input clock. A pair of counters, one each for delay and period, is set up. The delay count is the accumulated arithmetic difference between the pair of sub-sampled outputs, while the period count is the accumulated number of rising edges of the sampling clock in a period of the sub-sampled output. The ratio of the delay count to the period count is an unbiased estimate of the delay d as a fraction of the time-period T [62]. The Period count and Delay count blocks implement the respective counters, while the Determine phase shift subsystem determines the phase-shift to be applied each time and generates control signals to reset the counters for every new measurement.


6.3 Simulation Results

6.3.1 Case of Fixed Input Delay

Simulation results for the case of fixed delay are tabulated in Table 6.1 for the parameters mentioned below the table. The point to be noted is that in entries 4, 5 and 6 of Table 6.1, the test delay of 0.5 ns is much smaller than the basic quantization step of 4.72 ns. The mean and standard deviation of 100 measurements are taken with the same setting. A larger standard deviation of the measured delay in the single phase case means that although the average of the measured delays across measurements is quite close to the input delay, certain individual measurements may be off. Hence, measurements with a single phase clock will need more averaging to guarantee better accuracy, leading to increased measurement time. The standard deviation is smaller when a multi-phase input clock is employed, leading to reduced measurement time for a given accuracy.

In general, the variance of the measured delay is inversely proportional to the measurement time. Hence, if $\sigma_1^2$ and $\sigma_N^2$ are the variances of a certain measured delay, and $T_1$ and $T_N$ are the respective measurement times, then

\[ \frac{\sigma_1^2}{\sigma_N^2} = \frac{T_N}{T_1} \tag{6.5} \]

But, in this particular scheme, use of an N-phase input clock yields a measurement of smaller variance. Hence, to achieve a given variance of the measured delay, the scheme with an N-phase input clock takes less time. This speedup is roughly given by

\[ \text{Speedup},\ n = \frac{\sigma_1^2}{\sigma_N^2} \tag{6.6} \]

The entries of Table 6.1 confirm that the speedup is greater for small delays and is in close agreement with the theoretically expected values.
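The speedup column of Table 6.1 can be recomputed from the tabulated standard deviations using (6.6), as in the sketch below. The function name `variance_speedup` is hypothetical, and the input values are the two-decimal figures from the table.

```python
def variance_speedup(sigma_single, sigma_multi):
    """Speed-up of (6.6): ratio of single-phase to N-phase variance."""
    return (sigma_single / sigma_multi) ** 2

rows = [  # (input delay in ns, no. of phases, std. deviation in ns), from Table 6.1
    (20, 1, 1.71), (20, 4, 0.87), (20, 8, 0.85),
    (0.5, 1, 1.48), (0.5, 4, 0.64), (0.5, 8, 0.50),
]
# Single-phase standard deviation for each input delay
baseline = {delay: sigma for delay, phases, sigma in rows if phases == 1}

for delay, phases, sigma in rows:
    print(delay, phases, round(variance_speedup(baseline[delay], sigma), 2))
```

Since the tabulated standard deviations are rounded, the recomputed ratios may differ somewhat from the speedups listed in the table.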

6.3.2 Case of Slowly Varying Input Delay

Table 6.1: Summary of Measured Results for fixed input delay

Sl. No.   Input Delay (ns)   No. of phases   Measured Delay (ns)           Speedup over
                                             Mean      Std. deviation      single phase
1         20                 1               20.00     1.71                1
2         20                 4               19.94     0.87                3.86
3         20                 8               20.05     0.85                4.05
4         0.5                1               0.55      1.48                1
5         0.5                4               0.55      0.64                5.35
6         0.5                8               0.50      0.50                8.36

Setting:
Input clock frequency, fc = 10 MHz
Sampling clock frequency, fs = 9.55 MHz
Duration of beat period, Tb = 2.218 µs
Basic quantization step, ∆T = 4.72 ns
Jitter in both input and sampling clocks = 100 ps

Table 6.2: Summary of Measured Results for slowly varying delay

Sl. No.   Input Delay Amplitude (ns)   No. of phases   Measured Delay SNR (dB)   Improvement in SNR over single phase (dB)
1         5                            1               06.50                     0
2         5                            4               12.99                     06.49
3         5                            8               17.59                     11.09
4         15                           1               15.56                     0
5         15                           4               18.85                     03.29
6         15                           8               18.63                     03.07

Setting:
Input sine frequency, fin = 484.15 Hz
Oversampling ratio, OSR = 465.45

It is shown in [56] that such delay measurement schemes can also handle slowly varying delays without the explicit use of sample-and-hold circuitry. However, the varying input, say a sine wave, should be suitably oversampled so that the test delay does not change by more than an LSB during the measurement time. If $d = A \sin(2\pi f_{in} t)$ is the test delay input and $T_b$ is the measurement time, then the maximum rate of change of d is $2\pi A f_{in}$, so that

\[ 2\pi A f_{in} T_b \le a \quad \Longrightarrow \quad f_{in} \le \frac{a}{2\pi A T_b} \tag{6.7} \]

where a is the LSB. With Tb = 2.218 µs for 8 bits, fin can be at most 560 Hz. Choosing fin = 484.15 Hz¹, we have the results for the different numbers of input clock phases summarized in Table 6.2, with the other settings the same as those of Table 6.1. The SNR numbers reported here are before low pass filtering, and hence the SNR will improve after filtering and decimation. The results clearly show that the SNR for small test delays improves in the case of multiple phases, although the increase may not be substantial for large test delays.
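The 560 Hz figure can be reproduced from (6.7) under one assumption that the text does not state explicitly: that the peak-to-peak swing 2A spans all 2^8 quantization levels, so that a = 2A/2^8. The function name `max_input_frequency` is hypothetical.

```python
import math

def max_input_frequency(t_meas, bits):
    """Upper bound on f_in from (6.7), assuming LSB a = 2A / 2**bits."""
    lsb_over_amplitude = 2.0 / 2 ** bits   # a / A under the stated assumption
    return lsb_over_amplitude / (2.0 * math.pi * t_meas)

# Tb = 2.218 us and 8 bits, as in the text; the result is roughly 560 Hz.
print(max_input_frequency(2.218e-6, 8))
```

Under this assumption the bound depends only on the measurement time and the number of bits, not on the amplitude itself.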

6.4 Conclusions

The method of time measurement using a sub-sampling based DMU is briefly revisited and its limitation in measuring small delays is described. A solution to improve the speed (and/or accuracy) by making use of a multiphase probe clock is described. Simulation results from the MATLAB Simulink environment demonstrate a speed-up of up to a factor of eight achieved by an eight-phase input clock for fixed test delays, and an improvement in SNR of up to 11 dB for slowly varying test delays.

¹ Chosen to ease computation of the FFT by eliminating spectral leakage [91].


Figure 6.5: Flowchart for the proposed scheme

Flowchart boxes, in the order they appear:
- Get p0, d0 and N
- Start phase, k = 0
- Start counters: di = 0, pi = 0, n = 0
- Update counters di and pi
- Decision: Did di saturate? (Yes/No)
- Advance phase, $k = k + \lceil N d_i / p_0 \rceil$; n = n + 1
- Decision: Is k > N? (Yes/No)
- Save di, pi and n
- Next sample, i = i + 1

Symbols used in the flowchart:
N    Number of clock phases available
p0   Period count in first beat period
d0   Delay count in first beat period
pi   Period count in ith beat period
di   Delay count in ith beat period
n    Number of clock phases used
k    Phase counter, ranges from 0 to N − 1


Figure 6.6: Block diagram implemented in MATLAB Simulink. (Blocks shown include: 4-Phase and 8-Phase Multiphase Clock sources, Multiport Switches, Variable Transport Delays, a Sine Wave source, D Flip-Flops, the Get period, Period count, Delay count and Determine next phase subsystems, a Triggered Subsystem, Memory blocks, a Counter and a Scope.)

Chapter 7

Example Application

The technique of observing internal analog voltages described earlier in Chapter 4

is used in the power scalable receiver implementation in UMC 130 nm process,

the block diagram of which is shown in Fig. 7.1.

7.1 Power Scalable Receiver Implementation

Fig. 7.1 shows the block diagram of a low-IF receiver designed for the ZigBee

standard, the IF (intermediate frequency) being 3 MHz. The key innovation of

the design is in sensing the strength of the signal and interference, and accordingly

switching the receiver to low power modes.

The power-scalable receiver has an RSSI (Received Signal Strength Indica-

tor) block, which senses the strength of received signal and an RISI (Received

Interference Strength Indicator) block, which senses the strength of interference.

In case the received signal is strong and interference is weak, the various blocks in the receiver chain, namely the LNA, VGA and ADC, switch to low power modes as indicated by the RSSI and RISI blocks. The details of the approach and implementation are discussed in [36, 92].

The receiver chain is designed to work at a supply of 0.8 V and as a result the common mode voltage is expected to be around 400 mV. In this test chip, the common mode voltage at the output of the mixer was not controlled, as the common mode feedback (CMFB) circuitry was implemented as part of the VGA. As a result, knowledge of the common mode voltage at the output of the mixer helps the designer while testing/de-bugging. As shown in Fig. 7.1, the I (in-phase) path of the mixer output is sampled using the sampling head (SpH), reproduced in Fig. 7.2 for convenience. If the common mode is found to be far from what is desired, it has to be controlled from outside the chip.

Similarly, at the input of the ADC, the VGA sets the common mode. But

if this common mode is far off from what is desired, the dynamic range of the

ADC gets affected. Again, as shown in Fig. 7.1, the common mode voltage of I

(in-phase) and Q (quadrature) paths is found (using two large resistors so that

they do not load the paths) and is sampled using the sampling head of Fig. 7.2.

Hence, again knowledge of the common mode voltage at the input of the ADC

comes in handy, and if found to be far off can be adjusted externally.

In summary, the placement of sampling heads (SpH) at internal test nodes

helps in testing/de-bugging during design phase and can be used for production

testing or in-use monitoring post manufacturing.

7.2 BIST Implementation

The method described above needs a voltage-controlled delay that takes up minimal area and is reasonably linear in the range of test voltages. A comparison of different voltage-to-delay converters is given in Table 4.1. The popular architecture of current-starved inverters is used to design the voltage-controlled delay cells, one each for voltages close to GND and close to VDD, as shown in Fig. 7.3a and 7.3b respectively. The cell of Fig. 7.3a is used at both the places shown in Fig. 7.1 since the common mode of 400 mV is closer to GND than VDD. The cell of Fig. 7.3b is used to monitor an internal bias voltage node in the RSSI block of Fig. 7.1 (not shown explicitly to avoid clutter), which is expected to be at 800 mV.

Transistors M6 of Fig. 7.3a and M4 of Fig. 7.3b are tied to the power rails to make sure that the delay does not blow up for test voltages close to either power rail, and also to improve the linearity of the delay cell. Transistors M7 and M8 of both cells help improve the slew rate and make the edges more immune to noise.

Figure 7.1: Block diagram of power-scalable receiver. Courtesy: Kaushik Ghosal. (Blocks shown: Antenna, LNA, Mixer with I and Q paths, Filter, VGA, ADC, RSSI, RISI, PLL, VCO and two sampling heads (SpH), with Signal Strength Dependent Controls, Interference Strength Dependent Controls and the output To Digital Baseband.)

Figure 7.2: Sampling head (SpH)

Figure 7.3: Architecture of voltage controlled delay cells. (a) Voltage controlled delay cell - for voltages close to GND. (b) Voltage controlled delay cell - for voltages close to VDD. (Each schematic shows transistors M0 to M9 with the Vin, Probe clock In and Probe clock Out terminals.)

The above mentioned delay cells are employed in the system as shown in Fig. 7.4. The pair of delay cells converts the voltage difference (Vin - Vref) to a delay difference between the pair of delayed clock outputs. To measure this delay difference accurately, both the delayed clock outputs are sampled by another clock, whose frequency is slightly less than that of the probe clock. As a result, the outputs of the pair of flip-flops are a pair of sub-sampled signals whose frequency is the difference between that of the probe and sampling clocks. The delay difference set up by the pair of delay cells is thereby expanded and can be measured using an up/down counter. For the purposes of this tape-out, the pair of sub-sampled signals is brought out through pins and will be analyzed off-chip using an FPGA. Also, to correct for the possible non-linearity of the delay cells, a provision is made for calibration prior to actual testing by providing an analog multiplexer to select between the calibration voltage and the test voltage.

As shown in Fig. 7.4, the extra pins needed for this testing procedure are as follows:

• Calibration Voltage (Vcal)

• Probe Clock

• Sampling Clock

• Pair of output beat signals

• Multiplexer select signals

Figure 7.4: Block Diagram of the BIST Setup (S: Sub-sampled signal). (Blocks and signals shown: two V2D cells with Vin and Clkout terminals, two D flip-flops (DFF) with outputs S1 and S2, Vref, Vtest, Vcal, the Cal/Test select, Probe Clock, Sampling Clock and Clock Enable.)

However, it is to be noted that calibration voltage and probe clock inputs need to

come from outside. Sampling clock can be generated from the probe clock using

a PLL as described in Chapter 5. The pair of output beat signals feed into the

on-chip DMU, described in Chapter 4, the output of which is read out via the

digital test infrastructure. The multiplexer select signals need to be given from

the BIST control unit. The total area of the BIST setup shown in Fig. 7.4 is

about 60 µm × 23 µm.


7.3 Simulation Results

The die micrograph along with the layout snapshot of the implemented BIST block is shown in Fig. 7.5. The plots of differential delay versus differential voltage for the delay cells of Fig. 7.3a and Fig. 7.3b are shown in Fig. 7.6. The plots clearly reveal that the circuit of Fig. 7.3a gives higher sensitivity for voltages near GND and that of Fig. 7.3b for voltages near VDD. A sample of the variation of this delay across process and temperature is also shown in the plots of Fig. 7.6.

Figure 7.5: Die micrograph of the power scalable receiver implementation with the layout snapshots of BIST blocks inserted

7.4 Conclusions

Implementation of a BIST scheme to observe the common mode voltage of analog circuitry in a test chip of a power-scalable receiver fabricated in the UMC 130 nm process is described in this chapter. The design of the voltage-to-delay cells and control circuitry is described along with simulation results from the Cadence environment. The voltage-to-delay cell described here is better than the one used earlier, as is clear from the simulation results.


Figure 7.6: Simulation Results. (a) Plot of difference delay versus differential voltage. (b) Plot of difference delay versus differential voltage. (Axes in both panels: Vin (V) versus Delay Difference (ns). Curves: 27°C TT, 27°C FNSP, 127°C TT, 127°C FNSP.)

Chapter 8

Conclusions

An approach to analog testing in an SoC environment that generalizes well across different classes of analog circuits and offers concurrent testing is still an open problem. The availability of processing power, especially digital processing, can be leveraged to design low cost test strategies. In this thesis, a method of enabling BIST for analog IPs in an SoC setting is developed. The main goal of the solution is to move towards an all-digital approach so as to benefit from technology scaling.

8.1 Contributions

In order to meet the said goals, a simple low cost ‘digitizer’ is developed instead

of a full blown ADC. This ‘digitizer’ is composed of two parts - a sampling head

(SpH) to convert test voltage to delay on a pair of low frequency clock signals,

and a DMU to measure the delay thus setup. Owing to the simplicity and less

area overhead of the SpH, multiple test points could be observed by replicating

the SpH at each test node. Therefore, the sampling heads give rise to as many

low frequency clock pairs as there are test nodes. A multiplexer selects the pair

of low frequency signals to be fed to the DMU based on the test node to be

tested/monitored. A key feature of such an approach is that the test voltage is

always connected to the SpH, thereby avoiding insertion of switches in the signal

path which can potentially degrade system performance.

The sub-sampling approach of delay measurement, introduced in Chapter 3 and applied to the problem of measuring analog voltages in Chapter 4, requires a probe clock as a carrier of the delay and a sampling clock (of slightly different frequency) to sample this clock pair carrying the delay between them. A strategy of generating the sampling clock from the probe clock, so as to minimize the number of pins that connect to the tester, is described in Chapter 5 along with the derivation of design parameters. A technique of speeding up such a measurement using multiple phases of a clock is described in Chapter 6.

8.2 Scope for future work

The technique of measuring low-frequency analog voltages described in this thesis offers very low area overhead, is all-digital in nature and therefore benefits from further technology scaling. It also provides a very fine test resolution of about a millivolt, as demonstrated by measured results from a test chip. Also, the technique offers a trade-off between measurement time and resolution achieved, and can therefore potentially be sped up for quicker and coarser testing. Although these features make the proposed technique promising for deep sub-micron CMOS processes, it has a few limitations, which are presented next.

Production testing of integrated chips needs techniques to quickly determine if the chip is a ‘pass’ or ‘fail’. Every millisecond spent on the tester to make this decision costs money, as pointed out in Section 1.2. Although a technique of speeding up measurements was described in Chapter 6, testing one node after another serially as described in this thesis may be costly in such situations. A work-around for this limitation can be to use multiple control units, like the one shown in Fig. 4.1, controlling different sets of test nodes; this is similar to the approach of using multiple scan chains to speed up digital circuit testing while avoiding additional pin overheads [93].

Another limitation is that the dynamic range of analog voltages available for testing is limited by the linearity of the V2D cell used. As it is hard to get V2D cells to behave linearly over a wide range of voltages, different V2D designs have to be adopted based on the range of the particular test node. One way of mitigating this non-linearity is calibration, as suggested in the thesis. But such a calibration would need a few analog voltages to be generated. It would be desirable to generate those voltages in a digital manner, by controlling the duty cycle of a clock for example, or by similar approaches. Such an on-chip technique would reduce the burden on the tester to provide these calibration voltages.

Another solution to the said problem of non-linearity of the V2D cell can be to quickly check if the test delay corresponding to the test voltage is in the range of, say, (dlow, dhigh), where dlow and dhigh are the delays that correspond to voltages Vlow and Vhigh respectively. Such a method, which can be called a Go/No-go test, circumvents the non-linearity of the V2D cell. However, the voltages Vlow and Vhigh need to be generated or given by the tester.

The technique based on sub-sampling presented in the absence of sample-and-

hold circuitry works well for near-DC signals and does not suit high frequency

signals directly. In order to test high frequency signals, they can be directly sub-

sampled if the signal is periodic as in [17]. Otherwise, the information in the

signal needs to be converted to a DC signal as in [5] or impressed on to a periodic

signal. Such a scheme can be extended to characterize the frequency response of

DUTs.

Although a behavioral model is developed for the system and analyzed to obtain the design parameters for best performance, the jitter in the clocks, which directly affects the quantization levels, is not modeled in this thesis. Modeling the jitter would help in better understanding the gap between analytical and measured results, and would also enable the designer to come up with acceptable jitter numbers for such techniques.

Appendix A

Unbiased Delay Estimator

In this appendix, we substantiate the claim that the asynchronous sub-sampling approach described in Section 4.4.1 leads to an unbiased estimate of delay.

Let $T_1$, $T_2$ be the times within a clock period when the clock pair Dai, Dbi cross the logic high threshold and let $T_s$ be the time when the sampling clock crosses the sampling threshold. Due to jitter, these are random variables. Without loss of generality, let the mean of $T_1$ be zero. The mean of $T_2$ is $\delta$, the quantity to be estimated.

Let

\[ \tilde{T}_2 = T_2 - \delta \]

and let

\[ T_s = t_s + \tilde{T}_s \tag{A.1} \]

where $t_s$ is the mean value of $T_s$, and $\tilde{T}_s$ is the random component.

It is of interest to determine the probability that the samplers sample a logic high. A sampler samples a logic high if the clock edge occurs earlier than the sampling edge. Hence,

\[ P(q_1 = 1) = P(T_1 < T_s) = P(T_1 - \tilde{T}_s < t_s) \tag{A.2} \]

Let $Z_1 = T_1 - \tilde{T}_s$. Let $\Phi_1(\cdot)$ be the CDF of $Z_1$. From (A.2),

\[ P(q_1 = 1) = P(Z_1 < t_s) = \Phi_1(t_s) \tag{A.3} \]

Let $Z_2 = \tilde{T}_2 - \tilde{T}_s$. Let $\Phi_2(\cdot)$ be the CDF of $Z_2$. Then,

\[ P(q_2 = 1) = P(\tilde{T}_2 + \delta < T_s) = P(Z_2 < t_s - \delta) \tag{A.4} \]

\[ = \Phi_2(t_s - \delta) \tag{A.5} \]

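The output of the delay measurement unit is given as

\[ S = \frac{1}{2^k} \sum_{i=1}^{2^k} X_i \tag{A.6} \]

with $X_i = q^i_1 - q^i_2$, the difference of the ith samples. Hence,

\[ X_i = \begin{cases} 1 & \text{if } q^i_1 = 1 \text{ and } q^i_2 = 0 \\ 0 & \text{if } q^i_1 = q^i_2 \\ -1 & \text{if } q^i_1 = 0 \text{ and } q^i_2 = 1 \end{cases} \tag{A.7} \]

The expectation of $X_i$ is given by

\[ E[X_i] = P(q^i_1 = 1, q^i_2 = 0) - P(q^i_1 = 0, q^i_2 = 1) = \Phi_1(t^i_s) - \Phi_2(t^i_s - \delta) \tag{A.8} \]

assuming $q^i_1$ and $q^i_2$ are independent of one another.

The variance of $X_i$ is given by

\[ \mathrm{var}[X_i] = P(q^i_1 = 1, q^i_2 = 0) + P(q^i_1 = 0, q^i_2 = 1) - (E[X_i])^2 = \Phi_1(t^i_s)(1 - \Phi_1(t^i_s)) + \Phi_2(t^i_s - \delta)(1 - \Phi_2(t^i_s - \delta)) \le \frac{1}{2} \tag{A.9} \]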

Let the clock period be T and the sampling clock period be T + ∆T, where T = N∆T + α, with N an integer and 0 < α < ∆T. This causes the sampling edge to fall uniformly across the entire period of the sampled signal to create one beat period. Let the measurement be taken over M beat periods, so that $MN = 2^k$.


Hence, (A.6) can be rewritten as:

S =1

MN

∑j

∑k

Xjk (A.10)

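Let $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_M)$ be the starting phases in each beat period. Then

$$E_{S|\alpha}[S \mid \alpha] = \frac{1}{MN} \sum_j \sum_k E[X_{jk}(\alpha_j + k \Delta T)] \tag{A.11}$$

Substituting from (A.8), applying the law of iterated expectation¹ and reordering the summation, we get

$$E[S] = E_\alpha\big[E_{S|\alpha}[S \mid \alpha]\big] = \frac{1}{N} \sum_k \frac{1}{M} \sum_j E\big[\Phi_1(\alpha_j + k \Delta T) - \Phi_2(\alpha_j + k \Delta T - \delta)\big] \tag{A.12}$$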

Since the $\alpha_j$ are uniform over $0$ to $\Delta T$ (with PDF $\frac{1}{\Delta T}$), the inner expectation is identical for each $j$ and can be evaluated as the following integral:

$$E[S] = \frac{1}{N} \sum_k \frac{1}{\Delta T} \int_{k \Delta T}^{(k+1) \Delta T} \big[\Phi_1(t) - \Phi_2(t - \delta)\big]\, dt \tag{A.13}$$

The above summation can be replaced by an integral over the entire clock period $T$, as follows:

$$E[S] = \frac{1}{N \Delta T} \int_{\langle T \rangle} \big[\Phi_1(t) - \Phi_2(t - \delta)\big]\, dt \tag{A.14}$$

However, if we assume that the skew $\delta$ and the jitter of the clocks are small compared to the period $T$, then the limits of the integration can be replaced by $\pm\infty$, and noting that $N \Delta T \approx T$, the integral becomes

$$E[S] = \frac{1}{T} \int_{-\infty}^{\infty} \big[\Phi_1(t) - \Phi_2(t - \delta)\big]\, dt \tag{A.15}$$

But if the skew $\delta$ and/or the jitter $\sigma$ is not a small fraction of $T$, then (A.15) yields an under-estimate of $\delta$. In general, evaluating this integral is difficult. However, in this particular case, we can resort to the trick of 'differentiating under the integral sign' (Leibniz integral rule) [95]:

$$\frac{d}{dx}\left(\int_a^b f(x, t)\, dt\right) = \int_a^b \frac{\partial}{\partial x} f(x, t)\, dt \tag{A.16}$$

¹ $E[X] = E_Y\big[E_{X|Y}[X \mid Y]\big]$ [94]

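Taking $x = \delta$ and $f = \Phi_1(t) - \Phi_2(t - \delta)$, from (A.15) and (A.16), we have

$$\frac{dE[S]}{d\delta} = \frac{1}{T} \int_{-\infty}^{\infty} \Phi_2'(t - \delta)\, dt = \frac{1}{T} \tag{A.17}$$

where the last equality holds because the term inside the integral is a PDF and integrates to unity. Integrating with respect to $\delta$, and noting that $E[S] = 0$ at $\delta = 0$ because $Z_1$ and $Z_2$ both have zero mean, we get

$$E[S] = \frac{\delta}{T} \tag{A.18}$$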
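The result (A.18) can be checked numerically by integrating (A.15) directly. The sketch below does this with a midpoint rule, assuming Gaussian CDFs for $Z_1$ and $Z_2$ (an assumption made only for this example) and hypothetical picosecond values. The two spreads are deliberately chosen to be different to show that only the mean offset $\delta$ matters.

```python
import math


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma (assumed jitter model)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def expected_output(delta, period, sigma1, sigma2, steps=20000):
    """Midpoint-rule estimate of (A.15).

    The infinite limits are truncated 10 standard deviations beyond both edges,
    where the integrand Phi_1(t) - Phi_2(t - delta) is negligibly small.
    """
    margin = 10.0 * max(sigma1, sigma2)
    lo = min(0.0, delta) - margin
    hi = max(0.0, delta) + margin
    h = (hi - lo) / steps
    total = 0.0
    for n in range(steps):
        t = lo + (n + 0.5) * h
        total += gaussian_cdf(t, sigma1) - gaussian_cdf(t - delta, sigma2)
    return total * h / period


# Hypothetical values in ps: 25 ps skew, 1000 ps clock period.
estimate = expected_output(delta=25.0, period=1000.0, sigma1=5.0, sigma2=8.0)
print(estimate, 25.0 / 1000.0)
```

Under these assumptions the numerical value agrees with $\delta / T = 0.025$ to within the integration error.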

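The variance of the estimator $S$, treating the $X_i$ as mutually independent, is given by

$$\operatorname{var}[S] = \frac{1}{2^{2k}} \sum_{i=1}^{2^k} \operatorname{var}[X_i] \tag{A.19}$$

From (A.9),

$$\operatorname{var}[S] \le \frac{1}{2^{2k}}\, 2^k \times \frac{1}{2} = \frac{1}{2^{k+1}} \tag{A.20}$$

The standard deviation of the estimator is therefore bounded by

$$\sigma[S] \le \frac{1}{\sqrt{2^{k+1}}} \tag{A.21}$$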
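As a worked use of (A.21), the sketch below finds the smallest $k$ for which the bound meets a target timing resolution. It relies on one added assumption: since (A.18) gives $E[S] = \delta / T$, the skew estimate is taken as $T \cdot S$, so its standard deviation is bounded by $T / \sqrt{2^{k+1}}$. The function names and numbers are hypothetical.

```python
import math


def std_bound(k):
    """Upper bound on sigma[S] from (A.21) for 2**k samples."""
    return 1.0 / math.sqrt(2 ** (k + 1))


def min_k_for_resolution(target_sigma, period):
    """Smallest k such that period * std_bound(k) <= target_sigma.

    Assumes the skew estimate is period * S (see the lead-in).
    """
    k = 0
    while period * std_bound(k) > target_sigma:
        k += 1
    return k


# Hypothetical target: 1 ps standard deviation with a 1000 ps clock period.
k_needed = min_k_for_resolution(target_sigma=1.0, period=1000.0)
print(k_needed, 2 ** k_needed)
```

Because (A.21) is a worst-case bound, the number of samples found this way is conservative; the actual spread can be smaller when most samples fall far from the clock edges.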

Appendix B

Noise in Inverter Chain

As described in Chapter 4, current starved inverters are employed as the voltage-

controlled-delay elements. The voltage information is thus encoded in the delay

of the clock signal passing through the element. Thus, jitter accumulated on the

clock signal while propagating through the delay element manifests eventually as

noise in voltage domain.

The variance of jitter added by a delay element is given by [96]

$$\sigma_j^2 = \frac{4 k T \gamma_N\, t_d}{I_N (V_{DD} - V_t)} \tag{B.1}$$

where

$\sigma_j$ Standard deviation of jitter

$k$ Boltzmann's constant

$T$ Temperature in Kelvin

$\gamma_N$ Thermal noise coefficient of the NMOS transistor

$t_d$ Propagation delay of inverter

$I_N$ Inverter current

$V_{DD}$ Supply voltage

$V_t$ Threshold voltage
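For a sense of scale, the sketch below evaluates (B.1) numerically. All device values are hypothetical placeholders chosen for illustration (including $\gamma_N = 2/3$, the textbook long-channel value) and are not measurements from this work.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def jitter_variance(temp_k, gamma_n, t_d, i_n, v_dd, v_t):
    """Jitter variance of one delay element according to (B.1), in seconds squared."""
    return 4.0 * BOLTZMANN * temp_k * gamma_n * t_d / (i_n * (v_dd - v_t))


# Hypothetical operating point: 300 K, 100 ps delay, 100 uA, 1.2 V supply, 0.4 V threshold.
var_j = jitter_variance(temp_k=300.0, gamma_n=2.0 / 3.0, t_d=100e-12,
                        i_n=100e-6, v_dd=1.2, v_t=0.4)
sigma_j = math.sqrt(var_j)
print(sigma_j)  # on the order of 0.1 ps for these assumed values
```

Note that the variance is linear in $t_d$: doubling the propagation delay at a fixed current doubles $\sigma_j^2$.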

Let $\sigma_j^2 = \beta^2 t_d$. Suppose in a series of $m$ inverters, one is selected to provide differential delay while the others nominally do not provide differential delay. Let $\tau_1$ and $\tau_2$ be the delays seen by the pair of clocks. Let $D$ and $d$ respectively be the maximum and minimum delay of each delay cell and let $\xi_j \sim N(0, 1)$ be unit-variance zero-mean Gaussian random variables. Then,

$$\begin{aligned}
\tau_1 &= D + (m - 1)d + \beta \sqrt{D}\, \xi_1 + \beta \sqrt{(m - 1)d}\, \xi_2 && \text{(B.2)} \\
&= D + (m - 1)d + \beta \sqrt{D + (m - 1)d}\, \xi_3 && \text{(B.3)} \\
\tau_2 &= md + \beta \sqrt{md}\, \xi_4 && \text{(B.4)}
\end{aligned}$$

The quantity of interest is

$$\begin{aligned}
\tau = \tau_1 - \tau_2 &= D - d + \beta \sqrt{D + (m - 1)d + md}\, \xi_5 && \text{(B.5)} \\
&= D - d + \beta \sqrt{D - d + 2md}\, \xi_5 && \text{(B.6)}
\end{aligned}$$

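The signal-to-noise ratio of this setup is given by

$$\begin{aligned}
\mathrm{SNR} &= \frac{(D - d)^2}{\beta^2 (D - d + 2md)} && \text{(B.7)} \\
&= \frac{D - d}{\beta^2} \left( \frac{1}{1 + \frac{2md}{D - d}} \right) && \text{(B.8)} \\
&\approx \frac{D - d}{\beta^2} \left( 1 - \frac{2md}{D - d} \right) && \text{(B.9)}
\end{aligned}$$

where the approximation in (B.9) holds for $2md \ll D - d$. This clearly shows that SNR decreases as $m$ increases.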
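The dependence of SNR on the chain length $m$ can be tabulated with a short sketch of (B.7) and (B.9). The delay values and $\beta^2$ below are hypothetical and expressed in picoseconds; the function names are illustrative only.

```python
def snr_exact(D, d, m, beta_sq):
    """SNR from (B.7): signal (D - d)^2 over noise variance beta^2 (D - d + 2 m d)."""
    return (D - d) ** 2 / (beta_sq * (D - d + 2 * m * d))


def snr_first_order(D, d, m, beta_sq):
    """First-order approximation (B.9), reasonable only when 2 m d << D - d."""
    return (D - d) / beta_sq * (1 - 2 * m * d / (D - d))


# Hypothetical values: D = 500 ps, d = 20 ps, beta^2 = 1.4e-4 ps.
D, d, beta_sq = 500.0, 20.0, 1.4e-4
exact = [snr_exact(D, d, m, beta_sq) for m in range(1, 9)]
approx = [snr_first_order(D, d, m, beta_sq) for m in range(1, 9)]
for m, (e, a) in enumerate(zip(exact, approx), start=1):
    print(m, round(e), round(a))
```

For these assumed values the exact SNR falls monotonically with $m$, and the first-order expression increasingly underestimates it as $2md$ becomes comparable to $D - d$.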

References

[1] P. K. Das, “Precise on-chip clock skew measurement using subsampling and

applications,” Ph.D. dissertation, Indian Institute of Science, Bangalore,

Karnataka, India, February 2012. xiv, 29

[2] P. Kabisatpathy, A. Barua, and S. Sinha, Fault Diagnosis of Analog Inte-

grated Circuits. Springer, 2005. 1

[3] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for Digi-

tal, Memory and Mixed-Signal VLSI Circuits. Kluwer Academic Publishers,

2000. 1, 2, 6, 13

[4] M. Abramovici, M. A. Breuer, and A. D. Friedman, Digital Systems Testing

and Testable Design. IEEE Press, 2001. 2

[5] G. Banerjee, M. Behera, M. A. Zeidan, R. Chen, and K. Barnett, “Analog/rf

built-in-self-test subsystem for a mobile broadcast video receiver in 65-nm

cmos,” IEEE Journal of Solid-State Circuits, vol. 46, no. 9, pp. 1998–2008,

September 2011. 2, 6, 9, 93

[6] G. W. Roberts and S. Aouini, “Mixed-signal production test: A measure-

ment principle perspective,” IEEE Design and Test of Computers Magazine,

vol. 26, pp. 48–62, September/October 2009. 3, 5, 6, 12

[7] S. Sindia, “High sensitivity signatures for test and diagnosis of analog, mixed-

signal and radio-frequency circuits,” Ph.D. dissertation, Auburn University,

August 2013. 5

[8] W. M. Lindermier, “Design of robust test criteria in analog testing,” in

ICCAD Digest of Technical Papers, 1996, pp. 604–611. 6


[9] S. Sunter and N. Nagi, “Test metrics for analog parametric faults,” in Pro-

ceedings of the 17th IEEE VLSI Test Symposium, 1999, pp. 226–234. 6

[10] N. B. Hamida and B. Kaminska, “Multiple fault analog circuit testing by

sensitivity analysis,” Journal of Electronic Testing: Theory and Applications

- Joint special issue on analog and mixed-signal testing, vol. 4, no. 4, pp. 331–

343, November 1993. 6

[11] Z. Guo and J. Savir, “Analog circuit test using transfer function coefficient

estimates,” in Proceedings of the IEEE International Test Conference, 2003,

pp. 1155–1163. 6

[12] B. G. Streetman and S. Banerjee, Solid State Electronic Devices. PHI

Learning, 2009. 6

[13] C.-L. Wey and S. Krishnan, “Built-in self-test (bist) structure for analog

circuit fault diagnosis,” IEEE Transactions on Instrumentation and Mea-

surement, vol. 39, no. 3, pp. 517–521, June 1990. 7

[14] D. Vazquez, J. L. Huertas, and A. Rueda, “Reducing the impact of dft on

the performance of analog integrated circuits: improved sw-op amp design,”

in Proceedings of the 14th VLSI Test Symposium, 1996, pp. 42–47. 7

[15] L. S. Milor, “A tutorial introduction to research on analog and mixed-signal

circuit testing,” IEEE Transactions on Circuits and Systems II: Analog and

Digital Signal Processing, vol. 45, no. 10, pp. 1389–1407, October 1998. 8, 9

[16] D. D. Venuto and M. J. Ohletz, “On-chip test for mixed-signal asics using

two-mode comparators with bias-programmable reference voltages,” Journal

of Electronic Testing: Theory and Applications, vol. 17, no. 3-4, pp. 243–253,

June 2001. 8

[17] Y. Zheng and K. L. Shepard, “On-chip oscilloscopes for noninvasive time-

domain measurement of waveforms in digital integrated circuits,” IEEE

Transactions on Very Large Scale Integration (VLSI) Systems, vol. 11, no. 3,

pp. 336–344, June 2003. 8, 30, 93


[18] R. Ho, B. Amrutur, K. Mai, B. Wilburn, T. Mori, and M. Horowitz, “Appli-

cations of on-chip samplers for test and measurement of integrated circuits,”

in Symposium on VLSI Circuits Digest of Technical Papers, 1998, pp. 138–

139. 8, 31

[19] A. Chatterjee, B. C. Kim, and N. Nagi, “Dc built-in self-test for linear analog

circuits,” IEEE Design & Test of Computers, vol. 13, no. 2, pp. 26–33,

Summer 1996. 8

[20] M. Negreiros, “Low cost bist techniques for linear and non-linear analog

circuits,” Ph.D. dissertation, Universidade Federal Do Rio Grande Do Sul,

July 2005. 9, 11

[21] M. Slamani and B. Kaminska, “Multifrequency analysis of faults in analog

circuits,” IEEE Design & Test of Computers, vol. 12, no. 2, pp. 70–80,

Summer 1995. [Online]. Available: http://dx.doi.org/10.1109/54.386008 9

[22] M. J. Ohletz, “Hybrid built-in self-test (hbist) for mixed analog/digital in-

tegrated circuits,” in European Test Symposium, 1991. 9

[23] B. Provost and E. Sanchez-Sinencio, “On-chip ramp generators for mixed-

signal bist and adc self-test,” IEEE Journal of Solid-State Circuits, vol. 38,

no. 2, pp. 263–273, February 2003. 9

[24] S. Sasho and M. Shibata, “Multi-output one-digitizer measurement,” in Pro-

ceedings IEEE International Test Conference 1998, Washington, DC, USA,

October 18-22, 1998. IEEE Computer Society, 1998, p. 258. 9

[25] S. Callegari, F. Pareschi, G. Setti, and M. Soma, “Complex oscillation-based

test and its application to analog filters,” IEEE Transactions on Circuits and

Systems I: Regular Papers, vol. 57, no. 5, pp. 956–969, May 2010. 9, 11

[26] J. M. da Silva and J. S. Matos, “Evaluation of iDD/vout cross-correlation

for mixed current-voltage testing of analogue and mixed-signal circuits,” in

Proceedings of the European Design and Test Conference, 1996, pp. 264–268.

9


[27] M. M. Hafed, N. Abaskharoun, and G. W. Roberts, “A 4-ghz effective sample

rate integrated test core for analog and mixed-signal circuits,” IEEE Journal

of Solid-State Circuits, vol. 37, no. 4, pp. 499–514, April 2002. 10, 11

[28] M. Lubaszewski, S. Mir, A. Rueda, and J. L. Huertas, “Concurrent error

detection in analog and mixed-signal integrated circuits,” in Proceedings of

the 38th Midwest Symposium on Circuits and Systems, 1995, pp. 1151–1156.

11

[29] E. F. Cota, M. Negreiros, L. Carro, and M. Lubaszewski, “A new adaptive

analog test and diagnosis system,” IEEE Transactions on Instrumentation

and Measurement, vol. 49, no. 2, pp. 223–227, April 2000. 11

[30] A. Chatterjee, “Concurrent error detection and fault-tolerance in linear ana-

log circuits using continuous checksums,” IEEE Transactions on Very Large

Scale Integration (TVLSI) Systems, vol. 1, no. 2, pp. 138–150, June 1993.

11

[31] M. Mahoney, DSP-Based Testing of Analog and Mixed-Signal Circuits. Los

Alamitos, California: IEEE Computer Society Press, 1987. 11

[32] M. Negreiros, L. Carro, and A. A. Susin, “Testing analog circuits using

spectral analysis,” Microelectronics Journal, vol. 34, no. 10, pp. 937–944,

October 2003. 11

[33] J. Doernberg, H.-S. Lee, and D. A. Hodges, “Full-speed testing of a/d con-

verters,” IEEE Journal of Solid-State Circuits, vol. SC-19, no. 6, December

1984. 12, 48, 73

[34] L. Jin, H. Haggag, R. Geiger, and D. Chen, “Testing of precision dacs using

low-resolution adcs with dithering,” in Proceedings of the International Test

Conference, 2006, pp. 1–10. 12

[35] S. Aouini, “Extending test signal generation using sigma-delta encoding be-

yond the voltage/amplitude domain,” Ph.D. dissertation, McGill University,

Montreal, August 2011. 12


[36] K. Ghosal, T. Anand, V. Chatrvedi, and B. Amrutur, “A power-scalable rf

cmos receiver for 2.4 ghz wireless sensor network applications,” in 12th IEEE

International Conference on Electronics, Circuits and Systems (ICECS) Di-

gest of Technical Papers, 2012, pp. 161–164. 12, 84

[37] B. Razavi, RF Microelectronics. Prentice Hall, 2012. 12, 13, 14

[38] J. D. Kraus, Radio Astronomy. McGraw-Hill, 1966. 13

[39] M. J. Burbidge, A. Lechner, G. Bell, and A. M. D. Richardson, “Motivations

towards bist and dft for embedded charge-pump phase-locked loop frequency

synthesisers,” IEE Proceedings on Circuits, Devices and Systems, vol. 151,

no. 4, pp. 337–348, August 2004.

[40] C. Weinraub, “Analog built-in self-test module,” Patent US7 327 153 B2, 02

05, 2008. [Online]. Available: http://www.google.com/patents/US7327153

13

[41] D. Lupea, U. Pursche, and H.-J. Jentschel, “Rf-bist: loopback spectral signa-

ture analysis,” in Proceedings of the Design, Automation and Test in Europe

Conference and Exhibition, 2003, pp. 478–483. 13

[42] A. Halder, S. Bhattacharya, G. Srinivasan, and A. Chatterjee, “A system-

level alternate test approach for specification test of rf transceivers in loop-

back mode,” in Proceedings of the 18th International Conference on VLSI

Design, 2005, pp. 289–294. 13

[43] M. Negreiros, L. Carro, and A. A. Susin, “Low cost analog testing of rf

signal paths,” in Proceedings of the Design, Automation and Test in Europe

Conference and Exhibition, 2004, pp. 292–297. 14

[44] ——, “Noise figure evaluation using low cost bist,” in Proceedings of the

Design, Automation and Test in Europe Conference, 2005, pp. 158–163. 14

[45] L. Gammaitoni, P. Hanggi, P. Jung, and F. Marchesoni, “Stochastic

resonance,” Rev. Mod. Phys., vol. 70, pp. 223–287, Jan 1998. [Online].

Available: http://link.aps.org/doi/10.1103/RevModPhys.70.223 14


[46] R. B. Staszewski, K. Muhammad, D. Leipold, C.-M. Hung, Y.-C. Ho, J. L.

Wallberg, C. Fernando, K. Maggio, R. Staszewski, T. Jung, J. Koh, S. John,

I. Y. Deng, V. Sarda, O. Moreira-Tamayo, V. Mayega, R. Katz, O. Friedman,

O. E. Eliezer, E. de Obaldia, and P. T. Balsara, “All-digital tx frequency

synthesizer and discrete-time receiver for bluetooth radio in 130-nm cmos,”

IEEE Journal of Solid-State Circuits, vol. 39, no. 12, pp. 2278–2291, 2004.

15

[47] A. Agnes, E. Bonizzoni, P. Malcovati, and F. Maloberti, “A 9.4-enob 1v

3.8µw 100ks/s sar adc with time domain comparator,” in ISSCC Digest of

Technical Papers, 2008, pp. 246–247,610. 15

[48] S.-K. Lee, S.-J. Park, H.-J. Park, and J.-Y. Sim, “A 21 fj/conversion-step

100 ks/s 10-bit adc with a low-noise time-domain comparator for low-power

sensor interface,” IEEE Journal of Solid-State Circuits, vol. 46, no. 3, March

2011. 15

[49] G. Li, Y. M. Tousi, A. Hassibi, and E. Afshari, “Delay-line-based analog-to-

digital converters,” IEEE Transactions on Circuits and Systems-II: Express

Briefs, vol. 56, no. 6, pp. 464–470, June 2009. 15, 20, 22, 32

[50] T. Watanabe, T. Mizuno, and Y. Makino, “An all-digital analog-to-digital

converter with 12µv/lsb using moving-average filtering,” IEEE Transactions

on Circuits and Systems-II: Express Briefs, vol. 38, no. 1, pp. 120–125, April

2004. 15, 20

[51] J. Bergs, Design of a VCO based ADC in a 180 nm CMOS Process for use

in Positron Emission Tomography. Master’s thesis, 2010. 15

[52] M. Z. Straayer and M. H. Perrott, “A 12-bit, 10-mhz bandwidth, continuous-

time σδ adc with a 5-bit, 950-ms/s vco-based quantizer,” IEEE Journal of

Solid-State Circuits, vol. 43, no. 4, pp. 805–814, April 2008. 15, 32

[53] J. Xiao, A. V. Peterchev, J. Zhang, and S. R. Sanders, “A 4-µa quiescent-

current dual-mode digitally controlled buck converter ic for cellular phone

applications,” IEEE Journal of Solid-State Circuits, vol. 39, no. 12, pp. 2342–

2348, December 2004. 15


[54] R. K. Dash and H. Parthasarathy, “Low dropout regulator testing system

and device,” Patent US 8 054 057 B2, 11 08, 2011. [Online]. Available:

http://www.google.com/patents/US8054057 16

[55] R. Vasudevamurthy, P. K. Das, and B. Amrutur, “A mostly-digital analog

scan-out chain for low bandwidth voltage measurement for analog ip test,”

in ISCAS Digest of Papers, 2011, pp. 2035–2038. 16

[56] C. S. Taillefer and G. W. Roberts, “Delta-sigma a/d conversion via time-

mode signal processing,” IEEE Transactions on Circuits and Systems-I: Reg-

ular Papers, vol. 56, no. 9, pp. 1908–1920, September 2009. 16, 54, 79

[57] D. I. Porat, “Review of sub-nanosecond time-interval measurements,” IEEE

Transactions on Nuclear Science, vol. 20, no. 5, pp. 36–51, October 1973. 19

[58] J. Kostamovaara, S. Kurtti, and J.-P. Jansson, “A receiver - tdc chip set

for accurate pulsed time-of-flight laser ranging,” in CDNLive! EMEA 2012

Conference Proceedings, 2012, pp. 1–6. 19

[59] B. K. Swann, B. J. Blalock, L. G. Clonts, D. M. Blinkley, J. M. Rochelle,

E. Breeding, and K. M. Baldwin, “A 100-ps time-resolution cmos time-to-

digital converter for positron emission tomography imaging applications,”

IEEE Journal of Solid-State Circuits, vol. 39, no. 11, pp. 1839–1852, Novem-

ber 2004. 19

[60] V. Kratyuk, P. K. Hanumolu, K. Ok, U.-K. Moon, and K. Mayaram, “A

digital pll with a stochastic time-to-digital converter,” IEEE Transactions

on Circuits and Systems-I: Regular Papers, vol. 56, no. 8, pp. 1612–1621,

August 2009. 20

[61] P. K. Das and B. Amrutur, “An accurate fractional period delay generation

system,” IEEE Transactions on Instrumentation and Measurement, vol. 61,

no. 7, pp. 1924–1932, July 2012. 20

[62] B. Amrutur, P. K. Das, and R. Vasudevamurthy, “0.84ps resolution clock

skew measurement via sub-sampling,” IEEE Transactions on Very Large


Scale Integration (VLSI) Systems, vol. 99, pp. 1–9, November 2010. 20, 32,

44, 45, 50, 74, 78

[63] P. Chen, C.-C. Chen, C.-C. Tsai, and W.-F. Lu, “A time-to-digital converter-

based cmos smart temperature sensor,” IEEE Journal of Solid-State Cir-

cuits, vol. 40, no. 8, pp. 1642–1648, August 2005. 20

[64] D. Fick, N. Liu, Z. Foo, M. Fojtik, J. s. Seo, D. Sylvester, and D. Blaauw, “In

situ delay-slack monitor for high-performance processors using an all-digital

self-calibrating 5ps resolution time-to-digital converter,” in ISSCC Digest of

Technical Papers. IEEE, 2010, pp. 188–189. 20, 24

[65] T. Xia and J.-C. Lo, “Time-to-voltage converter for on-chip jitter measure-

ment,” IEEE Transactions on Instrumentation and Measurement, vol. 52,

no. 6, pp. 1738–1748, December 2003. 21

[66] M. A. Z. Straayer, “Noise shaping techniques for analog and time to digi-

tal converters using voltage controlled oscillators,” Ph.D. dissertation, Mas-

sachusetts Institute of Technology, June 2008. 21, 25, 26, 27, 31

[67] S. Henzler, S. Koeppe, W. Kamp, H. Mulatz, and D. Schmitt-Landsiedel,

“90nm 4.7ps-resolution 0.7-lsb single-shot precision and 19pj-per-shot lo-

cal passive interpolation time-to-digital converter with on-chip characteriza-

tion,” in Digest of Technical Papers. IEEE International Solid-State Circuits

Conference, 2008, pp. 548–549. 23

[68] R. G. Baron, “The vernier time-measuring technique,” Proceedings of the

IRE, pp. 21–30, January 1957. 23

[69] S. Sindia, F. F. Dai, and V. D. Agrawal, “All-digital replica techniques for

managing random mismatch in time-to-digital converters,” in Proceedings of

the 44th IEEE Southeastern Symposium on System Theory, 2012, pp. 1–5.

24

[70] C.-S. Hwang, P. Chen, and H.-W. Tsao, “A high-precision time-to-digital

converter using a two-level conversion scheme,” IEEE Transactions on Nu-

clear Science, vol. 51, no. 4, pp. 1349–1352, August 2004. 24


[71] M. Lee and A. A. Abidi, “A 9 b, 1.25 ps resolution coarse-fine time-to-digital

converter in 90 nm cmos that amplifies a time residue,” IEEE Journal of

Solid-State Circuits, vol. 43, no. 4, pp. 769–777, April 2008. 25

[72] J. Yu, F. F. Dai, and R. C. Jaeger, “A 12-bit vernier ring time-to-digital

converter in 0.13 µm cmos technology,” IEEE Journal of Solid-State Circuits,

vol. 45, no. 4, pp. 830–842, April 2010. 25

[73] R. J. Baker, CMOS Mixed-Signal Circuit Design. IEEE Press, 2002. 25, 54

[74] I. Nissinen, A. Mäntyniemi, and J. Kostamovaara, “A cmos time-to-digital

converter based on a ring oscillator for a laser radar,” in Proceedings of the

29th IEEE European Solid-State Circuits Conference, 2003, pp. 469–472. 26

[75] M. Z. Straayer and M. H. Perrott, “A multi-path gated ring oscillator tdc

with first-order noise shaping,” IEEE Journal of Solid-State Circuits, vol. 44,

no. 4, pp. 1089–1098, April 2009. 27

[76] R. J. V. D. Plassche, “Dynamic element matching for high-accuracy mono-

lithic d/a converters,” IEEE Journal of Solid-State Circuits, vol. 11, no. 6,

pp. 795–800, December 1976. 28

[77] R. Schreier and G. C. Temes, Understanding Delta-Sigma Data Converters.

John Wiley & Sons, 2005. 28

[78] E. G. Friedman, “Clock distribution networks in synchronous digital integrated

circuits,” Proceedings of the IEEE, vol. 89, no. 5, May 2001, pp. 665–692. 29

[79] P. K. Das, B. Amrutur, J. Sridhar, and V. Visvanathan, “On-chip clock

network skew measurement using sub-sampling,” in IEEE ASSCC Digest of

Technical Papers, November 2008, pp. 401–404. 30, 33

[80] T. Hashida and M. Nagata, “An on-chip waveform capturer and application

to diagnosis of power delivery in soc integration,” IEEE Journal of Solid-

State Circuits, vol. 46, no. 4, April 2011. 31

[81] A. S. Morris, Measurement and Instrumentation Principles. Butterworth-

Heinemann, 2001. 37


[82] H. Y. Yang and R. Sarpeshkar, “A time-based energy-efficient analog-to-

digital converter,” IEEE Journal of Solid-State Circuits, vol. 40, no. 8, pp.

1590–1601, August 2005. 37

[83] C. Taillefer, “Analog-to-digital conversion via time-mode signal processing,”

Ph.D. dissertation, McGill University, Montreal, August 2007. 37, 39

[84] S. Song and V. Stojanovic, “A 6.25 gb/s voltage-time conversion based frac-

tionally spaced linear receive equalizer for mesochronous high-speed links,”

IEEE Journal of Solid-State Circuits, vol. 46, no. 5, pp. 1183–1197, May

2011. 37, 39

[85] H. Pekau, A. Yousif, and J. W. Haslett, “A cmos integrated linear voltage-

to-pulse-delay-time converter for time based analog-to-digital converters,” in

ISCAS Digest of Technical Papers. IEEE, 2006, pp. 2373–2376. 37

[86] A. R. Macpherson, K. A. Townsend, and J. W. Haslett, “A 5gs/s voltage-

to-time converter in 90nm cmos,” in 4th European Microwave Integrated

Circuits Conference, 2009, pp. 254–257. 37, 39

[87] X. Inc., Xilinx UG190 Virtex 5 FPGA User Guide. Xilinx, 2006. 63

[88] M. Mansuri and C.-K. K. Yang, “Jitter optimization based on phase-locked

loop design parameters,” IEEE Journal of Solid-State Circuits, vol. 37,

no. 11, pp. 1375–1382, November 2002.

[89] L.-M. Lee, D. Weinlader, and C.-K. K. Yang, “A sub10-ps multiphase

sampling system using redundancy,” IEEE Journal of Solid-State Circuits,

vol. 41, no. 1, pp. 265–273, September 2006. 73

[90] J. A. Bucklew, Introduction to Rare Event Simulation. Springer, 2004. 74

[91] R. G. Lyons, Understanding Digital Signal Processing. Pearson, 2011. 81

[92] S. Dwivedi, “Low power receiver architecture and algorithms for low data

rate wireless personal area networks,” Ph.D. dissertation, Indian Institute of

Science, Bangalore, Karnataka, India, December 2010. 84


[93] K.-J. Lee, J.-J. Chen, and C.-H. Huang, “Using a single input to

support multiple scan chains,” in Proceedings of the 1998 IEEE/ACM

international conference on Computer-aided design, ser. ICCAD ’98.

New York, NY, USA: ACM, 1998, pp. 74–78. [Online]. Available:

http://doi.acm.org/10.1145/288548.288563 92

[94] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic

Processes. Tata McGraw - Hill Education, 2002. 96

[95] R. P. Feynman, Surely You’re Joking Mr. Feynman. Random House, 1992.

97

[96] A. A. Abidi, “Phase noise and jitter in cmos ring oscillators,” IEEE Journal

of Solid-State Circuits, vol. 41, no. 8, pp. 1803–1816, August 2006. 98
