High Speed VLSI Circuit Simulator - Walter Scott, Jr...
High-Speed VLSI Circuit Simulator Final Report
Spring Semester 2014
Prepared to partially fulfill the requirements for ECE402
By
Qi Chen
Pan Zhang
Department of Electrical and Computer Engineering
Colorado State University
Fort Collins, Colorado 80523
Project Advisor: Prof. Sourajeet Roy
Approved by: Prof. Sourajeet Roy
Abstract
With the rapid increase in processing speed and the scaling of integrated-circuit feature sizes
below 45nm, the analysis and simulation of high-speed interconnects have become a critical
prerequisite for electrical engineers. In the past, the effects of high-speed interconnects were
negligible; today, interconnects are responsible for non-ideal effects such as signal delay and
attenuation, which can render circuits inoperable. Interconnect analysis becomes even more
important with new technologies such as carbon-based interconnects. This project aims to create
a general-purpose circuit simulator capable of computer-aided design (CAD) of high-speed
interconnects that will allow us not only to explore the current CAD paradigms but also to test
novel and advanced strategies capable of far greater accuracy and computational efficiency.
The procedure of this project includes understanding the advanced mathematics and numerical
algorithms behind circuit simulation, the stability, accuracy and computational costs of these
algorithms, computer programming skills to efficiently execute these algorithms, as well as the
skill to use commercial circuit solver tools for validation of these algorithms.
This project will provide important insight into the methods of simulating high-speed
copper and carbon nano-tube interconnects that are located either on-chip or at the PCB level.
The project will study the mathematical representation of interconnects, which originates as
partial differential equations. This means that there is no exact solution in the time domain, and
the corresponding approximations result in very high CPU cost. Thus, there is a need for an
accurate yet fast interconnect solver. It can be used by circuit designers to further understand
high-speed interconnects and their effect on signal degradation. The project will address the
current computational constraints of complex high-speed interconnect networks by exploring
model order reduction methods. The ability to do so will contribute to developing an efficient and
robust solver that will help change the current state of the art of circuit simulation.
Table of Contents
Abstract 2
Table of Contents 3
I. Introduction 5
II. Review of Previous Work 6
III. Applied Engineering Methods 7
A. Equivalent Circuit Formulation 7
B. Mathematical Models 8
B.1 Frequency Domain Representation 9
B.2 Time Domain Representation 10
B.3 Interconnect Modeling 10
C. Engineering Tools 12
IV. Objectives and Constraints 13
A. Goals 13
A.1 Fall Semester Objectives 13
A.2 Spring Semester Objectives 13
B. Technical Performance Measurements 13
B.1 Accuracy 14
B.2 Computation Time 14
C. Risks and Constraints 14
V. Testing and Evaluation Plan 15
A. Frequency Domain 15
B. Time Domain 16
VI. Current Design Process 17
A. Items Completed 17
B. Test Simulation Results 18
B.1 Frequency Domain Simulation Results 18
B.2 Time Domain Simulation Results 20
C. Re-Evaluation and Simulation Improvement 21
C.1 Frequency Domain Simulation Improvement 22
C.2 Time Domain Un-Coupled Simulation Improvement 23
C.3 Time Domain Coupled Simulation Improvement 24
VII. Model Order Reduction 25
A. Methods and Algorithms 25
B. MOR Implemented Simulation Results 27
VIII. User Manual 29
A. Input Format 29
A.1 Frequency Domain Input Format 29
A.2 Time Domain Input Format 30
A.3 Coupled Model Input Format 30
B. Compiling and Extracting Results 32
IX. Conclusion and Future Work 33
A. Future Development 33
A.1 Parallel Simulation 33
A.2 User Interface plus Feedback 33
A.3 Carbon Nano Tube Characterization 33
B. Conclusion 34
X. References 35
XI. Appendix A – Stamps 36
XII. Appendix B – Abbreviations 40
XIII. Appendix C - Budget 41
XIV. Appendix D – LAPACK Code & Matlab Script 42
XV. Appendix E – Timeline 46
List of Figures
Figure 1: Interconnect Hierarchy
Figure 2: HSI distortions
Figure 3: Sample Circuit Formulation
Figure 4: Resistor Stamp
Figure 5: Inductor Stamp
Figure 6: Backward Euler Method
Figure 7: Trapezoidal Method
Figure 8: Telegrapher’s Equations
Figure 9: Lumped Model Derivation
Figure 10: model for Single-conductor transmission line
Figure 11: model for Multi-conductor transmission line
Figure 12: Test Template 1
Figure 13: Test Template 2
Figure 14: Test Template 3
Figure 15: Magnitude Comparison
Figure 16: Phase Comparison
Figure 17: V1 for d=0.5cm
Figure 18: CPU cost vs Length
Figure 19: V2 for d=0.5cm
Figure 20: Improved Frequency Domain Result V3 Magnitude
Figure 21: Improved Frequency Domain Result V3 Phase
Figure 22: Improved Time Domain Result V3
Figure 23: Improved Time Domain Coupled Result V2
Figure 24: MOR based on congruent transformation
Figure 25: MOR Frequency Domain V3 Magnitude
Figure 26: MOR Frequency Domain V3 Phase
Figure 27: MOR Time Domain V3
Figure 28: Frequency Domain CPU Time Reduction with MOR
Figure 29: Time Domain CPU Time Reduction with MOR
Figure 30: Frequency Domain Sample Input
Figure 31: Time Domain Sample Input
Figure 32: Coupled Time Domain Sample Input
Figure 33: Waveform Relaxation Partitioning
List of Tables
Table 1: Current Progress
I. INTRODUCTION
In the field of Very Large Scale Integration (VLSI) and integrated circuits,
interconnects are the structures that provide the electrical connections between elements
and allow signals to propagate. Interconnects vary with the distance the electrical signal travels,
such as board-to-board or chip-to-chip, and can range from simple copper wires to advanced
carbon nano-tubes and transmission lines. [6]
Figure 1-Interconnect Hierarchy [6]
Over the past decades, the effects of VLSI high-speed interconnects have become impossible
to ignore in the design of high-speed integrated systems. Current state-of-the-art VLSI
circuits have very high density and can operate at multi-gigahertz clock rates, hence the term
high-speed. The rapid increase in operating frequencies degrades not only the accuracy of
circuit modeling and simulation, but also physical optimization. Therefore, merely decreasing
the size of the chip and its components is not enough to increase circuit performance; attention
must also be paid to the signal distortion and delay caused by interconnects. The current
challenges of simulating interconnect effects can be attributed to the distributed model, which
depends on the interconnect dimensions, as well as to the tradeoff between computational cost
and accurate results. [6]
Figure 2-HSI distortions [6]
The goal of the High-Speed VLSI Simulator (HSVS) is to develop a VLSI circuit solver
that can be used not only as a general circuit simulator, but also as a tool to simulate and analyze
the effects of high-speed interconnects in an attempt to address the current predicament and
industry need. The procedure of this project includes understanding the advanced mathematics
and numerical algorithms behind circuit simulation, the stability, accuracy and computational
costs of these algorithms, computer programming skills to efficiently execute these algorithms,
as well as the skill to use commercial circuit solver tools for validation of these algorithms. This
project will provide important insight into the methods of simulating high-speed copper and
carbon nano-tube interconnects that are located either on-chip or at the PCB level. It can be used
by circuit designers to further understand high-speed interconnects and their effect on
signal degradation. The project will address the current computational constraints of complex
high-speed interconnect networks by exploring model order reduction (MOR) methods and the
parallel waveform relaxation method. The ability to do so will contribute to developing an
efficient and robust solver that will help change the current state of the art of circuit simulation.
This report describes the ongoing work and accomplished tasks in the fall 2014
semester as well as the plan for future work. Chapter II briefly reviews previous work on the
simulation of interconnects and their impact, from researchers and industry leaders. Chapter III
describes in detail the applied engineering methods that are implemented in this
solver. Chapter IV…
II. REVIEW OF PREVIOUS WORK
Prior to delving into software development for the VLSI interconnect solver, a thorough
investigation of previous work is needed to strengthen the understanding of the engineering
problem and methods to deliver the solution. This chapter will briefly describe approaches used
by researchers that laid the foundation for interconnect analysis. A significant constraint that
hinders the timely and accurate analysis of high-speed interconnect is the nature of its distributed
model.
In traditional VLSI circuit solvers, circuit schematics are translated into an
equivalent mathematical representation through a method called Modified Nodal Analysis
(MNA), or stamping. Each circuit element has a unique stamp derived from MNA, and the stamps
combine into a system of ordinary differential equations that can be solved in both the time and
frequency domains. This fundamental method is suitable for HSVS and will be
discussed in detail in later chapters.
However, unlike the lumped circuit model, where circuits are represented by
lumped elements such as resistors, capacitors, and inductors that depend only on time, the
distributed model of interconnects depends not only on time but also on geometry and shape.
Mathematically, the distributed model is characterized by the Telegrapher’s partial differential
equations, which create a time and frequency domain mismatch. This means that the frequency
domain solution does not have an exact counterpart in the time domain. To resolve this
dilemma, numerical integration methods are needed to approximate a
solution.
General VLSI circuit simulators such as Cadence and HSPICE already have the
capability to simulate VLSI circuits with interconnects. The circuit representation is handed to
the simulator, and from there transformed into a mathematical model, through a net-list.
The net-list details the specifications of the circuit, such as each element and its value, as well as
the requirements for the simulation. The net-list effectively provides a user interface for the
program, and is used as a general guideline for the user interface of HSVS.
III. APPLIED ENGINEERING METHODS
A. Equivalent Circuit Formulation
The HSVS design is geared towards a general VLSI circuit solver, which means that
correctly transforming the net-list circuit representation into its equivalent mathematical
representation is the first and foremost step. The method used in equivalent circuit formulation is
called stamping. The stamp method is based on Kirchhoff’s Current Law (KCL), where
each KCL-generated equation is organized in matrix form for a circuit network.
Figure 3-Sample Circuit Formulation [5]
Figure 3 depicts a sample equivalent circuit formulation of a resistive network using
KCL. The stamp of the equivalent circuit is reflected in the left-hand-side matrix.
Figure 4-Resistor Stamp [4]
Each circuit element has its own equivalent circuit stamp. Figure 4 shows the stamp for
the resistor. However, ordinary nodal analysis, even though it is sufficient for most circuit
elements, is not suitable for the stamps of elements such as the inductor and the independent
voltage source, due to complications in solving the differential equations in the frequency domain.
HSVS uses modified nodal analysis (MNA), which introduces a new current variable for the
stamp of the inductor and of the independent voltage source, adding a new row and column to
the left-hand-side matrix.
Figure 5-Inductor Stamp [4]
The complete stamps used in HSVS are shown in Appendix A.
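As a rough illustration of how stamping fills the matrices, the sketch below adds a resistor stamp and an MNA inductor stamp to dense matrices. The function names, the dense `Matrix` type, the node numbering (node 0 is ground) and the sign convention of the inductor branch equation are assumptions made for this example; the stamps actually used in HSVS are the ones in Appendix A.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // dense, row-major (assumption)

// Resistor R between nodes a and b. Node 0 is ground and has no row or column;
// node k (k >= 1) maps to matrix index k - 1.
void stampResistor(Matrix& G, std::size_t a, std::size_t b, double R) {
    const double g = 1.0 / R;
    if (a != 0) G[a - 1][a - 1] += g;
    if (b != 0) G[b - 1][b - 1] += g;
    if (a != 0 && b != 0) {
        G[a - 1][b - 1] -= g;
        G[b - 1][a - 1] -= g;
    }
}

// Inductor L between nodes a and b using MNA. `branch` is the matrix index of
// the extra unknown (the inductor current, flowing from a to b).
// KCL rows:   +i at node a, -i at node b.
// Branch row: v_a - v_b - L di/dt = 0.
void stampInductor(Matrix& G, Matrix& C, std::size_t a, std::size_t b,
                   std::size_t branch, double L) {
    if (a != 0) { G[a - 1][branch] += 1.0; G[branch][a - 1] += 1.0; }
    if (b != 0) { G[b - 1][branch] -= 1.0; G[branch][b - 1] -= 1.0; }
    C[branch][branch] -= L;
}

// Example: G matrix of a divider with R1 from node 1 to node 2
// and R2 from node 2 to ground.
Matrix dividerConductance(double R1, double R2) {
    Matrix G(2, std::vector<double>(2, 0.0));
    stampResistor(G, 1, 2, R1);
    stampResistor(G, 2, 0, R2);
    return G;
}
```

Each element only touches the rows and columns of its own nodes, which is why a net-list can be processed one line at a time.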
B. Mathematical Models
The mathematical representations of the equivalent circuit, after transforming the
elements into their respective stamps, can be summarized as follows.
B.1 Frequency Domain Representation
In frequency domain simulation, the equivalent circuit is represented by the Laplace-domain
form of its ordinary differential equations,
\[ G x + s C x = B \tag{1} \]
\[ s = j 2 \pi f \]
where s is the Laplace-domain equivalent of the time derivative d/dt, G and C are the
left-hand-side matrices, and B is the right-hand side. The equation can be solved exactly in the
frequency domain by matrix inversion. For large matrices, the inversion can be done by LU
decomposition followed by forward and backward substitution. The LU decomposition is
performed by an existing LAPACK routine on the system matrix
\[ A_n = G + s C \]
The computational cost associated with the LU decomposition is \(O(n^a)\), where n is
the size of the matrices and a is between 1.5 and 2.
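To make the solve step concrete, the sketch below forms G + sC at one frequency and solves the system by Gaussian elimination with partial pivoting, the dense counterpart of an LU factorization followed by forward and backward substitution. It is a self-contained stand-in and not the LAPACK call used in HSVS; the type aliases, function names and the RC test circuit are assumptions for illustration.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using cd = std::complex<double>;
using CVector = std::vector<cd>;
using CMatrix = std::vector<CVector>;

// Solve A x = b by Gaussian elimination with partial pivoting (dense).
CVector solveDense(CMatrix A, CVector b) {
    const std::size_t n = A.size();
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t p = k;  // row holding the largest pivot candidate
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::abs(A[i][k]) > std::abs(A[p][k])) p = i;
        if (std::abs(A[p][k]) == 0.0) throw std::runtime_error("singular matrix");
        std::swap(A[k], A[p]);
        std::swap(b[k], b[p]);
        for (std::size_t i = k + 1; i < n; ++i) {
            const cd m = A[i][k] / A[k][k];
            for (std::size_t j = k; j < n; ++j) A[i][j] -= m * A[k][j];
            b[i] -= m * b[k];
        }
    }
    CVector x(n);
    for (std::size_t i = n; i-- > 0;) {  // backward substitution
        cd sum = b[i];
        for (std::size_t j = i + 1; j < n; ++j) sum -= A[i][j] * x[j];
        x[i] = sum / A[i][i];
    }
    return x;
}

// Solve (G + sC) x = B at frequency f in Hz, with s = j*2*pi*f.
CVector solveAtFrequency(const CMatrix& G, const CMatrix& C, const CVector& B, double f) {
    const cd s(0.0, 2.0 * std::acos(-1.0) * f);
    CMatrix A = G;
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < A[i].size(); ++j) A[i][j] += s * C[i][j];
    return solveDense(A, B);
}

// Example: 1 V source at node 1, R from node 1 to node 2, Cap from node 2 to ground.
// Unknowns are [v1, v2, i_source]; the function returns |v2|.
double rcLowPassMagnitude(double R, double Cap, double f) {
    const double g = 1.0 / R;
    const CMatrix G = {{g, -g, 1.0}, {-g, g, 0.0}, {1.0, 0.0, 0.0}};
    CMatrix C(3, CVector(3, cd(0.0, 0.0)));
    C[1][1] = Cap;
    const CVector B = {0.0, 0.0, 1.0};
    return std::abs(solveAtFrequency(G, C, B, f)[1]);
}
```

Sweeping f over the band of interest and calling `solveAtFrequency` once per point gives the magnitude and phase curves that are compared against HSPICE.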
B.2 Time Domain Representation
In time domain simulation, the equivalent circuit can be expressed as the ordinary
differential equation
\[ G x + C \frac{dx}{dt} = B \tag{2} \]
The time domain results are obtained with numerical integration methods. For solving
electrical networks, linear multi-step numerical integration methods are more
suitable than others for approximating a solution.
Figure 6-Backward Euler method [5]
Figure 7-Trapezoidal method [5]
The linear multi-step numerical integration methods used in HSVS are implicit methods,
namely the Backward Euler and Trapezoidal methods shown in Figures 6 and 7.
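The two update rules can be sketched on a scalar instance of equation 2. The formulas in the comments are the standard textbook forms of the two methods, not a transcription of Figures 6 and 7; the function names, the constant source B and the fixed step h are assumptions made to keep the example short.

```cpp
#include <cmath>

// Scalar instance of G*x + C*dx/dt = B with constant B and x(0) = x0.
// Backward Euler: (G + C/h) x[n+1] = B + (C/h) x[n]
double backwardEuler(double G, double C, double B, double x0, double h, int steps) {
    double x = x0;
    for (int n = 0; n < steps; ++n) x = (B + (C / h) * x) / (G + C / h);
    return x;
}

// Trapezoidal: (G + 2C/h) x[n+1] = B[n+1] + B[n] + (2C/h - G) x[n]
double trapezoidal(double G, double C, double B, double x0, double h, int steps) {
    double x = x0;
    for (int n = 0; n < steps; ++n)
        x = (2.0 * B + (2.0 * C / h - G) * x) / (G + 2.0 * C / h);
    return x;
}

// Closed-form solution for constant B, used only as a reference.
double exactSolution(double G, double C, double B, double x0, double t) {
    const double xInf = B / G;
    return xInf + (x0 - xInf) * std::exp(-G * t / C);
}
```

For a matrix system the divisions become linear solves with the constant matrices G + C/h or G + 2C/h, so with a fixed step the factorization can be computed once and reused at every time point.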
B.3 Interconnect Modeling
In order to simulate interconnects in a circuit network, appropriate electrical models are
required to characterize the distributed nature of interconnects. Many models exist in the
electrical engineering literature for characterizing interconnects. For the design of
HSVS, the Quasi-Transverse Electromagnetic (TEM) model is used, since “the
approximation is valid for most practical interconnect structures and offers relative ease and low
CPU cost compared to full wave approaches”. [6]
Using the quasi-TEM distributed model, the voltages and currents of single-conductor and,
most importantly, multi-conductor interconnects are expressed by the Telegrapher’s equations.
Figure 8-Telegrapher's equation [6]
Figure 8 shows the Telegrapher’s equations for a multi-conductor interconnect network in
both the time and frequency domains. However, since they are partial differential
equations, numerical techniques are needed to convert the distributed model into ordinary
differential equations. The conventional lumped model is a linear ordinary differential equation
model derived from the distributed model of the Telegrapher’s equations. In HSVS, the lumped
model is used for single-conductor frequency and time domain simulations, as well as
multi-conductor simulations with inductive and capacitive coupling.
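A minimal sketch of the lumping step, assuming a uniform single-conductor line described by per-unit-length parameters: the line is cut into n identical sections, and each section receives the per-unit-length value multiplied by the section length. The struct and function names are hypothetical, and the exact section topology is the one given in Figures 9 and 10, which this example does not reproduce.

```cpp
#include <cstddef>
#include <stdexcept>

// Per-unit-length parameters of a single-conductor line (units are assumptions).
struct LineParams {
    double R;  // ohm/m
    double L;  // H/m
    double G;  // S/m
    double C;  // F/m
};

// Element values of one lumped section.
struct Section {
    double R, L, G, C;
};

// Split a line of length d (metres) into n identical sections of length d/n.
Section lumpedSection(const LineParams& pul, double d, std::size_t n) {
    if (n == 0) throw std::invalid_argument("n must be at least 1");
    const double dx = d / static_cast<double>(n);
    return Section{pul.R * dx, pul.L * dx, pul.G * dx, pul.C * dx};
}
```

Each section adds its own nodes and branch current to the MNA system, so the matrix size grows in proportion to n.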
Figure 9-Lumped Model Derivation [6]
Figure 10- model for single-conductor transmission line
Figure 11- model for multi-conductor transmission line [6]
C. Engineering Tools
The development of the HSVS requires a multitude of engineering tools to write and test
the programs.
Primary tools:
CSU CRAY Supercomputer
C++ programming platform and GNU C++ compiler
HSPICE for result reference
Secondary tools:
Matlab
LAPACK routine for matrix operations
IV. OBJECTIVES AND CONSTRAINTS
A. Goals
The goal of the senior design project is to develop a robust VLSI circuit solver that is
able to simulate single- and multi-conductor interconnects in both the time and frequency
domains with sufficient accuracy and speed. The success of this project is judged by testing:
the accuracy of the simulation results is compared against an industry-standard reference.
The following section describes the steps necessary to achieve the
expectations by the end of the fall 2014 semester, as well as the plan of action for the spring
2015 semester.
A.1 Fall semester objectives
Introduction to C++ programming language and HSPICE solver infrastructure and
CRAY high performance computing platform
Create a program that can generate an equivalent circuit from any given interconnect
layout
Create a program that can translate the input equivalent circuit to a mathematical
model in the form of the ordinary differential equations (ODEs) shown in equation 2
Solve the equation in the frequency domain by using LU decomposition
Solve equation 2 in the time domain using the numerical integration methods backward
Euler and trapezoidal rule
Validation of the frequency and time domain simulations using results from HSPICE
A.2 Spring semester objectives
Investigate the accuracy, CPU time, and stability of these
integration methods
Address the computation time and memory costs for large and complex interconnect
networks via parallel simulation using waveform relaxation algorithm and/or model
order reduction method
B. Technical Performance Measurements
As mentioned in previous chapters, the simulation results from HSVS are validated
through comparison with HSPICE, and the validation can be characterized
by the following technical performance measurements (TPMs).
B.1 Accuracy
The accuracy of the simulation in the frequency and time domains can be assessed visually
by plotting the results from HSVS and HSPICE together. For frequency domain comparison,
both magnitude and phase results need to be considered. Beyond
visual observation, the accuracy is quantified, more importantly, by the error norm
\[ e = \frac{1}{N} \sqrt{\sum_{i=1}^{N} |S_i - H_i|^2} \tag{3} \]
where N is the number of sample points, and S and H represent the node voltages
from the solver and HSPICE respectively. The performance specification of the solver requires
the error norm to be less than 5%. This allows the number of cascaded sections n, used when
converting the interconnect layout to its mathematical representation, to be optimized, since n is
expected to be the main factor determining accuracy. Thus, by
knowing the error norm, n can be increased accordingly to meet the performance spec.
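Equation 3 translates directly into a few lines of code. In this sketch the two result sets are assumed to be sampled at the same N points and stored as complex node voltages; the function name `errorNorm` is hypothetical.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <vector>

// e = (1/N) * sqrt( sum_i |S_i - H_i|^2 ), equation (3).
double errorNorm(const std::vector<std::complex<double>>& S,
                 const std::vector<std::complex<double>>& H) {
    if (S.size() != H.size() || S.empty())
        throw std::invalid_argument("S and H must be non-empty and equally long");
    double sum = 0.0;
    for (std::size_t i = 0; i < S.size(); ++i)
        sum += std::norm(S[i] - H[i]);  // std::norm returns |.|^2
    return std::sqrt(sum) / static_cast<double>(S.size());
}
```

For time domain results the same function applies with the imaginary parts set to zero.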
B.2 Computation Time
The computation time, also known as CPU time, is a TPM that is characterized for
optimization. The CPU time cannot be compared directly with the computation time of
HSPICE, but it provides a baseline for increasing the speed of simulation.
\[ n = \frac{20\, d \sqrt{L C}}{T_r} \tag{4} \]
Equation 4 gives the number of cascaded sections n. Increasing n is expected to increase
the accuracy of the solver, but at the same time the
computation time also increases. Thus, the evaluation of Tr (rise time) and d
(interconnect length) is necessary to optimize the computational cost. The computational cost of
frequency analysis of interconnect networks is expected to be \(O(n^a)\), where a ranges from 1.5 to
2. So, as a validation method, n can be varied over a number of simulations to tabulate the
respective a. The factor ‘a’ can be compared with the expected value to show the solver’s
performance.
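The sketch below evaluates equation 4 and estimates the exponent a from measured run times by a least-squares fit of log(time) against log(n). Treating L and C as per-unit-length inductance and capacitance in SI units, rounding n up to a whole section, and both function names are assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Equation (4): n = 20 * d * sqrt(L*C) / Tr, rounded up to a whole section.
// d in metres, L in H/m, C in F/m, Tr in seconds (units are an assumption).
std::size_t sectionCount(double d, double L, double C, double Tr) {
    const double n = 20.0 * d * std::sqrt(L * C) / Tr;
    return static_cast<std::size_t>(std::ceil(n));
}

// Least-squares slope of log(time) against log(n): the exponent a in time ~ n^a.
double fitExponent(const std::vector<double>& n, const std::vector<double>& time) {
    if (n.size() != time.size() || n.size() < 2)
        throw std::invalid_argument("need at least two (n, time) pairs");
    const double m = static_cast<double>(n.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < n.size(); ++i) {
        const double x = std::log(n[i]);
        const double y = std::log(time[i]);
        sx += x;
        sy += y;
        sxx += x * x;
        sxy += x * y;
    }
    return (m * sxy - sx * sy) / (m * sxx - sx * sx);
}
```

A fitted exponent inside the expected 1.5 to 2 range would support the cost model; a value well above it would point to a dense operation somewhere in the solve path.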
C. Risks and Constraints
The potential risks of this project, besides the technical difficulties of the code, consist of
the following:
1. One of the challenges is that the per-unit-length parameters of carbon nano-tubes
are difficult to obtain. Although we can use per-unit-length values from
test cases in published literature or an analytic approximation, the challenge then shifts
to mathematical proficiency.
2. Another challenge is the solver’s robustness in terms of stability, CPU time, and
accuracy. The integration method chosen determines the stability, CPU
time, and accuracy characteristics. Therefore we need to configure the system so that
the user has the flexibility to choose the appropriate combination of these
characteristics for his/her simulation.
For this particular project, the structural procedure of creating a VLSI solver is the main
framework and therefore cannot be changed. However, two alternatives are available to
address the computing time problem for large interconnect networks: parallel simulation
using the waveform relaxation algorithm, and the model order reduction method.
Parallel simulation using the waveform relaxation algorithm can reduce the computation time of
multi-level interconnect simulation, but it has a convergence issue that will require an
investigation of hybrid iteration methods. This plan requires coding mathematically rigorous
algorithms and can be very challenging, but nonetheless very rewarding. The model order
reduction method is the other route, if we choose it or if the first option fails. This method
reduces the number of unknown variables, and thus the computation time.
However, it trades away some simulation accuracy. [8]
V. Testing and Evaluation Plan
The test and evaluation plan for this software-based project is organized by the
performance requirements of each deliverable. The fall semester deliverable
consists of the general interconnect simulator with the ability to perform frequency domain and
time domain analysis. The spring semester deliverable will primarily consist of the investigation
of a model order reduction module to improve the performance of the existing interconnect solver.
In order to test the performance of the solver in a reliable and timely fashion, good
reference results are necessary. HSPICE is a commercial circuit simulator and an industry
standard. [3] The testing procedure will compare the results of HSPICE
and this project and quantify them in terms of accuracy and CPU computation time. The detailed
testing and evaluation plan is as follows. [9]
A. Frequency Domain
Prior to the comparison with HSPICE, the code needs to be validated through debugging of
the modules used as well as the interactions between them. The debugging procedures ensure
correct operation from input to output. In order to compare the simulation results of an
interconnect circuit with the simulation results from HSPICE, a test schematic is needed, such as
the one shown in Figure 12.
Figure 12-Testing template 1
Figure 13-Test template 2
Compare accuracy using the L2 norm of the error, and optimize accordingly
Measure CPU time using C++ functions
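One way to take the CPU time measurement with standard C++ is `std::clock()`, which reports processor time rather than wall-clock time. The report does not say which functions HSVS uses, so this choice, the helper name `cpuSeconds` and the dummy workload are assumptions for illustration.

```cpp
#include <ctime>

// CPU seconds consumed by the calling process while running `work`.
// std::clock() counts processor time, not wall-clock time.
template <typename Work>
double cpuSeconds(Work work) {
    const std::clock_t start = std::clock();
    work();
    const std::clock_t stop = std::clock();
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}

// Example workload (assumption): a dummy loop standing in for one simulation run.
double dummyWorkload(int iterations) {
    volatile double acc = 0.0;  // volatile keeps the loop from being optimized away
    for (int i = 0; i < iterations; ++i) acc = acc + 1.0 / (1.0 + i);
    return acc;
}
```

A call such as `cpuSeconds([] { dummyWorkload(100000); })` returns the elapsed CPU seconds; repeating a run several times and averaging reduces the effect of the clock's coarse resolution.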
B. Time Domain
The time domain analysis mainly focuses on using the Backward Euler and
Trapezoidal integration methods to approximate the time domain differential equations,
which are the mathematical representation of the interconnect circuit in the
time domain, by a linear system at each time step. Since the program allows either of the two
integration methods to be selected for the time domain simulation, the performance of both
should be examined separately.
Figure 14 shows the multi-conductor testing schematic primarily designed for time domain
simulations.
Figure 14-Testing Template 3
Compare accuracy using the L2 norm of the error, and optimize accordingly
Measure CPU time using C++ functions
VI. CURRENT DESIGN PROCESS
A. Items Completed
Currently, at the end of the fall 2014 semester, the expected progress of designing the
HSVS has been listed in Appendix E, and the details of current progress can be summarized by
the following table.
Table 1-Current Progress
Completed Items:
- C++ module for equivalent circuit transformation from input net-list
- C++ module for transforming input interconnect component to equivalent circuit
  representation
- C++ function to call LAPACK function to perform complex matrix LU decomposition and
  forward, backward substitution
- C++ code to solve circuit in Time Domain
- C++ code to solve circuit in Frequency Domain
- HSPICE results comparison via Matlab modules and HSPICE-Matlab toolbox
Delayed Progress:
- Inaccurate simulation results in frequency domain simulations have delayed progress to
  optimize the CPU time and accuracy
- Accuracy calculations, due to errors in data collection from HSPICE
The above table lists the items completed this fall against the scheduled expectations,
as well as the setbacks that delayed progress.
The results from the current design of HSVS are presented in the next
section, where the analysis of both the successful and the failed trials validates the work and
justifies the tasks for future improvement.
B. Initial Test Simulation Results
B.1 Frequency Domain Simulation Results
The frequency domain simulation started by simulating test template number 2 in
Figure 13, which gave the following results.
Figures 15 and 16 show the frequency domain result as magnitude and phase
comparisons. The blue curves denote the HSPICE result, whereas the red curves show the
results of HSVS. Clearly, the result is not as accurate as expected, given the noise in
the phase comparison and the spike in the magnitude. In fact, this was the simulation result from
the second trial of testing, which actually reduced the amount of noise from the first trial shown in
Appendix F. The failed trials, although they exposed flaws in the program, can point us in the
right direction to increase the accuracy in the future.
Figure 15-Magnitude Comparison
Figure 16-Phase Comparison
B.2 Time Domain Simulation Results
The following time domain simulations reflect the data gathered from simulating test
template number 3. The outputs are extracted at the nodes at the ends of the first and second
interconnects, labeled V1 and V2. The length of the interconnects, in this case, is 0.5cm. The
complete set of simulations includes d=0.5cm, d=2cm, and d=10cm, and is used for the initial
characterization of CPU complexity. As before, the red curve represents the HSVS result and the
blue curve represents the HSPICE result.
Figure 17-V1 for d=0.5cm
Figure 18-CPU cost vs Length
Visually comparing the two simulated voltage outputs, several things
can be concluded. Accuracy is a challenge when it comes to the coupled
interconnect effects shown in Figure 19. The magnitude of V2 is significantly less than the one
from HSPICE, despite the resemblance of the coupled transient effects. The time domain result
comparison of V1, on the other hand, is very promising. The result overlaps the HSPICE result
most of the time, with little deficiency. The initial computation time can be observed in
Figure 18, where the computation time increases linearly with the
interconnect length, which is expected.
Overall, the time domain results reflect the current capabilities of the simulator. With
future improvements and optimizations of accuracy and CPU time using the methods discussed in
the previous chapters, the solver can be made more complete and robust.
Figure 19-V2 for d=0.5cm
C. Re-evaluation and Simulation Improvement
Knowing where the HSVS solver is lacking in terms of simulation accuracy, the next
step for the team is to perform an extensive re-evaluation of the code through debugging
and failure-mode inducement.
The re-evaluation of the time and frequency domain codes, for both coupled and
un-coupled interconnects, is distributed evenly between the two team members to increase
efficiency.
Several methods are used to debug the code and to induce different failure modes:
direct output of variables with breaks in case of error, and the use of Visual Studio to
detect syntax errors before or during the debugging process. In the case of segmentation faults,
which are invalid memory accesses, the process becomes more challenging.
C.1 Frequency Domain Simulation Improvement
The first change introduced in the frequency domain code is the replacement of some
size_t types by int, so that there are no signed/unsigned discrepancies
when values are passed between different loops. The accuracy of the frequency domain
simulation depends directly on the number of sections of each interconnect, represented by
the value n. Therefore, extensive testing was needed to find a value of n that gives a
sufficiently accurate model for our purpose. The tested values of n ranged from 1.5 to 5 times
the original value. This helped the team determine that 1.9 to 2 times the original number
of sections is useful for increasing the accuracy.
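The report does not show the loops that were changed, so the following is only an assumed illustration of the kind of signed/unsigned pitfall that a size_t index can introduce, here in a reverse traversal. All function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Reverse traversal with an unsigned index. The condition `i >= 0` would always be
// true for size_t, so the loop tests the index before decrementing it.
double sumReverseUnsigned(const std::vector<double>& v) {
    double sum = 0.0;
    for (std::size_t i = v.size(); i-- > 0;) sum += v[i];
    return sum;
}

// The same traversal with a signed index, the style the report switched to.
double sumReverseSigned(const std::vector<double>& v) {
    double sum = 0.0;
    for (int i = static_cast<int>(v.size()) - 1; i >= 0; --i)
        sum += v[static_cast<std::size_t>(i)];
    return sum;
}

// size_t arithmetic wraps around instead of going negative.
bool unsignedWraps() {
    const std::size_t zero = 0;
    return zero - 1 > zero;  // true: 0 - 1 becomes the largest size_t value
}
```

Either style works when written carefully; what matters is that index types are not mixed implicitly where a value can drop below zero.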
Furthermore, the function named LU, which contains the code that passes the
parameters of the final equation to perform sparse matrix inversion using LU decomposition, has
been modified and is now implemented directly in int main() instead of in a separate function
that passes variables.
The simulations, in this case, use testing module 1 in Figure 12,
which contains 6 un-coupled interconnects with a length of 10cm. The improved results, taken
across the 100 ohm resistor, are shown below.
Figure 20-Improved Frequency Domain Result V3 Magnitude
C.2 Time Domain Un-Coupled Simulation Improvement
Similar to the un-coupled frequency domain code, the un-coupled time domain program
had issues that were fixed using the same methods. The simulation is done using testing template
1 for a total time of 30ns and a trapezoidal voltage source pulse with rise time of 0.1ns. The
improved result is shown below.
Figure 21-Improved Frequency Domain Result V3 Phase
Figure 22-Improved Time Domain Result V3
C.3 Time Domain Coupled Simulation Improvement
The coupled time domain simulation has been a challenge because of its substantial
error in characterizing the inductive and capacitive coupling effects on the second interconnect of
testing module 3. The error was identified through explicit error detection methods such as
variable outputs, and long periods of memory adjustment. The error came from the part of the
program that partitions the interconnect in order to characterize the coupled effects; by
manually turning off either the inductive or the capacitive coupling, it was narrowed down to
a couple of lines of code dealing with node partitioning.
As it turns out, the code was written incorrectly: the coupling node was accessed
at the wrong place for both capacitive and inductive coupling. Fixing this drastically increased
the accuracy of the simulation.
Figure 23-Improved Time Domain Coupled Result V2
VII. MODEL ORDER REDUCTION
A. Methods and Algorithms
The model order reduction (MOR) method essentially reduces the number of unknowns, such
as the node voltages and branch currents. This method is used to significantly decrease the
computation time for both time and frequency domain analysis. The following expressions
illustrate the model order reduction approach, where X is the vector of unknown variables of
the original system and x is the vector of unknowns of the reduced system.
This type of order reduction uses a specific process called implicit moment
matching. In this case, it uses “Krylov subspace approaches to project large matrices
on its dominant eigen-space.” [7]
\[ G X + C \frac{dX}{dt} = B u \quad \text{(original solver)} \]
\[ X = Q x \quad \text{(MOR)} \]
\[ G_r x + C_r \frac{dx}{dt} = B_r u \]
\[ \dim(x) \ll \dim(X) \tag{5} \]
The above equations illustrate the basic steps of reducing the number of unknowns in a
system of equations by finding the matrix Q, which is an ortho-normalized matrix constructed
through matrix operations on the Krylov subspace matrix. The reduced order system is then
obtained by congruent transformation: X = Qx is substituted into the original equation and
each side is pre-multiplied by \(Q^T\).
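Carrying out the substitution and pre-multiplication gives reduced matrices of the form Q^T G Q, Q^T C Q and Q^T B. The sketch below computes these products with dense matrices; the `Matrix` and `Vector` aliases, the function names, and the row-major storage of Q (N rows, q columns) are assumptions for illustration.

```cpp
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // dense, row-major (assumption)

// Reduced matrix Q^T * M * Q, with M of size N x N and Q of size N x q.
Matrix congruenceTransform(const Matrix& Q, const Matrix& M) {
    const std::size_t N = Q.size();
    const std::size_t q = N == 0 ? 0 : Q[0].size();
    Matrix T(N, Vector(q, 0.0));  // T = M * Q
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t k = 0; k < N; ++k)
            for (std::size_t j = 0; j < q; ++j) T[i][j] += M[i][k] * Q[k][j];
    Matrix R(q, Vector(q, 0.0));  // R = Q^T * T
    for (std::size_t i = 0; i < q; ++i)
        for (std::size_t k = 0; k < N; ++k)
            for (std::size_t j = 0; j < q; ++j) R[i][j] += Q[k][i] * T[k][j];
    return R;
}

// Reduced right-hand side Q^T * b.
Vector projectVector(const Matrix& Q, const Vector& b) {
    const std::size_t N = Q.size();
    const std::size_t q = N == 0 ? 0 : Q[0].size();
    Vector r(q, 0.0);
    for (std::size_t k = 0; k < N; ++k)
        for (std::size_t i = 0; i < q; ++i) r[i] += Q[k][i] * b[k];
    return r;
}
```

The reduced system is only q x q, so the frequency sweep or time stepping that follows works on much smaller matrices; the full-size unknowns are recovered afterwards as X = Qx.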
Figure 24-MOR based on congruent transformation [7]
The algorithm used to compute matrix Q is called the Arnoldi algorithm, and is being
used as follows.
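\[ K = [\, k_0,\ k_1,\ k_2,\ k_3,\ \dots \,] = [\, z,\ Az,\ A^2 z,\ A^3 z,\ \dots \,] \]
\[ Q = [\, q_0 \ \ q_1 \ \ q_2 \ \ \dots \ \ q_q \,] \]
Finding q0
\[ q_0 = \frac{k_0}{\|k_0\|} \]
Finding q1
\[ v_1 = A q_0 \]
\[ v_1 = v_1 - q_0 q_0^T v_1 \]
\[ q_1 = \frac{v_1}{\|v_1\|} \]
Finding q2
\[ v_2 = A q_1 \]
\[ v_2 = v_2 - q_0 q_0^T v_2 - q_1 q_1^T v_2 \]
\[ q_2 = \frac{v_2}{\|v_2\|} \]
Finding q3
\[ v_3 = A q_2 \]
\[ v_3 = v_3 - q_0 q_0^T v_3 - q_1 q_1^T v_3 - q_2 q_2^T v_3 \]
\[ q_3 = \frac{v_3}{\|v_3\|} \]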
The Arnoldi algorithm is a recursive algorithm that is easy to implement as a function.
It can also be reapplied to the matrix Q if Q is not truly an orthonormal basis of the
Krylov matrix, in order to generate a better result from the projection back to the original
vector X. [7]
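The orthogonalization steps of the Arnoldi algorithm can be sketched in C++ as follows. This is a minimal illustration and not the HSVS implementation: it assumes a small dense matrix stored as nested vectors, it subtracts the projections one at a time (the modified Gram-Schmidt ordering, which is algebraically equivalent to subtracting them all at once), and the names Vec, Mat, dot, matvec, and arnoldi are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense, row-major: M[i][j]

// Dot product of two equally sized vectors.
double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Dense matrix-vector product y = A * x.
Vec matvec(const Mat& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) y[i] = dot(A[i], x);
    return y;
}

// Returns up to `order` orthonormal vectors q0, q1, ... spanning the Krylov
// subspace {z, Az, A^2 z, ...}. Stops early if a new direction vanishes.
std::vector<Vec> arnoldi(const Mat& A, const Vec& z, std::size_t order,
                         double tol = 1e-12) {
    assert(A.size() == z.size());
    std::vector<Vec> Q;
    Vec v = z;  // first candidate: k0 = z
    for (std::size_t k = 0; k < order; ++k) {
        // Subtract the projections onto all previously found q_j.
        for (const Vec& q : Q) {
            const double h = dot(q, v);
            for (std::size_t i = 0; i < v.size(); ++i) v[i] -= h * q[i];
        }
        const double norm = std::sqrt(dot(v, v));
        if (norm < tol) break;  // no new direction left in the subspace
        for (double& vi : v) vi /= norm;
        Q.push_back(v);           // q_k = v / ||v||
        v = matvec(A, Q.back());  // next candidate: v = A * q_k
    }
    return Q;
}
```

For example, calling arnoldi(A, z, 2) with A = diag(2, 3) and z = (1, 1) returns two mutually orthogonal unit vectors. A solver working on large circuit matrices would more likely use sparse storage and library routines instead of these dense loops.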
The results from model order reduction need to be tested and evaluated through
comparison to both the original solver results and HSPICE results.
Accuracy
o Visual observation
o Evaluate the new ‘n’ for the solver with MOR implemented, and optimize the
error norm to be less than 5% compared to HSPICE for both time and frequency
domain simulations.
CPU time
o Compare the CPU time between all three methods. Since the CPU time of the
MOR solver is expected to be lower than that of the original solver, a validation at the
beginning is necessary. Then, the new ‘n’, ‘d’, and ‘Tr’, from equation 5, of the
MOR solver can be evaluated appropriately to achieve the computational
complexity $O(p^a)$, where p is the dimension of x and ‘a’ is from 1.5 to 2.
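The 5% accuracy criterion can be checked numerically once a norm is chosen. The report does not state which error norm is used, so the sketch below assumes a relative L2 norm over waveforms sampled at the same points; the function name relativeErrorPercent is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Relative L2 error, in percent, of `result` against `reference`.
// Assumption: both waveforms are sampled at the same time or frequency points.
double relativeErrorPercent(const std::vector<double>& result,
                            const std::vector<double>& reference) {
    assert(result.size() == reference.size());
    double errSq = 0.0;  // sum of squared differences
    double refSq = 0.0;  // sum of squared reference samples
    for (std::size_t i = 0; i < result.size(); ++i) {
        const double d = result[i] - reference[i];
        errSq += d * d;
        refSq += reference[i] * reference[i];
    }
    assert(refSq > 0.0);  // an all-zero reference has no relative error
    return 100.0 * std::sqrt(errSq / refSq);
}
```

With this definition, a solver waveform would pass the criterion when relativeErrorPercent(solverSamples, hspiceSamples) is below 5.0. If the HSPICE output uses a different time grid, it would first need to be interpolated onto the solver's grid.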
B. MOR Implemented Simulation Results
MOR-based simulations were performed in both the time and frequency domains for testing module 1,
as shown in the following figures.
Figure 25-MOR Frequency Domain V3 Magnitude
Figure 26-MOR Frequency Domain V3 Phase
Figure 27-MOR Time Domain V3
Figure 28-Frequency Domain CPU Time Reduction with MOR
Figure 29-Time Domain CPU Time Reduction with MOR
It is evident that the simulation results in the time and frequency domains are similar to
those of the original solver without MOR. However, as shown in figure 26, there exist some
discrepancies in the phase of the same output as compared to HSPICE. This is due to the
inevitable trade-off between accuracy and computation time: computation time has been
greatly reduced, but at the cost of result accuracy. One way of improving the
result is to increase the user-determined rank of the Q matrix, which was initially set to 400.
Overall, the goal of using MOR to achieve a significant decrease in computation time has
been met, as shown by figures 28 and 29.
D. USER MANUAL
A. Input format
All input files for all the VLSI solvers are in .txt format. The basic circuit elements that the
solver currently supports are: R (Resistor), L (Inductor), G (Conductor, 1/R), C (Capacitor), E
(Independent voltage source), J (Current source), T (Transmission line), and P (Voltage-controlled
voltage source). This section describes the netlist-style input file format in
detail for the frequency domain, time domain, and coupled solvers.
A.1 Frequency Domain Input Format
Frequency domain HSVS is capable of reading the input file with the following circuit
elements and their respective input formats.
R Node1 Node2 Rval
L Node1 Node2 Lval
G Node1 Node2 Gval
C Node1 Node2 Cval
E Node1 Node2 Eval
J Node1 Node2 Jval
P Node1 Node2 Node3 Node4 Pval
T Node1 Node2 L’val C’val G’val R’val Tr Length
Figure 30-Frequency Domain Sample Input
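A reader for the two-terminal element lines listed above (R, L, G, C, E, and J in the frequency domain format) could look like the following sketch. It is not the HSVS parser: it assumes that node labels are integers and that values are written as plain decimal or scientific-notation numbers, and the names Element and parseTwoTerminal are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// One two-terminal element line: "<type> <node1> <node2> <value>".
struct Element {
    char type = '?';  // one of R, L, G, C, E, J
    int node1 = 0;
    int node2 = 0;
    double value = 0.0;
};

// Parses a line such as "R 1 2 50" into `out`.
// Returns false if a field is missing or the type is not a supported
// two-terminal element; `out` is left unchanged in that case.
bool parseTwoTerminal(const std::string& line, Element& out) {
    std::istringstream in(line);
    std::string type;
    Element e;
    if (!(in >> type >> e.node1 >> e.node2 >> e.value)) return false;
    if (type.size() != 1 ||
        std::string("RLGCEJ").find(type[0]) == std::string::npos) {
        return false;
    }
    e.type = type[0];
    out = e;
    return true;
}
```

Lines for P and T elements carry more fields and would need their own parsing routines. They are rejected here rather than being partially read.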
A.2 Time Domain Input Format
R Node1 Node2 Rval
L Node1 Node2 Lval
G Node1 Node2 Gval
C Node1 Node2 Cval
E Node1 Node2 Eval Tr Pw Tf Offset Tmax
T Node1 Node2 L’val C’val G’val R’val Tr Length
Figure 31-Time Domain Sample Input
A.3 Coupled Model Input Format
Lt Lt[1][1] Lt[1][2] … Lt[1][TL] Lt[2][1]…Lt[TL-1][TL]
Lt[TL][1]…Lt[TL][TL]
Ct Ct[1][1] Ct[1][2] … Ct[1][TL] Ct[2][1]...Ct[TL-1][TL]
Ct[TL][1]…Ct[TL][TL]
R Node1 Node2 Rval
L Node1 Node2 Lval
G Node1 Node2 Gval
C Node1 Node2 Cval
T Node1 Node2 G’val R’val Tr Length
E Node1 Node2 Eval Tr Pw Tf Offset Tmax (time domain)
E Node1 Node2 Eval (frequency domain)
Figure 32-Coupled Time Domain Sample Input
Notes:
Node1 and Node2 stand for the two nodes of a single element, while Node3 and Node4
refer to the two additional nodes of a dependent source.
Rval, Gval, Cval, Lval, and Eval refer to the values of resistance, conductance, capacitance,
inductance, and voltage source, respectively. L’val, C’val, G’val, and R’val refer to the
per-unit-length parameters of a transmission line.
Tr, Pw, Tf, Offset, and Tmax stand for the rise time, pulse width, fall time, offset, and total time of
the trapezoidal pulse in the time domain.
The units of Rval, Cval, Lval, Gval, Eval, Time, Length, L’val, C’val, G’val, and R’val are
ohm, F, H, mho, V, s, cm, H/cm, F/cm, mho/cm, and ohm/cm, respectively.
Lt and Ct are the per-unit-length parameter matrices for the Coupled Model only, and they are
required to be listed at the top (before T).
T is required to be listed at the bottom of the input file.
For the time domain, the VLSI solver supports a voltage source with a constant value or a trapezoidal pulse.
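The trapezoidal pulse described by Tr, Pw, Tf, and Offset can be evaluated at a given time with a piecewise-linear function. The sketch below is only one possible reading of the parameters: it assumes that Offset is a time delay before the rising edge starts, that the pulse rises from 0 to Eval, and that the function name trapezoidalPulse is hypothetical. The report does not specify these details.

```cpp
#include <cassert>
#include <cmath>

// Value at time t of a single trapezoidal pulse with amplitude `eval`.
// Assumption: `offset` is the delay before the rising edge begins and the
// baseline is 0 V. All times share one unit (s in the input format).
double trapezoidalPulse(double t, double eval, double tr, double pw,
                        double tf, double offset) {
    assert(tr >= 0.0 && pw >= 0.0 && tf >= 0.0);
    const double x = t - offset;            // time since the pulse started
    if (x <= 0.0) return 0.0;               // before the pulse
    if (x < tr) return eval * x / tr;       // rising edge
    if (x <= tr + pw) return eval;          // flat top
    if (x < tr + pw + tf) {                 // falling edge
        return eval * (1.0 - (x - tr - pw) / tf);
    }
    return 0.0;                             // after the pulse
}
```

A time domain solver would call such a function at every time step up to Tmax to fill the source entry of the right-hand-side vector.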
B. Compiling and Extracting Results
The HSVS was designed to work on Linux with CSU’s Cray supercomputer, since the Cray
provides included libraries such as the LAPACK routine package, which is the key open-source library used to
solve the matrix equations arising from the ordinary differential equations. Users who have the LAPACK routine package on their own system
do not have to run the HSVS on a Linux-based server as introduced below.
CC “time.c” or “frequency.c” →
aprun a.out →
…(wait until computation is done)… →
Output (stored in “final.txt”) →
SFTP (use a Windows SFTP client to extract the result file) →
use MATLAB to plot the results (download the HSPICE toolbox)
Results can be extracted from the designated result file called final.txt, which will appear
in the current directory after a successful simulation. Using any available SFTP client, the result
file can be transferred to a Windows environment for plotting and processing in the desired programs.
The computation time of any simulation is printed on the screen once the computation
completes.
E. CONCLUSION AND FUTURE WORK
A. Future Development
A.1 Parallel Simulation
Parallel simulation of interconnects mainly consists of using the Waveform Relaxation
method to solve coupled interconnects.
In order to correctly characterize the high-speed interconnect behavior and parallelize
the simulation at the same time, the transverse partitioning method for multi-conductor interconnects
is needed, as shown in figure 33. The circuit is then solved through waveform relaxation iteration
methods such as the Gauss-Jacobi and Hybrid Gauss-Jacobi methods. [1]
A.2 User Interface plus Feedback
As user-facing software, the HSVS needs to provide more feedback on a user-friendly
platform, allowing the user not only to customize the desired simulation but also to
review the simulation for errors that occurred internally. Once the software coding part
of the HSVS has been improved and optimized, designing and
developing a user interface application can be the next step. This interface should
allow the user to interact with the solver in real time and change parameters with ease.
This will make the HSVS more functional and, at the same time, more marketable and
appealing to users who are not electrical engineers. The intended target users, originally
engineers in chip manufacturing companies, can thereby be expanded, making the HSVS
more merchantable.
A.3 Carbon Nano Tube Characterization
If needed, the future development of the HSVS can include the ability to perform CAD of advanced
interconnects such as Carbon Nano-Tubes. To do so, a more sophisticated interconnect
model than the Lumped Model may be needed for better characterization. The CAD of
Carbon Nano-Tubes should cover lengths within the millimeter range. This
implementation would increase the marketability of the HSVS.
Figure 33 -Waveform relaxation partitioning [1]
B. Conclusion
The purpose of developing a general circuit solver with HSI CAD capabilities has been
fulfilled. The current semester serves as the preparation for the spring semester. This means that
prior to the investigation of MOR and Parallel Simulations, the HSVS needs to be optimized
according to the analysis of current results to solve circuits with appropriate accuracy and CPU
time. During the design and development of this project, background reading on
mathematical algorithms and C++ coding allowed for increased efficiency. Furthermore,
increasing familiarity with code modularity and debugging assisted the design process.
Overall, this project has been successful in terms of completion and, more importantly, in providing
valuable insights into interconnect effects for chip designers and manufacturers.
REFERENCES
[1] E. Lelarasmee, A. E. Ruehli, and A. L. Sangiovanni-Vincentelli, “The waveform
relaxation method for time-domain analysis of large-scale integrated circuits,” IEEE Trans.
CAD Integr. Circuits Syst., vol. 1, no. 3, pp. 131–145, Jul. 1982.
[2] J. White and A. L. Sangiovanni-Vincentelli, Relaxation Techniques for the Simulation of
VLSI Circuits. Norwell, MA: Kluwer, 1987.
[3] A. Odabasioglu, M. Celik, and L. T. Pileggi, “PRIMA: Passive reduced-order interconnect
macromodeling algorithm,” IEEE Trans. Computer Aided Design, vol. 17, no. 8, pp. 645–653,
Aug. 1998.
[4] Roy, Sourajeet. “Chapter 1: Formulation of Network Equations.” 2014
[5] Roy, Sourajeet. “Chapter 3: Numerical Integration Techniques of Differential Equations.”
2014
[6] Roy, Sourajeet. “Chapter 5 High Speed Interconnects.” 2014
[7] Roy, Sourajeet. “Chapter 5 High Speed Interconnects.” 2014
[8] S. Roy, A. Dounavis, and A. Beygi, “Longitudinal-Partitioning-Based Waveform
Relaxation Algorithm for Efficient Analysis of Distributed Transmission-Line Networks,”
IEEE Trans. on Microwave Theory and Techniques, vol. 60, no. 3, Mar. 2012
[9] Star-HSPICE Manual, Release 2001.2, Synopsys Inc., Santa Clara, CA, 2001.
APPENDIX-A STAMP
Capacitor Stamp
Independent Current Source Stamp
Resistor Stamp
Voltage Control Current Source Stamp
Voltage Control Voltage Source Stamp
Independent Current Source Stamp
Inductor Stamp
APPENDIX- B
ABBREVIATIONS
HSVS = High-Speed VLSI Simulator
KCL = Kirchhoff’s Current Law
KVL = Kirchhoff’s Voltage Law
MNA = Modified Nodal Analysis
MOR = Model Order Reduction
ODE = Ordinary Differential Equations
PCB = Printed Circuit Board
TEM = Quasi-Transverse Electromagnetic Model
VLSI = Very Large Scale Integration
HSI = High-Speed Interconnect
SFTP = SSH File Transfer Protocol
APPENDIX-C
BUDGET
This project is basically based on software design, so we do not expect any expense so far.
APPENDIX-D
LAPACK CODE & MATLAB SCRIPT
LAPACK code for Time Domain (Coupled HSI d=10cm)
clear all;
% Add the HSPICE toolbox (loadsig, lssig, evalsig) to the path.
addpath('U:\VLSI\HspiceToolbox')
% Load the solver output; the script expects F, Vr and Vi from this file.
load('U:\VLSI\frequncy domain data\data.mat');
% Load the HSPICE AC analysis output for comparison.
y=loadsig('U:\VLSI\frequncy domain data\failed trail\HPICE output\one port hspice netlist l=10cm.ac0');
lssig(y);
v_3 = evalsig(y,'v_3');
f = evalsig(y,'HERTZ');
%plot(f,angle(v_3),'b', F, angle(Vr+1i*Vi), '-.r');
plot(f,abs(v_3),'b', F, abs(Vr+1i*Vi), '-.r');
%title('Frequency Domain Phase Analysis for V3 for d=10cm');
title('Frequency Domain Simulation for V3 for d=10cm');
xlabel('Frequency/ Hz');
ylabel('Voltage/ V');
legend('HSPICE','our solver');
Matlab code for Frequency Domain (single transmission line d=10cm)
clear all;
addpath('U:\VLSI\HspiceToolbox')
load('U:\VLSI\time domain data\matrices\10cm.mat');
y=loadsig('U:\VLSI\time domain data\Hspice data\hspice code for Coupled HSI for d=10.tr0');
lssig(y);
V_3 = evalsig(y,'v_3');
V_6 = evalsig(y,'v_6');
T = evalsig(y,'TIME');
%plot(T,abs(V_3),'-.b',t,abs(v3_10cm),'r');
%title('Time Domain Simulation for V1 for d=10cm');
plot(T,abs(V_6),'-.b',t,abs(v6_10cm),'r');
title('Time Domain Simulation for V2 for d=10cm');
xlabel('Time/ s');
ylabel('Voltage/ V');
legend('HSPICE','our solver');
Matlab code for Time Domain (Coupled HSI d=10cm)
plot(length,time,'r');
title('CPU time vs Interconnect length');
xlabel('Length/ cm');
ylabel('Time/ s');
Matlab code for Time vs Length
APPENDIX E SAMPLE CODES FOR STAMP
C++ code for Resistor Stamp
C++ code for Independent Voltage Source Stamp
C++ code for Voltage Control Voltage Source Stamp
APPENDIX-E
TIMELINE