
The Impact of KGD and Module Repair on Multichip Module Cost

The International Journal of Microcircuits and Electronic Packaging, Volume 23, Number 1, First Quarter 2000 (ISSN 1063-1674)

© International Microelectronics And Packaging Society

H. K. Charles, Jr.*, W. D. Barnhart**, J. Van Rij+, and J. M. Petek++

*The Johns Hopkins University Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, Maryland 20723
Phone: 240-228-8050
Fax: 240-228-6119
e-mail: [email protected]
**Undergraduate Electrical Engineering Student, Virginia Polytechnic Institute
+Mechanical Engineering Graduate Student, University of Utah
++Undergraduate Electrical Engineering Student, Ohio State University

Abstract

The impact of known good die (KGD) probability on multichip module (MCM) yield and cost has been modeled and systematically analyzed. The current work extends the researchers’ previous MCM modeling effort involving single chip populations (a single KGD probability) to modules containing complex, multiple chip populations. Most of the analysis is performed on modules with dual populations (that is, two chip types). In order to develop a total cost picture for an MCM versus the respective KGD probabilities of the underlying chip populations, it was necessary to develop new algorithms to calculate the number of modules necessary to ensure one working MCM, the chip cost as a function of KGD, and chip repair cost as it relates to the complexity of module testing and the actual removal and replacement operations. The combination of these models and algorithms produces cost surfaces in dual (KGD) probability space that contain optimum or minimal points. Associated with the cost minimums are specific KGD probabilities for each chip type in the population. Thus, one only pays for improved KGD probability up to the values that minimize the overall module cost. Repair has a direct impact on the overall module yield and cost. The authors have shown that the first repair makes the most significant improvement in yield and cost reduction. Higher numbers of repairs also have a positive impact in large modules. In the dual-population case, there is an optimum number of repairs to minimize cost, and as repair costs increase, this optimum number appears to decrease.

Key words:

Known Good Die, KGD, MCM Cost, MCM Repair, Multichip Modules, and Multiple Population MCMs.

1. Introduction

The production of working Multichip Modules requires the confluence of three major elements: known good substrates, known good chips (or dice), and appropriate (defect-free) assembly processes. Substrate validation is usually accomplished by a series of design rule checks to ensure compatibility with the fabrication processes followed by substrate electrical testing (using bed of nails or flying probe-type testers) to verify that all the electrical networks on the substrate are connected in the appropriate manner (consistent with the design) and that there are no opens or shorts. Known good die (KGD) requires testing of the chips or die prior to assembly. Much time and effort have been expended on the mounting of die for full functional testing (at speed and temperature) and then demounting them for attachment to the actual MCM part. Following assembly (defect-free substrate populated with KGD), the resultant MCM must be tested to ensure a working module, thus validating the defect-free nature of the assembly process. In an MCM part, if one die fails, the whole module is typically useless, or performance is so reduced that the MCM cannot be sold for its intended purpose or at a price consistent with full cost recovery. Since chip pre-testing cannot ensure that all dice will perform as planned after assembly, one must either build MCMs so inexpensively that they are throw-aways (that is, using a high-yielding process driven by a large commodity market), or one must be able to repair or replace defective die.

Testing to find the defective chips is a key element in the repair process. Without appropriate testing, repair is extremely costly or cannot be performed at all. MCMs must be designed from the beginning for repair, including the complete test matrix and protocol to locate defective die. The need to repair can influence substrate design, requiring extended pads; room for die attach and removal collets; robust board metallizations; and, perhaps, extra test points, traces, and/or chips to support the testing process. The chips themselves may need to include extra circuit elements and paths to facilitate the testing process. The design of both substrates and integrated circuit chips for testability is beyond the scope of this work.

This study focuses on the repair issue with the assumption that the defective chips can be located, removed, and replaced, thus producing working modules. The researchers have accounted for the difficulty in locating and replacing defective chips by a repair complexity factor, which relates directly to the cost of these operations. The approach is based on minimizing overall chip and module costs as a function of KGD probability and the number (and associated cost) of repairs. Results are shown to be extremely sensitive to the KGD cost model and the relative repair cost of a defective die (that is, finding the defective die and replacing it with a working equivalent).

In previous work [1], the authors concentrated on modules whose chips all have the same KGD probability. In this current work, the focus is on modules containing chips of different KGD probabilities. Results are presented for different chip populations containing various numbers of chips with different KGD probabilities. Analyses are presented for several cost models and repair scenarios and for various MCM module types. In addition, attention is paid to the estimation of the number of modules required to ensure the production of at least one good module, as well as to refinements (over previous work) in both the chip cost and the repair cost algorithms. Graphical display of the data allows readers to make trade-offs between KGD probability and the cost of both chips and modules. Current results reinforce many of the trends developed in previous studies [1], but they also provide new insights as to break-even points, both on repair and on spending money for known good die.

2. Definitions and Equations

To quantify the magnitude and impact of repair on the ultimate cost of MCMs, one must take a systematic look at MCM yield and the influence of repair on this yield. Since multiple chips with different KGD probabilities complicate the mathematics to some degree, the equations and results for MCMs containing chips with a single KGD probability are reviewed first and then extended to the case of modules having chips with multiple KGD probabilities.

Single KGD Probability. First let $P_n(g)$ = probability that g chips out of a lot of n chips are good. Assuming that the chips follow a binomial probability distribution [2], one can write,

$$P_n(g) = \frac{n!}{g!\,(n-g)!}\; p_c^{\,g}\, q_c^{\,(n-g)} \tag{1}$$

where $p_c$ = probability that an individual chip is good and $q_c = (1 - p_c)$ is the probability that an individual chip is bad.

To find the yield of an MCM containing n chips without repair, one needs to find the probability that all dice are good. This relation is given by the following,

$$Y^0_{mcm} = P_n(n) = p_c^{\,n} \tag{2}$$

where the zero superscript indicates zero defects or no need for repair in this module.

To find the yield under repair, one looks for modules whose number of defects is less than or equal to the number of repairs, r, allowed per module. In previous work [1], the authors derived a generalized expression for the yield of an MCM with r repairs. This expression is given by the following,

$$\begin{aligned}
Y^r_{mcm} &= P_n(n) + P_n(n-1)\,p_c + \cdots + P_n(n-r)\,p_c^{\,r}\\
&= p_c^{\,n}\left[1 + n(1-p_c) + \cdots + \frac{n!}{r!\,(n-r)!}\,(1-p_c)^r\right],
\end{aligned} \tag{3}$$

where $Y^r_{mcm}$ is the fractional yield of an n-chip MCM (all dice having the same KGD probability) with r repairs, assuming that the repair chips themselves have the probability, $p_c$, of being good. The yield for any number of repairs, r, can be built from the expression on the right-hand side of Equation (3).
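As a minimal sketch, Equations (1) through (3) can be evaluated numerically as shown here. The function names `binomial_good` and `yield_with_repair` are illustrative choices, not names from the original work, and the sketch assumes, as the text does, that replacement chips have the same KGD probability as the original chips.

```python
from math import comb


def binomial_good(n: int, g: int, p_c: float) -> float:
    """Equation (1): probability that exactly g of n chips are good."""
    return comb(n, g) * p_c**g * (1.0 - p_c) ** (n - g)


def yield_with_repair(n: int, p_c: float, r: int = 0) -> float:
    """Equation (3): fractional yield of an n-chip MCM with up to r repairs.

    With r = 0 this is Equation (2), p_c ** n.
    """
    q_c = 1.0 - p_c
    return p_c**n * sum(comb(n, k) * q_c**k for k in range(r + 1))


# Yield of a 25-chip module at p_c = 0.95 for 0 to 5 repairs.
for repairs in range(6):
    print(repairs, round(yield_with_repair(25, 0.95, repairs), 4))
```

Each additional repair adds one more term of the bracketed series in Equation (3), so the yield can only increase with r.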

Figure 1 plots the MCM yield, Equation (2), as a function of the number of chips in a module for a given probability, $p_c$, that an individual chip is good. Even at very high KGD probabilities (≥0.98), large modules containing large numbers of chips still have relatively poor yields (<50%). Figure 2 plots MCM yield with repair (r = 0 to 5), Equation (3), as a function of the probability, $p_c$, that an individual chip is good for a 25-chip MCM.

Figure 1. MCM fractional yield as a function of the number of chips in the module for various chip KGD probabilities (ranges from 0.9 to 0.99).

Figure 2 can be generalized to other sized modules with the observation that the first repair makes the largest percentage improvement in yield, with each succeeding repair diminishing in importance [1]. Similarly, the lower the initial KGD probability, the larger the impact of repair regardless of module size [1].

Figure 2. MCM fractional yield as a function of the KGD probability for a 25-chip MCM with and without repair (r ranges from 0 to 5).

Dual KGD Probabilities. Now, consider a module with n total chips: l chips have KGD probability $p_1$ and m chips have KGD probability $p_2$, where n = l + m. In analogy to the single population case, the yield of this dual-population module without repair is the population-weighted probability that all chips are good, given by the expression,

$${}_2Y^0_{mcm} = {}_2P^*_n(n) = P_l(l) \cdot P_m(m) = p_1^{\,l}\, p_2^{\,m} \tag{4}$$

where the asterisk (*) indicates population weighting. The 0 superscript represents no repair and the preceding 2 as a subscript indicates a dual-chip population.

Now, if one repair is allowed, one may have a dilemma as to which chip to repair; should one repair one of the l-chips with KGD probability $p_1$ or one of the m-chips with KGD probability $p_2$? In reality, one would repair the chip that is malfunctioning, but for analysis purposes, one will adopt a population weighting approach, as mentioned above, based on the $p_i$'s (where for a dual population model, i = 1 or 2) such that when $p_1 = p_2$, the resultant equations reduce to those given in the single population case above.

If the repair chip has the same KGD probability as its respective population of build chips, then the equation of the MCM yield for one repair can be written as follows,

$$\begin{aligned}
{}_2Y^1_{mcm} &= {}_2P^*_n(n) + {}_2P^*_n(n-1)\\
&= p_1^{\,l}\, p_2^{\,m}\left[1 + l(1-p_1) + m(1-p_2)\right].
\end{aligned} \tag{5}$$

Now, if $p_1 = p_2 = p_c$, then ${}_2Y^1_{mcm}$ reduces to $p_c^{\,n}\left[1 + n(1-p_c)\right]$, just as is required for the single-population case. Figure 3 presents the improvement in MCM module yield (dual population) for the single-repair case. Figure 3 is plotted for various fixed population probabilities as a function of the number of chips of each type in the population. The combined chip population is allowed to range to 100 of each type. For two repairs, the expression becomes as follows,

$$\begin{aligned}
{}_2Y^2_{mcm} &= {}_2P^*_n(n) + {}_2P^*_n(n-1) + {}_2P^*_n(n-2)\\
&= {}_2Y^1_{mcm} + {}_2P^*_n(n-2)\\
&= {}_2Y^1_{mcm} + p_1^{\,l}\, p_2^{\,m}\left[\frac{l(l-1)}{2!}(1-p_1)^2 + lm(1-p_1)(1-p_2) + \frac{m(m-1)}{2!}(1-p_2)^2\right].
\end{aligned} \tag{6}$$

Similarly, for three repairs,

$${}_2Y^3_{mcm} = {}_2Y^2_{mcm} + {}_2P^*_n(n-3) \tag{7}$$

and, in general, for r repairs,

$${}_2Y^r_{mcm} = {}_2Y^{r-1}_{mcm} + {}_2P^*_n(n-r) \tag{8}$$

where the ${}_2P^*_n(n-r)$ term takes the following form,

$$\begin{aligned}
{}_2P^*_n(n-r) = p_1^{\,l}\, p_2^{\,m}\Big\{ &\frac{l(l-1)\cdots[l-(r-1)]}{r!}(1-p_1)^r + \frac{m(m-1)\cdots[m-(r-1)]}{r!}(1-p_2)^r\\
&+ \frac{l(l-1)\cdots[l-(r-3)]\;m(m-1)}{(r-2)!\,2!}(1-p_1)^{r-2}(1-p_2)^2\\
&+ \frac{m(m-1)\cdots[m-(r-3)]\;l(l-1)}{(r-2)!\,2!}(1-p_2)^{r-2}(1-p_1)^2\\
&+ \cdots + (\text{paired terms involving all the remaining factors of } r)\Big\}.
\end{aligned} \tag{9}$$
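The recursion of Equation (8) can be sketched in a few lines. This sketch reads the coefficient pattern of Equations (5) and (6) as products of binomial coefficients, so that the term for exactly k defective chips is a sum over how the k defects split between the two populations; that compact form is an interpretation of the pattern, not an expression quoted from the text. The names `dual_defect_term` and `dual_yield` are hypothetical.

```python
from math import comb


def dual_defect_term(l: int, m: int, p1: float, p2: float, k: int) -> float:
    """Population-weighted term for exactly k defective (and repaired) chips.

    Assumed compact form: j defects fall in population 1 and k - j in
    population 2, weighted by C(l, j) * C(m, k - j).
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    mix = sum(
        comb(l, j) * comb(m, k - j) * q1**j * q2 ** (k - j)
        for j in range(k + 1)
    )
    return p1**l * p2**m * mix


def dual_yield(l: int, m: int, p1: float, p2: float, r: int = 0) -> float:
    """Equation (8) unrolled: sum of the defect terms for 0 to r repairs."""
    return sum(dual_defect_term(l, m, p1, p2, k) for k in range(r + 1))


# 30 type 1 chips and 70 type 2 chips, with the probabilities used in Figure 3.
for repairs in (0, 1, 3, 5):
    print(repairs, round(dual_yield(30, 70, 0.9, 0.945, repairs), 4))
```

When both populations share one KGD probability, the sum collapses to the single-population series of Equation (3), which is the consistency condition the text imposes on the population weighting.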





Figure 3 illustrates the increasing effect of the number of repairs on module yield for the dual-population case. For fixed chip population probabilities, the yield is plotted as a function of the number of chips of each type (Type 1 and Type 2) and the number of repairs $n_r$ (r = 0, 1, 3, and 5). Figure 4 plots the MCM yield (fixed dual population, 30 chips Type 1 and 70 chips Type 2) as a function of KGD probability for each chip type and the number of repairs $n_r$ (r = 0, 1, 3, and 5).

[Figure 3 panels: 0 Repair, 1 Repair, 3 Repairs, 5 Repairs. Axes: Number of Type 1 Chips (0 to 100), Number of Type 2 Chips (0 to 100), Fractional Module Yield (0 to 1).]

Figure 3. MCM fractional yield as a function of the number of chips of each type (type 1 or type 2) in the total chip population (up to 100 of each type) for 0, 1, 3, and 5 repairs. In all cases, $p_1 = 0.9$, $p_2 = 0.945$.

[Figure 4 panels: 0 Repair, 1 Repair, 3 Repairs, 5 Repairs. Axes: KGD Probability Type 1 (0.9 to 1), KGD Probability Type 2 (0.9 to 1), Fractional Module Yield (0 to 1).]

Figure 4. MCM fractional yield as a function of the KGD probabilities for the two chip types (type 1 and type 2) in the total chip population of 100 for 0, 1, 3, and 5 repairs. In this model there are 30 chips of type 1 and 70 chips of type 2.

Multiple KGD Probabilities. In general, the yield for any MCM containing more than two distinct chip populations can be developed in an analogous manner using the method described above for the dual population case. Thus, for a module containing n chips composed of i distinct populations (1, 2, 3, . . . i) or KGD probabilities with α chips of population 1, β chips of population 2, γ chips of population 3, and ρ chips of population i, such that α + β + γ + . . . + ρ = n, the module yield with no repair is given by the following,

$${}_iY^0_{mcm} = {}_iP^*_n(n) = p_1^{\,\alpha}\, p_2^{\,\beta}\, p_3^{\,\gamma} \cdots p_i^{\,\rho} \tag{10}$$

In the case of one repair, again, with repair chips having the same KGD percentage as their respective subpopulations, the MCM yield is given by the following,

$$\begin{aligned}
{}_iY^1_{mcm} &= {}_iP^*_n(n) + {}_iP^*_n(n-1)\\
&= p_1^{\,\alpha}\, p_2^{\,\beta}\, p_3^{\,\gamma} \cdots p_i^{\,\rho}\left[1 + \alpha(1-p_1) + \beta(1-p_2) + \gamma(1-p_3) + \cdots + \rho(1-p_i)\right]
\end{aligned} \tag{11}$$

and the general term for repairs can be built as for the dual population case using the relation,

$${}_iY^r_{mcm} = {}_iP^*_n(n) + {}_iP^*_n(n-1) + \cdots + {}_iP^*_n(n-r). \tag{12}$$

In the rest of this paper, the authors will focus on the dual-population case, comparing results to the previously published uniform or single population case [1].
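For an arbitrary number of populations, Equation (12) can be sketched by accumulating, one population at a time, the weight for each total defect count up to r. The convolution below is an assumed implementation strategy that extends the binomial-coefficient reading of Equations (5), (6), and (11); the function name `multi_yield` and the `(chip_count, kgd_probability)` input format are illustrative.

```python
from math import comb, prod


def multi_yield(populations, r=0):
    """Yield with up to r repairs for any number of chip populations.

    populations: list of (chip_count, kgd_probability) pairs.
    """
    # coeffs[k] holds the combined weight for exactly k defective chips.
    coeffs = [1.0] + [0.0] * r
    for count, p in populations:
        q = 1.0 - p
        pop = [comb(count, j) * q**j for j in range(r + 1)]
        # Convolve the running weights with this population's weights,
        # truncated at r total defects.
        coeffs = [
            sum(coeffs[k - j] * pop[j] for j in range(k + 1))
            for k in range(r + 1)
        ]
    base = prod(p**count for count, p in populations)  # Equation (10)
    return base * sum(coeffs)


# Three populations: 10, 20, and 5 chips.
example = [(10, 0.9), (20, 0.95), (5, 0.99)]
for repairs in range(4):
    print(repairs, round(multi_yield(example, repairs), 4))
```

With r = 0 the function returns Equation (10), and with r = 1 it returns Equation (11).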

3. Number of Modules

In the preceding section, the authors calculated MCM module yield for various chip populations and KGD probabilities and showed that module yield is lower for modules with increasing numbers of chips or decreasing $p_i$'s for fixed chip populations. In practice, rather than module yield, one would like to know how many modules, N, must be built to be assured of at least one working module. There are several ways to derive an estimate of this number. In previous work [1], the authors assumed a defect “rate” model based on the minimization of the probability associated with producing an entirely defective module (all chips are bad). In this current effort, the researchers adopt a confidence level approach [3] for determining N. Since N varies with chip yield, a general equation for N can be written as follows,

$$N = \mu_N + Z_c\,\sigma_N \tag{13}$$

where $\mu_N$ is the average of N, $\sigma_N$ is the standard deviation of N, and $Z_c$ is the confidence coefficient [3], which determines the level of confidence in the resulting estimate of N. Values of $Z_c$ for various confidence levels are given in Table 1.




Table 1. Confidence coefficients $Z_c$ for various levels of confidence.

| Confidence Level * | $Z_c$ | Confidence Level * | $Z_c$ |
|---|---|---|---|
| 50% | 0.6745 | 95% | 1.960 |
| 60% | 0.8415 | 96% | 2.054 |
| 70% | 1.0365 | 97% | 2.170 |
| 80% | 1.2815 | 98% | 2.326 |
| 90% | 1.645 | 99% | 2.575 |

*For confidence levels not given in the table, the values of $Z_c$ can be found using a normal curve area table.

To determine N, the values of $\mu_N$ and $\sigma_N$ must be known. To calculate $\mu_N$ and $\sigma_N$, one assumes that the number of modules that work, out of the number of modules built, N, is described by a binomial distribution (consistent with our original probability and yield distributions). This gives the following,

$$\mu_w = \mu_N Y = 1 \tag{14}$$

$$\sigma_w = \sqrt{\mu_N Y (1 - Y)} = \sqrt{1 - Y} \tag{15}$$

where $\mu_w$ is the average number of modules that work out of the N built (by definition of N, this is equal to 1), and $\sigma_w$ is the standard deviation of the working modules built from N. Now, if $\mu_N$ changes by one standard deviation, so will $\mu_w$. Thus,

$$\mu_N + \sigma_N = (\mu_w + \sigma_w)/Y \tag{16}$$

With $\mu_w = 1$, $\sigma_w = \sqrt{1 - Y}$, and $\mu_N = 1/Y$, Equation (16) can be solved for $\sigma_N$ as follows,

$$\sigma_N = \sqrt{1 - Y}\,/\,Y \tag{17}$$

and the equation for N (Equation (13)) becomes,

$$N = 1/Y + Z_c\,\sqrt{1 - Y}\,/\,Y \tag{18}$$

Figure 5 plots this relationship for 90%, 95%, 99%, and 99.9% confidence levels. The expectation value or average value of N (= 1/Y) is also plotted for comparison, along with the N found using the previous defect “rate” method [1] with D = 0.05 (or 95% confidence).

It is interesting that the estimate of N for the previous defect “rate” method compares quite favorably with those from this new method. Just as before, high confidence levels (lower defect rates) require building more modules, while for a given confidence level, higher chip KGD probability decreases the number of modules required.
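Equation (18) is simple enough to evaluate directly. The sketch below uses three confidence coefficients taken from Table 1; the function name `modules_required` and the 25-chip example module are illustrative assumptions.

```python
from math import sqrt

# Confidence coefficients taken from Table 1.
Z_C = {0.90: 1.645, 0.95: 1.960, 0.99: 2.575}


def modules_required(module_yield: float, z_c: float) -> float:
    """Equation (18): N = 1/Y + Z_c * sqrt(1 - Y) / Y."""
    if not 0.0 < module_yield <= 1.0:
        raise ValueError("module yield must be in (0, 1]")
    return (1.0 + z_c * sqrt(1.0 - module_yield)) / module_yield


# Example: a 25-chip module with no repair, so Y = p_c ** 25 (Equation (2)).
for p_c in (0.90, 0.95, 0.99):
    y = p_c**25
    row = [round(modules_required(y, z), 1) for z in Z_C.values()]
    print(p_c, round(1.0 / y, 1), row)
```

The first printed number after each probability is the average 1/Y; the list gives N at the 90%, 95%, and 99% confidence levels, which always exceeds the average when Y is below 1.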

[Figure 5 plot: number of modules (0 to 70) versus KGD probability (0.9 to 0.99). Curves: Average, D Method, CI 90%, CI 95%, CI 99%, CI 99.9%.]

Figure 5. Number of modules necessary to produce one good MCM (with no repair) using various estimation methods as a function of chip quality (KGD probability). In the figure, the “average” curve corresponds to the estimated value, the D method curve refers to the previous defect rate method [1] at a 5% defect rate, and the CI’s represent the confidence method approach for various confidence levels.

4. Known Good Die Probability

Many researchers have tried to assess semiconductor die yield and determine what is reasonable to expect for the probability that a chip once received (either from the manufacturer or distributor) is good [4]. A survey of over 25 sources yielded KGD probabilities in the range of 0.5 to 0.99. In this study, the range of KGD probabilities was limited to 0.9 and above. At a KGD probability of 0.9, even small modules (a few chips) suffer significant yield loss (of the order 35% for 4 chips, 52% for 9 chips); thus, considering values below 0.9 does not seem prudent or cost-effective. One would also recognize that for large MCM modules (50 chips and above), the value of the KGD probabilities required is greater than 0.95 to produce a significant (cost-effective) module yield.

Just as additional testing has been used to improve the confidence that the customer has in die quality (that is, KGD probability), improvement in module screening methods can also ensure higher delivered module yield [5]. In fact, confidence that a delivered module (that is, one that has passed manufacturer screening tests and been shipped to the customer) will work when received (assuming it was built with n chips of KGD probability $p_c$) is directly related to confidence that the module screening test will detect faults. The representation of this final yield is given by the following,

$$Y_{mcm} = \left(Y^0_{mcm}\right)^{(1-\eta)} = p_c^{\,(1-\eta)\,n} \tag{19}$$

where η is the fault coverage of the module test, expressed as a decimal fraction, for a module containing a single population.
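A minimal sketch of the delivered-yield relation follows, reading (1 – η) in Equation (19) as an exponent on the no-repair yield. The extension to several populations, each with its own fault coverage, multiplies one such factor per population; the function name `delivered_yield` and the triple-based input format are illustrative assumptions.

```python
def delivered_yield(populations) -> float:
    """Delivered module yield after screening.

    populations: list of (chip_count, kgd_probability, fault_coverage)
    triples, with fault coverage given as a fraction between 0 and 1.
    With a single population this is Equation (19): p_c ** ((1 - eta) * n).
    """
    result = 1.0
    for count, p, eta in populations:
        if not 0.0 <= eta <= 1.0:
            raise ValueError("fault coverage must be a fraction between 0 and 1")
        result *= p ** ((1.0 - eta) * count)
    return result


# A 25-chip module at p_c = 0.95 under increasing fault coverage.
for eta in (0.0, 0.5, 0.9, 0.99):
    print(eta, round(delivered_yield([(25, 0.95, eta)]), 4))
```

At zero fault coverage the delivered yield equals the untested yield of Equation (2), and at complete coverage every shipped module works.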





For the two-population case, the expression becomes,

$${}_2Y_{mcm} = {}_2Y^{0\,(1-\eta_1)(1-\eta_2)}_{mcm} = p_1^{\,(1-\eta_1)\,l}\; p_2^{\,(1-\eta_2)\,m} \tag{20}$$

where $\eta_1$ is the fault coverage percentage for defects in chip population 1 (containing l chips) and $\eta_2$ is the fault coverage percentage for chip population 2 (containing m chips); in the middle expression, each exponent applies to its own population factor. An expression for any number of subpopulations can be built in an analogous manner.

Thus, screening can improve confidence that chips or modules that pass the screening tests and are delivered to customers are more likely to work than their unscreened counterparts. However, this screening is difficult and usually very costly. Ideal KGD screening would be to burn-in and test the chip at wafer level; but contact issues, power distribution, and thermal transfer problems make this difficult and costly [5]. Built-in self-test and boundary scan methods have helped this process. At the module level, fault diagnosis and failure isolation are difficult and costly, and the cost increases nonlinearly with chip number and input/output density. Rework costs at the module level are strongly dependent on this test cost [6], as well as on replacement chip cost and the cost of old die removal, site preparation, and the placement and interconnection of the new die. If this cost is low compared to other module costs, including the cost of KGD, repair is viable. If it is extremely high, all steps must be taken to ensure first product yield, including paying for KGD percentages. Quantification of these concepts is pursued in the following sections.

5. Chip Cost Models

In previous analyses [1, 7], the authors described various KGD cost models. For this study, they settle on the cost model given by the following relation,

$$C = C_o\left[1 + \alpha\,\frac{(p_c - p_g)}{(1 - p_c)}\right] \tag{21}$$

where $p_g$ is the KGD probability associated with a known cost $C_o$, and α is the scaling constant (typical value 0.33). This expression closely mimics the original cost model [7] at low KGD probabilities, where little premium is paid for improvements in KGD value, but provides strong cost weighting on KGD value improvement at high KGD probabilities, where complete testing costs are extremely expensive. Other cost models can be developed, but the authors believe the one presented in this case is representative of the actual costs encountered in the purchase of KGD and the perceived costs that would be encountered if such dice were available. In the following sections, all calculations are done with a cost model of the form given in Equation (21).

6. Module Repair Costs

Cost of repair is strongly dependent on the ability to locate one or more defective chips in a module containing n chips. Once the defective part is located, it must be removed, the site prepared for the acceptance of a new die, and the new part attached and interconnected. In developing a model for repair, it has been assumed that the cost of locating the defective dice is dependent on module type (digital, analog, etc.), the number of module inputs and outputs (I/O), the number of die in the module, and a complexity factor that relates to individual die I/O. Similarly, the repair operations themselves relate to the size of the die, the type of die attach used, the number and type of interconnections, and, of course, the number of actual repairs conducted on a given module. Analytically, this repair model is given by the expression,

$$C_R = m\left\{ n_R A\left[C_{DR} + C_{DS} + C_{DA}\right] + n_R\left(I_C C_I + \Delta\right) + \left[I_M C_{LM}\,\beta\,(1 + \gamma n/2)\right]\right\}. \tag{22}$$

Table 2 contains the descriptions and some of the values used in the repair model. Other cost parameters and estimates of module I/O were given previously [8]. Figure 6 illustrates repair costs per module as a function of the number and type of chips in a module and the complexity of repair. Changes in chip and module complexity and density can significantly change these costs. Similarly, the cost of the repair chip is not included in Equation (22), but would be an additive factor to all costs. It is assumed that multiple repair events are essentially independent and that their costs are additive. A multiple repair interaction factor has been added to account for any additional costs (such as damage to adjacent chips or wires).

In handling dual chip populations, the repair model is applied to each type of chip. Thus, a cost of repair is determined for each chip type and then probabilistically weighted based on population statistics and KGD probabilities to determine an average repair cost for a chip in the dual population module.
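The chip cost model of Equation (21) can be sketched as a single function. The default arguments follow values quoted in the text (α = 0.33 and a reference probability of 0.9 with unit cost); the function name `chip_cost` and the input check are illustrative additions.

```python
def chip_cost(p_c: float, c_o: float = 1.0, p_g: float = 0.9,
              alpha: float = 0.33) -> float:
    """Equation (21): C = C_o * [1 + alpha * (p_c - p_g) / (1 - p_c)]."""
    if not 0.0 <= p_c < 1.0:
        raise ValueError("p_c must be in [0, 1); the cost diverges as p_c nears 1")
    return c_o * (1.0 + alpha * (p_c - p_g) / (1.0 - p_c))


# Relative chip cost as the purchased KGD probability rises above p_g = 0.9.
for p_c in (0.90, 0.95, 0.98, 0.99, 0.999):
    print(p_c, round(chip_cost(p_c), 2))
```

The cost equals $C_o$ at the reference probability and grows without bound as the purchased probability approaches 1, which is what produces a cost minimum when the model is combined with yield.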




Table 2. Parameter values for MCM repair cost model.

| Symbol | Description | Value or Range |
|---|---|---|
| m | Number of modules repaired | (a) |
| n | Number of chips in module | (b) |
| $n_R$ | Number of repairs per module | 0, 1 – 5 |
| A | Die area, cm² | 0.5 – 5 |
| $C_{DR}$ | Die removal cost, $/cm² | 0.5 – 2.5 |
| $C_{DS}$ | Die site preparation, $/site | 0.5 – 2.5 |
| $C_{DA}$ | Die attach cost, $/cm² | 0 – 2 |
| $I_C$ | Interconnects per chip | (c) |
| $C_I$ | Cost per interconnect, $/lead | 0 – 0.10 |
| $C_C$ | Cost of repair chip | (d) |
| $I_M$ | Number of module interconnects | (e) |
| $C_{LM}$ | Test cost per module lead, $/lead | 0.05 – 0.20 |
| γ | Chip complexity factor | 1 – 2 |
| β | Module complexity (type factor) | 1 – 4 |
| Δ | Multiple repair interaction parameter | (f) |

(a) From confidence interval curves or engineering decision
(b) User specified
(c) Chip type, complexity (Rent’s rule [7]), or user specified
(d) From chip pool, or use cost model
(e) Rent’s rule, or user specified
(f) $n_R$ raised to a power ≥1, typically 1.5
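As a hedged sketch, the repair cost model of Equation (22) can be coded with the Table 2 parameters as arguments. Two readings are assumptions here: the test term is taken as $I_M C_{LM} \beta (1 + \gamma n/2)$, with γ multiplying n/2, and Δ is taken as $n_R$ raised to a fixed power per note (f). The function name `repair_cost` and all example values are hypothetical; the values are merely chosen to lie within the ranges of Table 2 where a range is given.

```python
def repair_cost(m, n, n_r, area, c_dr, c_ds, c_da, i_c, c_i, i_m, c_lm,
                beta=1.0, gamma=1.0, delta_exponent=1.5):
    """Repair cost per Equation (22), excluding the cost of the repair chip.

    Assumed reading of the test term: I_M * C_LM * beta * (1 + gamma * n / 2).
    """
    delta = n_r**delta_exponent  # Table 2, note (f)
    die_operations = n_r * area * (c_dr + c_ds + c_da)
    interconnect = n_r * (i_c * c_i + delta)
    test = i_m * c_lm * beta * (1.0 + gamma * n / 2.0)
    return m * (die_operations + interconnect + test)


# Hypothetical 40-chip module: 1 cm^2 die, 100 leads per chip, 200 module leads.
for repairs in range(1, 6):
    cost = repair_cost(m=1, n=40, n_r=repairs, area=1.0, c_dr=1.0, c_ds=1.0,
                       c_da=1.0, i_c=100, c_i=0.05, i_m=200, c_lm=0.10)
    print(repairs, round(cost, 2))
```

Under these assumptions the test term dominates for large modules, in line with the statement that repair cost depends strongly on the ability to locate the defective die.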

[Figure 6 panels: No Repair, Low Complexity, Medium Complexity, High Complexity. Axes: KGD Probability Type 1 (0.9 to 1), KGD Probability Type 2 (0.9 to 1), Total Cost, K$ (0 to 20).]

Figure 6. Total chip cost as a function of KGD probability for a dual population module ($n_1$ = 25 chips, $n_2$ = 15 chips) with 0 and 1 repair. Repairs are classified as low, medium, and high in complexity or cost. Chip costs for both populations are the same.

7. Total Chip and Module Cost

Total chip cost to produce a working module can be found by combining the previous elements (the number of modules needed to ensure a high confidence of yield, the cost of chips, and the cost of repair). Figure 6 shows an example of total chip cost for a dual-population module ($n_1$ = 25 and $n_2$ = 15) as a function of the KGD probability for each chip type. The figure illustrates both the no repair case and single repairs of various complexities. As can be seen, the ability to repair significantly influences total chip cost as a function of the respective KGD probabilities. The surfaces plotted in Figure 6 assume that each chip type has unity chip cost at $p_i$ = 0.9. Figure 7 presents similar data where the small die population (type 2) has a $C_o$ that is 10 times larger than that of the larger die population (type 1). The surfaces are complex and the minimums in cost are difficult to locate along the respective probability axes, but it is clear that repair can lower the overall chip costs. Similarly, even though total chip cost for high-complexity repair is larger than for low-complexity repair, even at high cost levels, repair appears to be advantageous, especially as module size increases.

[Figure 7 panels: No Repair, Low Complexity, Medium Complexity, High Complexity. Axes: KGD Probability Type 1 (0.9 to 1), KGD Probability Type 2 (0.9 to 1), Total Cost, K$ (0 to 30).]

Figure 7. Total chip cost as a function of KGD probability for a dual population module ($n_1$ = 25 chips, $n_2$ = 15 chips) with 0 and 1 repair. Repairs are classified as low, medium, and high in complexity or cost. Chip costs for the $n_2$ chip population are ten times larger than for the $n_1$ population.

To better assess the cost minimums as a function of KGD probability, the authors have developed a two-dimensional, contour mapping approach that locates the minimum cost point in the joint probability plane. An example is shown in Figure 8. The cost minimums are located at the “center” of the innermost contour. This example is for a dual population module ($n_1$ = 25 and $n_2$ = 15) as a function of repair (number and complexity). The results indicate a general shift in the minimum location towards lower KGD probabilities as the number of repairs increases. The only exception exists at high repair costs (complexity), where larger repair numbers seem to require higher KGD probabilities to produce a cost minimum. Table 3 illustrates some of the actual costs associated with these minimum points. Similarly, depending upon chip and repair costs, there appears to be an optimum number of repairs that minimizes the total chip cost, as shown in Table 3 for a typical case. In Table 3, as repair complexity (cost) increases, the optimum number of repairs to minimize cost seems to decrease.

[Figure 8: two sets of contour plots, View A and View B, each with panels for 0, 1, 2, 3, 4, and 5 repairs. Axes: $p_1$ (0.95 to 0.99) and $p_2$ (0.95 to 0.99).]

Figure 8. Contours of constant total chip cost in the dual probability plane as a function of repair number for a medium complexity repair. Module size is a 40-chip MCM ($n_1$ = 25 chips, $n_2$ = 15 chips). In View A, the chip costs for both chip types are equal. In View B, the smaller population of chips ($n_2$) has a cost 3 times larger than the larger population. The number of modules built corresponds to the 99% confidence level.

Table 3. Total chip cost for a 40-chip MCM ($n_1$ = 25, $n_2$ = 15) as a function of the repair complexity and number of repairs. Initial chip cost $C_o$ = $10 at $p_i$ = 0.9. The number of modules built to assure one good module is based on a confidence level of 99%.

| Repair Complexity | 0 Repairs | 1 Repair | 2 Repairs | 3 Repairs | 4 Repairs | 5 Repairs |
|---|---|---|---|---|---|---|
| Low | $14,425 | $6,710 | $4,597 | $3,853 | $3,658 | $3,711 |
| Medium | 14,425 | 6,875 | 4,846 | 4,190 | 4,104 | 4,277 |
| High | 14,425 | 7,442 | 5,687 | 5,324 | 5,593 | 6,145 |
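The optimum number of repairs for each complexity level can be read directly from the Table 3 values. The short sketch below does this lookup; the dictionary `TABLE_3` simply transcribes the table, and the helper name `optimum_repairs` is illustrative.

```python
# Total chip cost in dollars from Table 3, indexed by number of repairs (0 to 5).
TABLE_3 = {
    "Low": [14425, 6710, 4597, 3853, 3658, 3711],
    "Medium": [14425, 6875, 4846, 4190, 4104, 4277],
    "High": [14425, 7442, 5687, 5324, 5593, 6145],
}


def optimum_repairs(costs):
    """Return (number_of_repairs, cost) at the minimum of one table row."""
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


for complexity, costs in TABLE_3.items():
    repairs, cost = optimum_repairs(costs)
    first_repair_saving = costs[0] - costs[1]
    print(f"{complexity}: optimum at {repairs} repairs (${cost:,}); "
          f"first repair saves ${first_repair_saving:,}")
```

For these values the minimum falls at four repairs for low and medium complexity and at three repairs for high complexity, and in every row the first repair accounts for the largest single cost reduction.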

In most cases, for large modules, the total cost for highly complex repair is less than the chip cost without repairs, regardless of repair number. This result contrasts somewhat with the results presented previously for the single-population case for small modules (that is, ≤25 chips).

In extrapolation to total module cost [9], where chip cost is only a fraction (typically <20%) of the total [10], the results behave in a similar manner. The higher the additive costs for the other module elements (substrate, initial assembly), the greater the impact of repair. In fact, for substrates with high added value such as MCM-Ds, repair improves yield and saves costs at all reasonable KGD probabilities.

8. Discussion

Most of the results shown in this paper can be generalized toother types and numbers of chips. As can be seen from Figure 2and Table 3, the first repair improves yield and reduces cost themost. This is true regardless of whether the population is singleor dual chip. The authors believe that similar results will holdtrue for even more complex population distributions. However,as modules get larger, the impact of the second and third repairshould not be neglected. With dual-chip populations, dependingon relative numbers of chips in each subpopulation, chip cost,and repair complexity, there appears to be an optimum numberof repairs to minimize cost; as repair costs increase, this opti-mum number seems to decrease.

Repair costs are extremely dependent on the ability to locate the defective die. The cost for such tests can vary over orders of magnitude, ranging from tens of dollars for simple modules with a few chips to hundreds and even thousands of dollars for large modules with complex chips. Multiple chip populations can make this problem more difficult. The authors believe the weighted repair model described in this study captures the essential elements and systematically produces costs consistent with actual empirical experience. This cost model produces similar results. The application of the module build number, chip cost, and repair cost models, under most scenarios, has produced cost curves (single-population case) and surfaces (dual-population case) with an optimum (minimum) point or KGD probability(ies). Such minimums suggest that one pays for increased KGD probability(ies) only up to that point.
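To make the interplay between KGD probability, repair, and module build number concrete, the sketch below uses a simple binomial model. It is an illustration under stated assumptions and is not the build number, chip cost, or weighted repair model developed in this paper. The assumptions are that chips fail independently, that replacement chips are always good, that a module is recoverable when its number of defective chips does not exceed the allowed repairs, and that the build number is the smallest N for which 1 - (1 - Y)^N reaches the confidence level. All function names are hypothetical, and the output is not expected to reproduce Table 3.

```python
import math


def defect_pmf(n, k, p_good):
    """Probability that exactly k of n independent chips are defective."""
    return math.comb(n, k) * (1.0 - p_good) ** k * p_good ** (n - k)


def module_yield(n1, p1, n2, p2, max_repairs):
    """Assumed yield of a dual-population module: P(total defects <= max_repairs)."""
    total = 0.0
    for k1 in range(min(n1, max_repairs) + 1):
        for k2 in range(min(n2, max_repairs - k1) + 1):
            total += defect_pmf(n1, k1, p1) * defect_pmf(n2, k2, p2)
    return min(total, 1.0)  # guard against floating-point overshoot


def modules_to_build(yield_fraction, confidence=0.99):
    """Smallest N with 1 - (1 - Y)^N >= confidence (assumed build-number rule)."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if yield_fraction == 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - yield_fraction))


# Example: 40-chip module (n1 = 25, n2 = 15), both KGD probabilities 0.9.
for repairs in range(6):
    y = module_yield(25, 0.9, 15, 0.9, repairs)
    print(repairs, round(y, 4), modules_to_build(y))
```

Under these assumptions the yield can only rise as more repairs are allowed, so the build number can only fall; whether the total cost falls as well depends on the cost charged per repair, which this sketch deliberately leaves out.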

The ability to repair even after the defective chip(s) is located is a major concern for multichip module producers. Flip chips should be the easiest to repair of all the first-level interconnects. Upon properly controlled and localized heating, the defective die can be removed from the substrate without substrate damage. Ceramic substrates (MCM-Cs) are the most robust in this regard, followed by Printed Wiring Boards (MCM-Ls), then thin film multilayers such as MCM-Ds. After removal, site clean up and dress is usually required as well as inspection for pad and solder dam damage. Reflow of the new die is accomplished using a localized heat source or hot chuck. If the original die was underfilled, removal is much more difficult and, perhaps, impossible without substrate damage unless reworkable underfill materials were used.

With tape automated bonded chips, both the die attach and the outer lead bonds must be broken and removed. Selective heating can usually remove the outer lead bonds, which are typically solder reflowed. The die attach bond (epoxy or metal reflow) can also be removed by applying heat and selective mechanical force. Cleanup and dress of both the outer lead bond pads and the die attach are usually necessary. Again, the robustness of the substrates typically orders from ceramic to thin film multilayer.

Wirebonded chips are relatively easy to remove, provided they are not glob topped or the glob top is reworkable. Wires can be cut and the dice removed by selective heating and the application of a torquing force. Die attach pad clean up and dress are usually required before installing the new die. The real question is, “Are the pads for the tail bonds large enough to accommodate a second tail bond?” If repair is anticipated, pad size could be increased in the design stage to facilitate repair. If no room exists for a second bond, then an attempt would have to be made to bond over the remaining wire residue (if any) after the old wire has been peeled from the substrate. This may require a tamping operation, followed by tail bond replacement. A security ball can also be used in certain special circumstances. Reverse bonding (ball-on-substrate, wedge-on-chip) has also been observed. The above comments also apply to ultrasonic or wedge-wedge bonding, but no security ball feature exists.

9. Conclusions

The influence of KGD probability on MCM yield was systematically studied. This work extends previous work to the dual-population case and suggests methods for handling even more complex multiple chip populations. Models that estimate module build size, chip costs, and repair costs were developed. Results indicate that the ability to repair MCMs is extremely important and that the repair of at least one chip per module can significantly improve yield. Cost curves or surfaces versus KGD probability(ies) have minimums that can be utilized to optimize chip cost and, hence, module cost.

Future work will include expansion to three- and four-population cases with diverse chips, including large numbers of high-yield (high KGD probability) passives and a small number of complex, low-yielding integrated circuits. Questions to be answered include: “Can passive repair be ignored?,” “Is sequential assembly (and test) prudent and cost effective?,” and “How to know when repair costs are too high?”

Acknowledgments

The authors thank S. Lynn Hoff for manuscript preparation and Margaret Knox Morris for editing.

References

1. J. M. Petek and H. K. Charles, Jr., “Known Good Die, Die Replacement (Rework) and Their Influences on Multichip Module Costs,” Proceedings 48th IEEE Electronic Components and Technology Conference, ECTC ’98, Seattle, Washington, May 24-28, pp. 909-915, 1998.

2. A. Papoulis, “Probability, Random Variables, and Stochastic Processes,” New York, McGraw-Hill, p. 102, 1965.

3. M. R. Spiegel, “Schaum’s Outline Series: Theory and Problems of Probability and Statistics,” New York, McGraw-Hill, pp. 194-196, 1975.

4. R. R. Tummala, E. J. Rymaszewski, and A. G. Klopfenstein, “Microelectronics Packaging Handbook,” New York, Chapman & Hall, pp. I-38-40, 1999.

5. G. Flynn and L. Gilg, “A Pragmatic Look at Wafer Level Burn-in: The Wafer Level Known-Good Die Consortium,” Proceedings 1996 International Conference on Multichip Modules, Denver, Colorado, April 17-19, pp. 287-292, 1996.

6. C. E. Bauer, K. W. Posse, and K. T. Wilson, “Good Enough Die? A Cost Model for KGD Test Decisions,” Proceedings 8th International Microelectronics Conference, Omiya, Japan, pp. 260-270, 1994.

7. H. K. Charles, Jr., and M. E. Uy, “Design Aid for Multichip Modules,” Proceedings International Microelectronics Symposium, Boston, Massachusetts, November 15-17, pp. 42-48, 1994.

8. H. K. Charles, Jr., “Design Rules for Advanced Packaging,” Proceedings International Microelectronics Symposium, Dallas, Texas, November 9-11, pp. 301-306, 1993.

9. H. K. Charles, Jr., “Cost Versus Technology Trade-Offs for Multichip Modules,” International Journal of Microcircuits and Electronic Packaging, Vol. 19, No. 3, pp. 295-300, 1996.

10. H. K. Charles, Jr., “Multichip Module Design, Fabrication, and Assembly,” Proceedings 1st International Conference and Exhibition on Emerging Microelectronics and Interconnection Technologies, Bangalore, India, February 12-16, pp. 239-247, 1996.

About the authors





Dr. Charles holds B.S. and Ph.D. Degrees in Electrical Engineering from Drexel University and The Johns Hopkins University, respectively. He is a member of the Principal Professional Staff at The Johns Hopkins University Applied Physics Laboratory and Assistant Department Head for Engineering. Dr. Charles has worked for over 25 years in the microelectronics arena and is a specialist in electronic devices, systems, packaging, and reliability. His latest interests include multichip module design, fabrication and testing; advanced interconnect; biomedical instrumentation; and novel sensors based on MEMS technology. He has published over 160 papers on electronic devices and packaging. Dr. Charles is a Fellow and former President of IMAPS - The Microelectronics and Packaging Society, a Fellow of the IEEE, and a member of the Board of Governors of the IEEE’s Components, Packaging, and Manufacturing Technology (CPMT) Society. He has received international recognition for his research, development, and teaching activities, including ISHM’s Technical Achievement Award (1987), selection as Maryland’s Distinguished Young Engineer (1989), The Johns Hopkins University’s Outstanding Teaching Award (1992), the CPMT Board of Governors’ Outstanding Service Award (1992), ISHM’s Distinguished Service Award (1994), the IMAPS Daniel C. Hughes Memorial Award (1998), and numerous awards for best papers.

William Barnhart will obtain a B.S.E.E. Degree from Virginia Polytechnic Institute in May 2000 and has been accepted in their Masters program. He will be performing research in the area of RF integrated circuit design. He studied MCM packaging and yield in relationship to KGD probability at The Johns Hopkins University Applied Physics Laboratory, Laurel, Maryland, in 1999. In addition to his studies, Mr. Barnhart works part-time in the Virginia Polytechnic Institute Microelectronics Packaging Laboratory. He is also an officer in the student chapters of both IMAPS and the IEEE.

Jennifer Van Rij received an AS Degree in Mechanical Engineering from Salt Lake Community College in 1977, a B.S. Degree in ME from the University of Utah in 1999, and will obtain an M.S. Degree in Mechanical Engineering from the University of Utah in June 2000. Her work experience includes detonation of old and failed missile motors at Alliant Techsystems in 1997, MCM package modeling and KGD analysis at The Johns Hopkins University Applied Physics Laboratory in 1998, and design and modification of cooled turbine blades/vanes at Pratt and Whitney in 1999. Ms. Van Rij's graduate research involves roughness of gas turbine engine surfaces. Other interests include running marathons, snowboarding, kayaking, backpacking, bicycling, and rock climbing.

Jeannie Petek will obtain a B.S. Degree in Electrical and Computer Engineering from the Ohio State University in June 2000. She will be an Optics Engineer for Evans & Sutherland, Salt Lake City, Utah. Her work experience includes electro-optics testing at DCS Corporation, Alexandria, Virginia, in 1998, and MCM packaging and KGD probability analysis at The Johns Hopkins University Applied Physics Laboratory, Laurel, MD, in 1997. Ms. Petek hopes to continue her education to include a Masters Degree in Optics at the University of Utah.
