
A Uniform Optimization Technique for Offset Assignment Problems

    Rainer Leupers, Fabian David

University of Dortmund, Dept. of Computer Science 12, 44221 Dortmund, Germany

    email: [email protected]

Abstract: A number of different algorithms for optimized offset assignment in DSP code generation have been developed recently. These algorithms aim at constructing a layout of local variables in memory, such that the addresses of variables can be computed efficiently in most cases. This is achieved by maximizing the use of auto-increment operations on address registers. However, the algorithms published in previous work only consider special cases of offset assignment problems, characterized by fixed parameters such as register file sizes and auto-increment ranges. In contrast, this paper presents a genetic optimization technique capable of simultaneously handling arbitrary register file sizes and auto-increment ranges. Moreover, this technique is the first that integrates the allocation of modify registers into offset assignment. Experimental evaluation indicates a significant improvement in the quality of constructed offset assignments, as compared to previous work.

    1 Introduction

One architectural feature typical for DSPs is the special hardware for memory address computation. DSPs are equipped with address generation units (AGUs), capable of performing indirect address computations in parallel to the execution of other machine instructions. In fact, the AGU architectures of many DSPs, such as the TI C25/C50/C80, the Motorola 56k family, the Analog Devices ADSP-210x, or the AMS Gepard DSP core, are very similar and, for support of indirect addressing, differ mainly in the following parameters:

• The number k of address registers (ARs). ARs store the effective addresses of variables in memory and can be updated by load and modify (i.e., adding or subtracting a constant) operations.

• The number m of modify registers (MRs). MRs can be loaded with constants and are generally used to store frequently required AR modify values.

• The auto-increment range l.

Further differences in the detailed AGU architectures of DSPs are whether MR values are interpreted as signed or unsigned numbers, and whether ARs and MRs are orthogonal, i.e., whether each MR can be used to modify each AR.

Typically, such AGUs permit executing two types of address arithmetic operations in parallel to other instructions:

    1. Auto-increment operations of the form

    AR[i] += d or

    AR[i] -= d,

where AR[i] is an AR, and d ≤ l is a constant less than or equal to the auto-increment range.

    2. Auto-modify operations of the form

    AR[i] += MR[j] or

    AR[i] -= MR[j],

    where AR[i] is an AR, and MR[j] is an MR.

These two operations can be executed without any overhead in code size or speed. In contrast, loading an AR or MR, or modifying an AR by a constant larger than l, always requires one extra machine instruction. Therefore, any high-level language compiler for DSPs should aim at maximizing the use of auto-increment and auto-modify operations for address computations, so as to maximize code speed and density. This is extremely urgent in embedded DSP systems with real-time constraints and limited silicon area for program memory.
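As an illustrative sketch (not from the paper), this cost model can be captured in a few lines of Python. The names `update_cost` and `sequence_cost` are hypothetical, and the instructions needed to load the MRs themselves are deliberately ignored here:

```python
def update_cost(delta, l=1, mr_values=()):
    """Extra-instruction cost of one AR update by `delta` (assumed model)."""
    if abs(delta) <= l:          # covered by a free auto-increment
        return 0
    if abs(delta) in mr_values:  # covered by a free auto-modify
        return 0
    return 1                     # needs one extra AR-modify instruction

def sequence_cost(deltas, l=1, mr_values=()):
    """Total extra instructions for a sequence of address deltas."""
    return sum(update_cost(d, l, mr_values) for d in deltas)
```

For example, with l = 1 the delta sequence (1, -1, 3) costs one extra instruction, which disappears if an MR holds the value 3.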

One method to minimize the code needed for address computations is to perform offset assignment of local variables. Let v be a variable accessed at some point of time in a program, and let variable w be the immediate successor of v in the variable access sequence. Whether or not the address of w can be computed from the address of variable v by auto-increment obviously depends on the absolute difference of their addresses. Since the compiler determines the addresses ("offsets") of local variables in the stack frame of a function, the offsets can be chosen in such a way that the offset distance of v and w is less than or equal to l, so that auto-increment is applicable. A more precise description of

offset assignment is given in section 2. Alternatively, an MR containing the offset distance may be exploited to implement the address computation by auto-modify. Examples are given in section 4.

Since offset assignment requires detailed knowledge about the variable access sequences in a program, this optimization is typically executed as a separate compiler phase subsequent to code generation.

The purpose of this paper is to present a novel technique for solving offset assignment problems, which is more general than previous work and, as a consequence, yields better solutions. It is complementary to techniques that optimize address computations for a fixed memory layout, with application to global variables or array elements [1, 2, 3, 4]. The exact contributions of this paper are the following:

• In section 3 we present a genetic algorithm (GA) formulation for solving the offset assignment problem given an arbitrary number k of ARs and a fixed auto-increment range l = 1.

• In section 4, we extend the GA to incorporate an arbitrary number m of MRs into offset assignment. Utilization of MRs, which is treated separately from offset assignment in previous work, has a large impact on the quality of the constructed offset assignments.

• In section 5, we further generalize the GA, so as to optimize offset assignments for an arbitrary auto-increment range l.

In all three sections we provide experimental results that indicate the improvement achieved over previous techniques. Finally, section 6 gives conclusions.

2 Simple and general offset assignment

Bartley's algorithm [5] was the first that solved the simple offset assignment problem (SOA), which can be specified as follows.

Given an AGU with a single AR and an auto-increment range of 1, a set of variables V = {v_1, ..., v_n} and an access sequence S = (s_1, ..., s_m), with each s_i being a member of V, compute a bijective offset mapping F : V → {1, ..., n}, such that the following cost function is minimized:

C(S) = Σ_{i=1}^{m-1} d(i)

d(i) = 1, if |F(s_i) - F(s_{i+1})| > 1
d(i) = 0, else
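The SOA cost function translates directly into code. The following Python sketch (the names are ours, not the paper's) evaluates a candidate layout F on an access sequence S:

```python
def soa_cost(S, F):
    """Cost C(S): number of access pairs whose offset distance exceeds 1,
    i.e., address computations that need an extra instruction."""
    return sum(1 for a, b in zip(S, S[1:]) if abs(F[a] - F[b]) > 1)
```

With the layout F = {a: 1, b: 2, c: 3}, the sequence (a, b, c, b) costs nothing, while (a, c, a) costs two extra instructions.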

Bartley proposed a graph-based heuristic to compute F based on information extracted from S, and showed

that using SOA as a compiler optimization (instead of a straightforward offset assignment that neglects S) indeed gives a significant improvement in code quality.

Liao [6] proposed an alternative SOA algorithm, proved the NP-hardness of SOA, and provided a heuristic algorithm for the general offset assignment problem (GOA). GOA is the generalization of SOA towards an arbitrary number k of ARs. Liao pointed out that GOA can be solved by appropriately partitioning V into k subsets, thereby reducing GOA to k separate SOA problems. In [7, 8] improved SOA algorithms are described, while in [9] an improved variable partitioning heuristic for GOA has been given. However, these techniques only work for a fixed auto-increment range l = 1 and do not exploit modify registers.
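Liao's reduction can be sketched as follows: given a partition of V (one subset per AR), the GOA cost is the sum of the SOA costs of the access subsequences restricted to each subset. For illustration only, we pair it with a brute-force SOA solver (hypothetical names, feasible only for tiny instances):

```python
from itertools import permutations

def soa_opt(S):
    """Exact SOA cost of sequence S by exhaustive layout search."""
    best = float("inf")
    for perm in permutations(sorted(set(S))):
        F = {v: i for i, v in enumerate(perm)}
        best = min(best, sum(abs(F[a] - F[b]) > 1 for a, b in zip(S, S[1:])))
    return best

def goa_cost(S, partition):
    """Sum of SOA costs over the subsequences restricted to each subset."""
    return sum(soa_opt([s for s in S if s in set(sub)]) for sub in partition)
```

For S = (a, b, c, a) a single AR cannot avoid one costly computation (a, b, and c would all need to be adjacent), but splitting b onto a second AR removes it.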

    3 Genetic optimization for GOA

Genetic algorithms (GAs) are a well-tried optimization technique imitating natural evolution to achieve good solutions (cf. [10] for an overview). GAs are particularly well-suited for nonlinear optimization problems, since they are capable of skipping local extrema in the objective ("fitness") function, and thus in general come close to optimal solutions. Among many other applications, GAs have been successfully applied to data path synthesis [11].

We have chosen GAs for solving offset assignment problems mainly for two reasons: First, GAs are more robust than heuristics. If enough computation time is invested, a GA most likely approximates a global optimum, while heuristics in many cases are trapped in a local optimum. Since very high compilation speed is not important for DSP compilers, GAs are more promising than heuristics, as long as a reasonable amount of time is not exceeded. Second, since offset assignment essentially demands computing a good permutation of variables w.r.t. simple cost functions, offset assignment has a straightforward encoding as a GA.

In this section, we present our solution approach to the GOA problem in the form of a GA. The same formulation is also used for the extensions described in sections 4 and 5, where only the fitness function is adapted, so as to solve generalized offset assignment problems.

    3.1 Chromosomal representation

The chromosomal representation in our approach closely reflects the offset mapping F defined in section 2. Each variable in V is represented by its unique index in {1, ..., n}. Each gene in the chromosome represents one variable, and its position in the chromosome accounts for its offset. This representation can be immediately used for solving the SOA problem, since SOA essentially demands a certain permutation of variables.

In order to solve the GOA problem, it must additionally be decided which of the k available ARs will be used for accessing a certain variable. We accommodate

[Figure 1 shows the example V = {v1, v2, v3, v4, v5, v6}, S = (v2, v4, v1, v2, v3, v6, v1, v5, v4, v1), the SOA representation (k = 1), the GOA representation (k = 2) with a separator gene, the resulting offset assignments, and the corresponding AGU operations.]

Figure 1: Chromosomal representation for SOA and GOA

this by using a modified chromosomal representation: In addition to the variable indices {1, ..., n}, the genes may assume a value in {n+1, ..., n+k-1}. In any step of the GA, each chromosome represents one permutation of the values {1, ..., n+k-1}. Indices larger than n serve as "separator symbols" in the chromosome. Starting with AR number 1, each separator represents a transition to the next higher address register, which is then used for all variables following in the chromosome before the next separator (fig. 1). Since there are k-1 separators, one obtains an offset assignment for k ARs. The chromosomal representation is exemplified in fig. 1.
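A minimal sketch of this decoding step, with hypothetical names, might look like:

```python
def decode(chromosome, n):
    """Split a permutation of {1, ..., n+k-1} into per-AR offset lists.
    Genes > n act as separators; position within a list is the local offset."""
    partitions, current = [], []
    for gene in chromosome:
        if gene > n:                 # separator: switch to the next AR
            partitions.append(current)
            current = []
        else:
            current.append(gene)     # variable index gets the next offset
    partitions.append(current)
    return partitions
```

For n = 6 variables and k = 2 ARs, the separator symbol is 7, so the chromosome (2, 5, 3, 7, 1, 6, 4) assigns v2, v5, v3 to AR1 and v1, v6, v4 to AR2.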

    3.2 Mutation

Since any chromosome must represent a permutation of {1, ..., n+k-1}, mutation operators have to be permutation preserving, i.e., they must only generate new permutations of {1, ..., n+k-1}. This can be achieved by using transpositions for mutation of chromosomes. A transposition denotes the exchange of the contents of two genes in a chromosome. The positions of the two genes are randomly chosen. A transposition can either modify the offsets of two variables or change the AR assignment in GOA. Since any permutation can be composed by a series of transpositions, all GOA solutions, in particular the optimal one, can be reached by using only transpositions as mutation operators.
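A transposition mutation under this representation is a plain swap of two gene positions; a sketch (assumed helper, not the authors' code):

```python
import random

def mutate(chromosome, rng=random):
    """Permutation-preserving mutation: transpose two randomly chosen genes."""
    c = list(chromosome)
    i, j = rng.sample(range(len(c)), 2)  # two distinct positions
    c[i], c[j] = c[j], c[i]
    return c
```

The result is always another permutation of the same gene set, differing from its parent in exactly two positions.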

    3.3 Crossover

In order to accelerate the convergence of the GA, crossover operations are also applied to generate new individuals in the GA population. Just like mutations, the crossover operator has to be permutation preserving, so as to obtain only valid solutions. In our approach, we use the standard order crossover operation (fig. 2), which generates two offspring individuals from two parent individuals as follows:

1. Randomly choose two gene positions in the parents' chromosomes A and B. These positions induce a three-partitioning A = (A1, A2, A3) and B = (B1, B2, B3).

[Figure 2 illustrates the four steps on two example parent chromosomes A and B.]

Figure 2: Permutation preserving order crossover

2. Mark the genes occurring in A2 in B, and the genes occurring in B2 in A.

3. Generate a new chromosome A′ from A: Starting with the leftmost position of interval A2, in left-to-right order, write the non-marked genes of A into A1 and A3, while leaving interval A2 as a gap. Generate a new chromosome B′ from B analogously.

4. Copy the center interval from B to A′, and copy the center interval from A to B′.
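The steps above correspond to a common formulation of order crossover. The following hedged sketch produces one offspring for given cut points (the second offspring is obtained by swapping the parents); the exact fill order of the paper's variant may differ slightly:

```python
def order_crossover(A, B, cut1, cut2):
    """One offspring A' of permutation-preserving order crossover."""
    center = B[cut1:cut2]                      # step 4: interval taken from B
    marked = set(center)                       # step 2: mark B2's genes in A
    keep = [g for g in A if g not in marked]   # step 3: non-marked genes of A
    return keep[:cut1] + center + keep[cut1:]  # fill A1 and A3 around the gap
```

Because the offspring consists of B's center interval plus exactly the remaining genes of A, it is always a valid permutation.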

    3.4 Fitness function

The fitness function Z is a metric for the quality of an individual I in the population. For a given variable access sequence S = (s_1, ..., s_m), function Z decodes a chromosome (w.r.t. offset and AR assignment) and counts the number of address computations that can be implemented either by switching the AR index or by auto-increment. Like auto-increment, switching the AR index does not require an extra machine instruction. Thus, higher fitness corresponds to a cheaper offset assignment. Each individual I induces a partitioning of the variable set V into disjoint subsets V_1, ..., V_k, as well as k "local" offset functions F_1, ..., F_k. Each variable v in a subset V_i is addressed by AR number i and has a global offset of

F(v) = |V_1| + ... + |V_{i-1}| + F_i(v)

In summary, the fitness function is defined as

Z(I) = Σ_{i=1}^{m-1} d(i)

with

d(i) = 1, if |F(s_i) - F(s_{i+1})| ≤ 1 and s_i and s_{i+1} are in the same variable subset V_j ∈ {V_1, ..., V_k}
d(i) = 1, if s_i and s_{i+1} are in different variable subsets
d(i) = 0, else
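This fitness definition can be sketched as follows; `subset_of` maps each variable to its AR subset and `F` to its global offset (hypothetical names). The increment bound is kept as a parameter l, defaulting to 1, since section 5 generalizes exactly this check:

```python
def fitness(S, subset_of, F, l=1):
    """Count free address computations: AR switches and auto-increments."""
    z = 0
    for a, b in zip(S, S[1:]):
        if subset_of[a] != subset_of[b]:
            z += 1   # different ARs: switching the AR index is free
        elif abs(F[a] - F[b]) <= l:
            z += 1   # same AR, offset distance <= l: free auto-increment
    return z
```

For S = (a, b, c, a) with a, b on AR1 at offsets 1, 2 and c on AR2, all three access pairs are free, giving Z = 3.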

3.5 GA and parameters

The initial population consists of a set of randomly generated permutations, where exactly one individual is computed by a GOA heuristic [9], so as to provide a "seed" to the GA, which accelerates the convergence towards the optimum. In each iteration (generation) of the GA, the fitness of all individuals in the population is evaluated. The "strong" individuals (i.e., a fraction of the total population with highest fitness) are selected for crossover. On a fraction of the offspring generated by crossover, a mutation is performed. Finally, the offspring replace the "weak" individuals in the population. This is iterated until a termination condition is satisfied. After termination, the fittest individual in the final population is emitted as the best solution.

A crucial aspect in the applicability of GAs are the parameters that guide the optimization procedure. Based on extensive experimentation, we have selected the following parameters:

Mutation probability per gene: 1/(n+k-1)
Population size: 30 individuals
Replacement rate: 2/3 of the population size
Termination condition: 5,000 generations, or 2,000 generations without a fitness improvement

For GOA problem sizes of practical relevance (|V| ≤ 90, |S| ≤ 100), we have measured maximum runtimes of 12 CPU seconds on a 300 MHz Pentium II PC. This is about 10 times more than for previously published heuristics. However, as shown in the following, the GA procedure achieves better solutions, so that the additional computation time is justified.

    3.6 Results

We have statistically evaluated the performance of the above genetic optimization, as compared to the GOA heuristic described in [9]. For different problem parameters (k, |S|, and |V|) we have measured the average quality of computed GOA solutions over 100 random variable access sequences¹. The results are summarized in table 1, which gives the average number of extra instructions for address computations for each parameter combination, as well as the percentage of the GA as compared to the heuristic.

The results show that the proposed GA on average outperforms the heuristic by 18%. The generalization described in the next section, however, provides even higher improvements.

    4 Inclusion of modify registers

As explained in section 1, auto-modify operations can also be exploited to avoid code overhead for address computations. Auto-modifies permit implementing some

¹Text files containing all detailed problem instances and computed offset assignments can be obtained from the authors.

k   |S|   |V|   GOA [9]   GA     %
2   50    10    14.3      11.9   83
2   50    20    18.0      16.3   91
2   50    40    10.4      9.8    94
2   100   10    33.0      29.3   89
2   100   50    41.6      41.0   98
2   100   90    11.2      11.0   98
4   50    10    5.7       4.3    75
4   50    20    10.7      7.0    65
4   50    40    9.5       7.1    75
4   100   10    9.8       6.6    67
4   100   50    29.0      27.5   95
4   100   90    9.5       8.2    86
8   50    10    5.0       4.2    84
8   50    20    9.0       5.7    63
8   50    40    11.9      7.1    60
8   100   10    5.0       5.0    100
8   100   50    20.6      15.9   77
8   100   90    12.7      9.0    71
                       average   82

Table 1: Experimental results: Genetic optimization for GOA

address computations in parallel which cannot be covered by auto-increments. This requires that MRs be loaded with appropriate values, i.e., offset differences. While previous GOA algorithms neglect auto-modifies, exploitation of such operations may lead to reduced address computation code.

So far, few approaches have been reported to exploit MRs during offset assignment. Wess [15] considered the AGU architecture of an ADSP-210x, which contains m = 4 MRs. In his technique, these MRs are statically loaded with the values {-2, -1, 1, 2}. Conceptually, this approach is equivalent to an increase in the auto-increment range l, which will be discussed in section 5.

In [9], we have argued that dynamic loading of MRs yields a higher optimization potential, since a larger search space is explored. It has been shown that an extension of a page replacement algorithm (PRA) for operating systems [12] can be applied to determine the best values of m MRs at each point of time. The main idea of the PRA is that a new offset difference d should be loaded into an MR currently storing a value d′ ≠ d whenever the next use of d in the remaining sequence of address computations precedes the next use of d′. The effect of the PRA, as compared to a pure GOA solution, is exemplified in fig. 3 a) and b). The algorithm is not affected by orthogonality or signedness of MRs (cf. section 1). In fact, the PRA yields optimal results, but needs to be applied as a post-pass optimization after solving GOA. Thus, its potential optimization effect is already reduced by the computed offset assignment.
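This farthest-next-use policy is essentially Belady's MIN replacement algorithm applied to MR contents. A hedged sketch (our names; it counts only MR load instructions, treating values already held in an MR as free auto-modifies):

```python
def mr_loads(diffs, m):
    """Number of MR loads for the required offset-difference sequence `diffs`,
    with m modify registers and farthest-next-use (Belady) eviction."""
    mrs, loads = set(), 0
    for i, d in enumerate(diffs):
        if d in mrs:
            continue                 # value already in an MR: free auto-modify
        loads += 1                   # one instruction to load the MR
        if len(mrs) < m:
            mrs.add(d)
        else:
            future = diffs[i + 1:]
            # evict the value whose next use lies farthest ahead (or never)
            victim = max(mrs, key=lambda v: future.index(v) if v in future
                         else len(future) + 1)
            mrs.remove(victim)
            mrs.add(d)
    return loads
```

For the difference sequence (3, 2, 3, 2, 4, 3) with m = 2, the policy loads 3, 2, then evicts 2 (never used again) for 4, giving three loads in total.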

Considering auto-modifies during offset assignment gives much higher opportunities for optimization. As illustrated in fig. 3 c), better and completely different memory layouts than with the pure GOA approach may

[Figure 3 shows, for V = {v0, v1, v2, v3, v4} and S = (v0, v4, v1, v3, v2, v0, v2, v0, v3, v0, v1, v2, v2, v2, v4, v0, v4, v0, v0, v3), the offset assignments and AGU operation sequences of the three variants a), b), and c).]

Figure 3: Example for exploitation of modify registers (k = 1, m = 1); "*" denotes address computations not implemented by auto-increment or auto-modify: a) GOA solution, b) post-pass utilization of auto-modifies with PRA, c) integrated GOA and MR exploitation (optimality can be shown for this example)

be computed when simultaneously considering auto-modifies.

We can exploit this additional optimization potential in the GA technique from section 3 by integrating the PRA into the fitness function Z. Using an appropriate implementation, the PRA takes only linear time in |S| and thus hardly increases the runtime required to compute Z(S), which also takes linear time. The integration works as follows: The PRA is executed on each offset assignment represented by an individual I in the population. It returns the number M(I) of address computations that can be saved (besides application of auto-increment) by optimal exploitation of auto-modifies for the represented offset assignment. The fitness function from section 3 is replaced by

Z(I) = Σ_{i=1}^{m-1} d(i) + M(I)

This approach uses the PRA no longer as a post-pass procedure, but as an additional means to measure the quality of an offset assignment during the GA. Therefore, the potential optimization due to MRs influences the computed offset assignment itself.

We have compared the performance of the GA using the new fitness function to a "conventional" approach that uses the PRA only as a post-pass optimization after GOA. The results are shown in table 2. The parameters are like in table 1. The number m of MRs (unsigned

k = m   |S|   |V|   post-pass   integrated   %
2       50    10    5.9         3.6          61
2       50    20    11.4        6.5          57
2       50    40    9.1         6.9          67
2       100   10    12.6        4.0          32
2       100   50    38.6        28.7         74
2       100   90    11.0        9.5          86
4       50    10    5.0         4.0          80
4       50    20    7.6         5.1          67
4       50    40    9.0         6.2          69
4       100   10    5.2         4.1          79
4       100   50    17.3        9.3          54
4       100   90    8.7         7.2          83
8       50    10    5.0         4.0          80
8       50    20    8.9         5.2          58
8       50    40    11.7        7.3          62
8       100   10    5.0         4.1          82
8       100   50    15.0        9.9          66
8       100   90    12.2        8.6          70
                           average           68

Table 2: Experimental results: GOA including auto-modify optimization

values assumed) has been set equal to k. The GA based technique yields better results than the previous technique for all parameter combinations, with an average reduction of costly address computations of 32%.

5 Inclusion of larger auto-increment ranges

So far, we have considered a fixed auto-increment range (AIR) of l = 1, which is, for instance, given in the TI 'C25 and the Motorola 56000 DSPs. Other machines, such as the TI 'C80 or the AMS Gepard, have an AIR of l = 7, i.e., much larger offset differences can be covered by auto-increment operations.

An optimization technique for AGUs with an arbitrary AIR l ≥ 1 has been given in [13]. It is an extension of the graph-based heuristic reported in [6], and actually considers the AIR only when solving the SOA problem, i.e., after the variables have been assigned to ARs. Therefore, its optimization effect for GOA is presumably very limited. Furthermore, the potential use of MRs is neglected.

As already mentioned in section 4, Wess' technique [15] uses static MR values to achieve an AIR of l = 2. The offset assignment is computed by simulated annealing. Similarly, the technique in [14] also only works for a fixed AIR of l = 2.

Our GA approach permits accommodating arbitrary l values during GOA by a minor modification of the fitness function. Instead of checking |F(s_i) - F(s_{i+1})| ≤ 1 during computation of the d(i) values from section 3.4, we check whether |F(s_i) - F(s_{i+1})| ≤ l.

(|V|, |S|)   k   Wess (l=2, m=0)   GA (l=2, m=0)   %     GA (l=0, m=4)   %
(15,20)      1   1.8               2.2             122   0.9             50
(10,50)      1   11.9              14.1            118   10.2            86
(10,50)      2   1.8               1.2             67    1.0             56
(25,50)      2   6.7               7.6             113   6.7             100
(40,50)      4   3.0               1.1             37    0.8             27
(50,100)     4   15.6              17.4            112   13.9            89
                                       average     95                    68

Table 3: Experimental results: Generalized GA

Thereby, an address computation is only considered "costly" if the offset distance exceeds l.

Since this extension is completely independent of the MR utilization technique from section 4, our GA technique actually considers the trade-off between auto-increments and auto-modifies (with dynamic MR reloading) when optimizing the offset assignment. As compared to an approach with static MR contents, this provides better solutions.

For experimental evaluation, we have compared our generalized GA technique to Wess' approach. The parameter combinations and results are taken from [15], while the AGU configuration corresponds to that of an ADSP-210x DSP. Table 3 summarizes our results.

Column 3 gives the results (number of extra instructions for address computations) achieved by Wess' technique. Since in that approach the 4 available MRs are statically loaded with the values -2, -1, 1, 2, no further MRs are left (m = 0), and the effective AIR is l = 2. Column 4 shows the GA results for the same configuration, i.e., l = 2, m = 0, while column 5 shows the percentage. The results are of similar quality on average. This is no surprise, since the simulated annealing technique is known to produce results of a quality comparable to GAs.

However, as indicated by the data in column 6 (percentage in column 7), the GA outperforms Wess' technique when the restriction of static MR contents is removed, so that all 4 MRs become available for dynamic reloading. Since the ADSP-210x permits no auto-increments without using MRs, we have set l = 0. As can be seen, the larger flexibility in most cases results in better solutions, since the GA itself decides whether static or dynamic values of MRs are preferable for a certain instance of the offset assignment problem. On average, the amount of extra instructions for address computations is reduced to 68%.

    6 Conclusions

Offset assignment is a relatively new code optimization technique for DSPs, which exploits the special address generation hardware in such processors. We have presented a novel GA based approach to solve offset assignment problems for different AGU configurations of DSPs. One main advantage of the proposed technique is that the functionalities of a number of previous techniques are covered by a uniform optimization procedure, which is only adjusted by parameters in the fitness function. In addition, as a consequence of its generality, the GA yields significantly better offset assignments than previous algorithms. Practical applicability in compilers is ensured, because the GA is sufficiently fast and comparatively easy to implement. We are currently investigating further generalizations of the GA technique, such as incorporation of variable live range information and offset assignment in presence of multiple AGUs.

    References

[1] G. Araujo, A. Sudarsanam, S. Malik: Instruction Set Design and Optimizations for Address Computation in DSP Architectures, 9th Int. Symp. on System Synthesis (ISSS), 1996

[2] C. Liem, P. Paulin, A. Jerraya: Address Calculation for Retargetable Compilation and Exploration of Instruction-Set Architectures, 33rd Design Automation Conference (DAC), 1996

[3] C. Gebotys: DSP Address Optimization Using a Minimum Cost Circulation Technique, Int. Conf. on Computer-Aided Design (ICCAD), 1997

[4] R. Leupers, A. Basu, P. Marwedel: Optimized Array Index Computation in DSP Programs, Asia South Pacific Design Automation Conference (ASP-DAC), 1998

[5] D.H. Bartley: Optimizing Stack Frame Accesses for Processors with Restricted Addressing Modes, Software - Practice and Experience, vol. 22(2), 1992, pp. 101-110

[6] S. Liao, S. Devadas, K. Keutzer, S. Tjiang, A. Wang: Storage Assignment to Decrease Code Size, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 1995

[7] N. Sugino, H. Miyazaki, S. Iimure, A. Nishihara: Improved Code Optimization Method Utilizing Memory Addressing Operation and its Application to DSP Compiler, Int. Symp. on Circuits and Systems (ISCAS), 1996

[8] B. Wess, M. Gotschlich: Constructing Memory Layouts for Address Generation Units Supporting Offset ±2 Access, Proc. ICASSP, 1997

[9] R. Leupers, P. Marwedel: Algorithms for Address Assignment in DSP Code Generation, Int. Conf. on Computer-Aided Design (ICCAD), 1996

[10] L. Davis: Handbook of Genetic Algorithms, Van Nostrand Reinhold, 1991

[11] B. Landwehr, P. Marwedel: A New Optimization Technique for Improving Resource Exploitation and Critical Path Minimization, 10th Int. Symp. on System Synthesis (ISSS), 1997

[12] L.A. Belady: A Study of Replacement Algorithms for a Virtual-Storage Computer, IBM Systems Journal, vol. 5(2), 1966, pp. 78-101

[13] A. Sudarsanam, S. Liao, S. Devadas: Analysis and Evaluation of Address Arithmetic Capabilities in Custom DSP Architectures, Design Automation Conference (DAC), 1997

[14] N. Kogure, N. Sugino, A. Nishihara: Memory Address Allocation Method for a DSP with ±2 Update Operations in Indirect Addressing, Proc. ECCTD, Budapest, 1997

[15] B. Wess, M. Gotschlich: Optimal DSP Memory Layout Generation as a Quadratic Assignment Problem, Int. Symp. on Circuits and Systems (ISCAS), 1997

ISSS'98