RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR

RCIM SEMINAR SERIES, February 3rd, 2006

Slide 1

Profiling Techniques for FPGA-Based Hardware-Software CoDesign

Presented by: Jason G. Tong
Supervisor: Dr. M.A.S. Khalid

Date: February 3rd, 2006

University of Windsor
Department of Electrical and Computer Engineering

Slide 2

OUTLINE

• Introduction
  - What is CoDesign?
  - The CoDesign Methodology

• Profiling Tools
  - Instruction-Level Profiling (gprof)
  - Profiling with Hardware Counters
  - SnoopP (FPGA-Based)

Slide 3

Introduction – What is CoDesign?

• Hardware/Software CoDesign is an area of research that first surfaced in the early 1990s

• CoDesign is used in embedded systems design

• It is the design of systems composed of hardware and software components that execute concurrently and cooperatively

• The main difference between embedded systems and personal computers: an embedded system interfaces with other components, while a personal computer interfaces with a human (end user)

Slide 4

Introduction – What is CoDesign?

• Examples of embedded systems: MP3 players, microwave ovens, cell phones, digital cameras, etc.

Slide 5

Introduction – What is CoDesign?

• The first CODES/CASHE workshop took place in 1992

• It brought insight into the hardware/software partitioning of components, which is the very first step in the design process of embedded systems

• Researchers are focused on new CoDesign methodologies and tools for designing embedded systems

• The goal is to find a partitioning solution that optimizes the design of the embedded system

Slide 6

Introduction – What is CoDesign?

• Traditional Method: Design the hardware and software components separately. This introduces compatibility issues between the components and may force a complete redesign, wasting considerable time and money.

• CoDesign Method: Design the hardware and software components concurrently. This minimizes compatibility conflicts between the components and reduces development time and cost.

Figure 1: CoDesign and Traditional Methods

Courtesy of R. Ernst, “Codesign of Embedded Systems: Status and Trends”, In Proceedings of IEEE Design and Test, pp. 45-54, April-June 1998

Slide 7

Introduction – What is CoDesign?

• Hardware Functions – Functions implemented in dedicated hardware (ASIC) or on a reconfigurable fabric (FPGA)

• Software Functions – Functions implemented in software that run on a general-purpose processor (CPU)

Figure 2: CPU, ASIC and Memory coupled on the bus

Courtesy of W. Wolf, “A Decade of Hardware/Software Codesign”, In IEEE 5th International Symposium on Multimedia Software Engineering (MSE2003), pp. 38-43, December 2003

Slide 8

Introduction – What is CoDesign?

• For certain applications (e.g., DSP functions such as FFT and DFT, multipliers, adders, etc.), ASICs can now be replaced by a reconfigurable fabric such as a Field Programmable Gate Array (FPGA)

• FPGAs are widely used because of their reconfigurability and flexibility

• The problem is deciding which functions should be implemented in the hardware domain and which in the software domain so that the system gives the best performance at the least cost
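
The partitioning problem stated above can be illustrated with a small exhaustive search. This is a minimal sketch, not a method taken from the talk: the function names, cycle counts, area costs, and the area budget are all hypothetical, and a real partitioner would also have to account for communication overhead between the two domains.

```python
from itertools import combinations

# Hypothetical per-function estimates:
# name -> (software cycles, hardware cycles, hardware area units)
FUNCTIONS = {
    "fft":      (9000, 900, 40),
    "fir":      (5000, 1000, 25),
    "control":  (1000, 800, 30),
    "checksum": (2000, 500, 20),
}

def total_cycles(hw_set):
    """Total cycles when the functions in hw_set run in hardware, the rest in software."""
    return sum(hw if name in hw_set else sw
               for name, (sw, hw, _area) in FUNCTIONS.items())

def total_area(hw_set):
    """Hardware area consumed by the functions placed in hardware."""
    return sum(FUNCTIONS[name][2] for name in hw_set)

def best_partition(area_budget):
    """Try every subset of functions as the hardware partition; keep the fastest that fits."""
    names = sorted(FUNCTIONS)
    best = (total_cycles(frozenset()), frozenset())  # all-software baseline
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            hw_set = frozenset(subset)
            if total_area(hw_set) > area_budget:
                continue  # does not fit on the (hypothetical) device
            cycles = total_cycles(hw_set)
            if cycles < best[0]:
                best = (cycles, hw_set)
    return best

cycles, hw = best_partition(area_budget=70)
print("hardware:", sorted(hw), "total cycles:", cycles)
```

Exhaustive search is only practical for a handful of functions, since the number of candidate partitions doubles with every function added.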

Slide 9

Introduction – The CoDesign Methodology

• Specification: The embedded system is described, either in general terms or in detail

• Partitioning: Dividing the components into the hardware and software domains and defining the interfaces between them

• HW/SW and Interface Synthesis: Building each component in its assigned domain

• Validation: Verifying that the system operates properly

• It may require many iterations to obtain the proper design

Figure 3: CoDesign Methodology

Courtesy of C. J. N. Coelo Jr., D. C. Da Silva Jr., and A. O. Fernandes, “Hardware-Software Codesign of Embedded Systems”, In Proceedings of the 11th Brazilian Symposium on Integrated Circuit Design, pp. 2-8, January 1998

Slide 10

Hardware / Software Partitioning

Figure 4: Partitioning Stage of a Software Specification

Slide 11

Introduction – The CoDesign Methodology

• Hardware/software partitioning is the most difficult step, since the components rely on each other

• Any change in the hardware design can potentially change the software design and vice versa (interdependency amongst the components)

• To obtain a good-quality design, one must find an efficient, optimized hardware/software partition in which the components operate cooperatively and concurrently

• Prior to partitioning, information about the system’s performance is needed to evaluate each of the components

Slide 12

Introduction – The CoDesign Methodology

• CAD tools available today, called Profiling Tools, help designers measure the performance of the system

• Profiling Tools obtain the performance metrics of the system and help the designer optimize the design or decide whether to implement certain functions in the hardware or the software domain

• These tools differ in the range of performance measurements and analyses they provide

Slide 13

OUTLINE

• Introduction
  - What is CoDesign?
  - The CoDesign Methodology

• Profiling Tools
  - Instruction-Level Profiling (gprof)
  - Profiling with Hardware Counters
  - SnoopP (FPGA-Based)

Slide 14

Profiling Tools

• Main purpose: to measure the performance of a software or hardware system

• Two methods in designing embedded systems:

  - Implement the entire system in hardware, then move components into software

  - Implement the entire system in software, run a software profiler to measure the performance, then move components into hardware
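
The software-first method can be sketched with Python's built-in cProfile module standing in for the software profiler. This is an illustration under stated assumptions, not the flow used in the talk: `fir_filter`, `parity`, and `application` are hypothetical placeholder workloads, and the sketch only shows how measured profile data can rank candidates for migration to hardware.

```python
import cProfile
import pstats

def fir_filter(samples, taps):
    """Hypothetical compute-heavy kernel (a likely hardware candidate)."""
    out = []
    for i in range(len(taps), len(samples)):
        acc = 0
        for j, t in enumerate(taps):
            acc += t * samples[i - j]
        out.append(acc)
    return out

def parity(samples):
    """Hypothetical lightweight function (likely to stay in software)."""
    p = 0
    for s in samples:
        p ^= s & 1
    return p

def application():
    """Hypothetical all-software implementation of the system."""
    samples = list(range(20000))
    fir_filter(samples, taps=[1, 2, 3, 4, 5, 6, 7, 8])
    parity(samples)

def rank_by_cumulative_time(entry_point, names):
    """Profile entry_point and return the named functions, most expensive first."""
    profiler = cProfile.Profile()
    profiler.runcall(entry_point)
    # pstats exposes a dict: (file, line, func) -> (cc, nc, tottime, cumtime, callers)
    stats = pstats.Stats(profiler).stats
    cumulative = {func: entry[3]
                  for (_file, _line, func), entry in stats.items()
                  if func in names}
    return sorted(cumulative, key=cumulative.get, reverse=True)

ranking = rank_by_cumulative_time(application, {"fir_filter", "parity"})
print("Hardware candidates, most expensive first:", ranking)
```

The function at the head of the ranking is the first candidate to move into hardware; the system would then be re-profiled before the next move.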

Slide 15

Profiling Tools

• Profiling tools are essential to solving the problem of hardware/software partitioning

• They give designers the feedback needed to make reasonable decisions about which components should be placed in the hardware or the software domain

• There are automated hardware/software partitioners that rely on this profiling information

Slide 16

Profiling Tools (gprof)

Instruction-level Profilers

• A set of profiling tools that measure the number of clock cycles (CPU time in seconds) spent in the instructions executing on the processor

• “gprof” is an open-source GNU tool designed for Linux and Unix workstations

• It is a profiling tool that measures the performance of software code written in C or C++

• Technique of analysis: It samples the program counter at a specified interval (frequency) and increments a counter that tracks the amount of time spent executing the function containing the sampled address
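
The sampling technique can be illustrated with a small simulation. This is not gprof's implementation: the symbol table, the per-cycle program-counter trace, the sampling interval, and the clock frequency are hypothetical values chosen only to show how sampled addresses are attributed to functions.

```python
from collections import Counter

# Hypothetical symbol table: function name -> [start, end) address range
SYMBOLS = {
    "main":   (0x1000, 0x1100),
    "fft":    (0x1100, 0x1400),
    "output": (0x1400, 0x1480),
}

CLOCK_HZ = 100_000_000  # hypothetical 100 MHz processor clock

def function_at(pc):
    """Map a sampled program-counter value to the function that contains it."""
    for name, (start, end) in SYMBOLS.items():
        if start <= pc < end:
            return name
    return "<unknown>"

def sample_profile(pc_trace, interval):
    """Sample a per-cycle PC trace every `interval` cycles.

    Each sample is taken to represent `interval` cycles of execution, so the
    result is an estimate of the cycles spent in each function.
    """
    hits = Counter(function_at(pc) for pc in pc_trace[::interval])
    return {name: count * interval for name, count in hits.items()}

# Hypothetical trace: 100 cycles in main, 800 in fft, 100 in output
trace = [0x1000] * 100 + [0x1200] * 800 + [0x1400] * 100
profile = sample_profile(trace, interval=10)
for name, cycles in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {cycles:5d} cycles  {cycles / CLOCK_HZ:.2e} s")
```

Because the result is built from samples, it is a statistical estimate: code that runs for less than one sampling interval may not be observed at all.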

Slide 17

Profiling Tools – gprof (Flat Profile / Call Graph)

Figure 5: gprof tool profiling software code (panels: Flat Profile, Call Graph)

Slide 18

Profiling Tools – Hardware Counters

- Profiling with “Hardware Counters” counts the number of certain events that occur during program execution

- The hardware counter technique is mostly used by memory profilers to profile memory systems during program execution

- Types of events:

  - Cache misses due to reads and writes from memory

  - Improper allocation and freeing of memory

  - Stalling of the processor during execution

  - Remote memory page touches, i.e., accesses to pages held in another processor’s memory
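
Event counting can be sketched with a toy direct-mapped cache model that increments a counter for each read miss and write miss. Real hardware counters are registers inside the processor; the cache geometry, the write-allocate policy, and the address trace below are hypothetical and serve only as an illustration.

```python
from collections import Counter

LINE_BYTES = 16  # hypothetical cache line size
NUM_LINES = 4    # hypothetical number of lines in a direct-mapped cache

def count_events(accesses):
    """Count hits and misses for a trace of (op, address), op being 'R' or 'W'."""
    tags = [None] * NUM_LINES  # tag currently stored in each cache line
    counters = Counter()       # stands in for the hardware event counters
    for op, addr in accesses:
        block = addr // LINE_BYTES
        index, tag = block % NUM_LINES, block // NUM_LINES
        if tags[index] == tag:
            counters["hit"] += 1
        else:
            counters["read_miss" if op == "R" else "write_miss"] += 1
            tags[index] = tag  # fill the line on a miss (write-allocate assumed)
    return dict(counters)

# Hypothetical trace: addresses 0x00 and 0x40 map to the same cache line
trace = [("R", 0x00), ("R", 0x04), ("W", 0x40), ("R", 0x00), ("W", 0x44)]
events = count_events(trace)
print(events)
```

A high miss count for two addresses that keep evicting each other, as in this trace, is the kind of feedback a memory profiler reports back to the designer.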

Slide 19

Profiling Tools – Page Migration Approach

• Uses hardware counters to count remote memory page touches, i.e., accesses by one processor to pages held in another processor’s on-board memory

• Utilizes the SunFire hardware counters to profile the application code. At fixed intervals, once a number of remote memory page touches has occurred, the profiler halts the execution of threads on the processors.

• The profiler then takes a page of memory that a remote processor keeps accessing and “migrates” that page to that processor.
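
The migration decision can be sketched as follows. This is a simplified illustration, not the profiler of Tikir and Hollingsworth: the access log, the page-to-processor map, and the migration threshold are hypothetical, and the sketch ignores the cost of the migration itself.

```python
from collections import defaultdict

def migrate_pages(page_home, access_log, threshold):
    """Move each page to the remote processor that touched it most often.

    page_home:  dict mapping page -> processor whose memory holds the page
    access_log: iterable of (processor, page) touches seen during one interval
    threshold:  minimum number of remote touches before a page is migrated
    """
    remote = defaultdict(lambda: defaultdict(int))  # page -> processor -> count
    for cpu, page in access_log:
        if page_home[page] != cpu:
            remote[page][cpu] += 1  # plays the role of the hardware counter
    new_home = dict(page_home)
    for page, counts in remote.items():
        cpu, touches = max(counts.items(), key=lambda kv: kv[1])
        if touches >= threshold:
            new_home[page] = cpu  # "migrate" the page to that processor
    return new_home

# Hypothetical interval: processor 1 keeps touching p0, processor 0 keeps touching p2
home = {"p0": 0, "p1": 0, "p2": 1}
log = [(1, "p0")] * 5 + [(0, "p0")] * 2 + [(1, "p1")] + [(0, "p2")] * 3
new_home = migrate_pages(home, log, threshold=3)
print(new_home)
```

Page p1 stays where it is because a single remote touch is below the threshold, which keeps occasional remote accesses from triggering a migration.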

Slide 20

Profiling Tools – Page Migration Approach

Figure 5: Page Migration Approach

Courtesy of M. M. Tikir and J. K. Hollingsworth, “Using Hardware Counters to Automatically Improve Memory Performance”, In Proceedings of the ACM/IEEE conference on Supercomputing, pp. 46-58, November 2004

Slide 21

Profiling Tools – SnoopP

SnoopP (L. Shannon et al.)

• An accurate clock-cycle counting profiler that measures the performance of each instruction of a program

• A profiling architecture that is non-intrusive to the software: it counts clock cycles based on the system clock and compares the program counter against user-specified address ranges to select the code being analyzed

• The on-chip profiler had no impact on the actual execution of the software itself

• Its results were compared against gprof, and the feedback it produced differed considerably from gprof’s in the clock-cycle counts of each instruction and function
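
The counting scheme described above can be modeled in software. SnoopP itself is a hardware circuit on the FPGA; this sketch only imitates its behavior, and the class name, the address ranges (treated here as inclusive), and the per-cycle program-counter trace are hypothetical.

```python
class RangeCounter:
    """One profiling counter: counts clock cycles while low <= PC <= high."""

    def __init__(self, label, low, high):
        self.label, self.low, self.high = label, low, high
        self.cycles = 0

    def clock(self, pc):
        # Models a comparator pair that enables the counter for this cycle.
        if self.low <= pc <= self.high:
            self.cycles += 1

def snoop(pc_per_cycle, counters):
    """Present every clock cycle's PC value to all counters."""
    for pc in pc_per_cycle:
        for counter in counters:
            counter.clock(pc)
    return {c.label: c.cycles for c in counters}

# Hypothetical user-specified address ranges
counters = [
    RangeCounter("init", 0x000, 0x0FF),
    RangeCounter("dct", 0x100, 0x2FF),
    RangeCounter("whole program", 0x000, 0xFFF),
]
# Hypothetical trace: one PC value per system clock cycle
trace = [0x010] * 20 + [0x180] * 150 + [0x400] * 30
counts = snoop(trace, counters)
print(counts)
```

Unlike a sampling profiler, every cycle is counted, and because the counters only observe the program counter, the profiled software needs no instrumentation.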

Slide 22

Profiling Tools – SnoopP

Figure 6: SnoopP Architecture

Courtesy of L. Shannon and P. Chow, “Using Reconfigurability to Achieve Real-Time Profiling for Hardware/Software Codesign”, In Proceedings 12th international symposium on Field programmable gate arrays, pp. 190-199, February 2004

Slide 23

Conclusion

- Profiling Tools play an important role in the hardware/software partitioning process in CoDesign

- The quality of the partition depends on the profile data retrieved from the profiling tools

- Accurate profiling information can improve the performance of a design, whereas less accurate information may degrade system performance

Slide 24

References

[1] L. Shannon and P. Chow, “Using Reconfigurability to Achieve Real-Time Profiling for Hardware/Software Codesign”, In Proceedings 12th international symposium on Field programmable gate arrays, pp. 190-199, February 2004

[2] W. Wolf, “A Decade of Hardware/Software Codesign”, In IEEE 5th International Symposium on Multimedia Software Engineering (MSE2003), pp. 38-43, December 2003

[3] R. Ernst, J. Henkel and T. Benner, “Hardware-Software Cosynthesis for Microcontrollers”, In IEEE Design and Test of Computers, pp. 64-75, December 1993

[4] J. Fenlason and R. Stallman, “GNU gprof”, http://www.gnu.org/software/binutils/manual/gprof-2.9.1/html_mono/gprof.html January 1997

[5] L. Shannon and P. Chow, “Maximizing System Performance: Using Reconfigurability to Monitor System Communications”, In 2004 International Conference on Field Programmable Technology, pp. 231-238, December 2004

[6] M. Itzkowitz, Brian J. N. Wylie, C. Aoki and N. Kosche, “Memory Profiling using Hardware Counters”, In Proceedings of the ACM/IEEE conference on Supercomputing, pp. 17-30, July 2003

[7] M. M. Tikir and J. K. Hollingsworth, “Using Hardware Counters to Automatically Improve Memory Performance”, In Proceedings of the ACM/IEEE conference on Supercomputing, pp.46-58, November 2004

[8] C. J. N. Coelo Jr., D. C. Da Silva Jr., and A. O. Fernandes, “Hardware-Software Codesign of Embedded Systems”, In Proceedings of the 11th Brazilian Symposium on Integrated Circuit Design, pp. 2-8, January 1998

[9] D.W. Franke and M. K. Purvis, “Hardware/Software CoDesign: A Perspective”, In Proceedings of the 13th International Conference on Software Engineering, pp.344-352, May 1991

[10] W. H. Wolf, “Hardware-Software Co-Design”, In Proceedings of the IEEE, pp. 967-989, July 1994

[11] D. C. Suresh, W. A. Najjar, F. Vahid, J. R. Villarreal and G. Stitt, “Profiling Tools for Hardware/Software Partitioning of Embedded Applications”, In Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems, pp. 189-198, June 2003

[12] C. J. N. Coelo Jr., D. C. Da Silva Jr., and A. O. Fernandes, “Hardware-Software Codesign of Embedded Systems”, In Proceedings of the 11th Brazilian Symposium on Integrated Circuit Design, pp. 2-8, January 1998

[13] K. Banovic, M.A.S. Khalid and E. Abdel-Raheem, “FPGA-Based Rapid Prototyping of Digital Signal Processing Systems”, In Proceedings of the 48th Midwest Symposium on Circuits and Systems, August 2005

[14] R. Ernst, “Codesign of Embedded Systems: Status and Trends”, In Proceedings of IEEE Design and Test, pp.45-54, April-June 1998