BASICS - McMaster Universityoptlab.mcmaster.ca/~yzinchen/CES701_07/COMPENG701_2.pdfComputer...

BASICS

Hardware components

Computer Architecture

• Computer Organization

– The von Neumann architecture

– Same storage device for both instructions and data

– Processor components

• Arithmetic Logic Unit

• Control Unit

• Registers

Computer Architecture

• Device Controllers

– Memory mapped I/O

– Direct Memory Access (DMA)

• Instruction Set

– Data transfer operations

– Arithmetic / logic operations

– Control flow instructions

Computer components

• Central Processor Unit

– e.g., G3, Pentium III, RISC

• Random Access Memory

– generally lost when power cycled

• Video RAM

– amount sets screen size, color depth

• Read Only Memory

– used for boot

• Input/Output, through interfaces such as

– Small Computer System Interface

– Universal Serial Bus

– FireWire (IEEE 1394, widely used for digital video)

– Ethernet

• Hard Disk Drive (permanent storage), Compact Disk Read-Only-Memory, CD-Read/Write, DVD R/W, etc.

A typical architecture (fragment)

CPU basics

• Smallest thing a computer knows: a bit, 0 or 1 (false/true)

• CPU knows how to perform and, or, xor (exclusive or) operations

– And returns true if both are true

– Or returns true if either is true

– Xor returns true if the inputs differ

• CPU is a massive collection of and and or gates

• A specific CPU has a set of instructions it can execute (usually 50-100, machine language)
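The three operations can be tabulated directly. The following is a minimal sketch in Python (the helper name `truth_table` is only illustrative), using the bitwise operators `&`, `|` and `^` on single bits:

```python
# Truth tables for and, or, xor on single bits (0 = false, 1 = true).
def truth_table(op):
    """Return (a, b, op(a, b)) for every combination of two input bits."""
    return [(a, b, op(a, b)) for a in (0, 1) for b in (0, 1)]

AND = truth_table(lambda a, b: a & b)   # 1 only if both inputs are 1
OR = truth_table(lambda a, b: a | b)    # 1 if at least one input is 1
XOR = truth_table(lambda a, b: a ^ b)   # 1 if the inputs differ

for name, table in (("and", AND), ("or", OR), ("xor", XOR)):
    print(name, table)
```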

CPU basics

• Number of instructions per second is set by the “clock speed”

– e.g., 500 MHz Pentium III

• One clock tick is called a cycle

– modern CPUs can often execute >1 instruction per cycle

• Programs are sets of instructions to be executed by the CPU

– compilers/linkers or interpreters translate source code into these instructions for you

• Floating point speed is measured in floating point operations per second (flops)

Data

Bits/Bytes and Words

• Bits are grouped into larger units

– 8 bits = 1 byte (still common)

– 2/4/8 bytes = word (varies between CPUs)

– Most desktop machines use 32-bit words; 64-bit machines are becoming more common

• set by data bus

– Why important?

• Sets the minimum-size unit you can access in a program, and often the precision for computations

Words and Bytes

• Number of unique values that can be represented depends on number of bits

• With n bits one can have $2^n$ unique values

• For n=8 (a byte) this gives $2^8$ = 256 values

– bytes are grouped into larger units to represent different data

• ASCII: American Standard Code for Information Interchange

– Basic version is 7 bit (128 characters)

– A-Z, a-z, 0-9 and special characters

– Values <32 are “control characters”
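As a small illustration of the ASCII points above, the sketch below uses Python's built-in `ord` and `chr` to move between characters and their codes. The sample characters are arbitrary choices.

```python
# Character codes for a few printable ASCII characters.
codes = {ch: ord(ch) for ch in "Aa0 "}
print(codes)

# 7 bits give 2**7 distinct codes; those below 32 are control characters.
n_codes = 2 ** 7
control = [c for c in range(n_codes) if c < 32]
print(n_codes, len(control))

# Code 10 is the "line feed" control character, written "\n" in Python.
newline = chr(10)
print(repr(newline))
```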

Numbers and bases

• Numbers represented in different base systems

– Binary base 2 (0-1)

– Octal base 8 (0-7)

– Hexadecimal base 16 (0-15, with A-F representing 10-15)

– E.g., $54_{10} = 36_{16} = 66_8 = 110110_2$

• Prefixes: kilo = 1024; mega = 1048576; giga = 1073741824 (approximately $10^3$, $10^6$, $10^9$)
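The base conversions and binary prefixes can be checked quickly. This is a minimal sketch using Python's built-in `bin`, `oct`, `hex` and `int(text, base)`:

```python
n = 54

# The same number written in bases 2, 8 and 16 (Python adds 0b/0o/0x prefixes).
as_binary, as_octal, as_hex = bin(n), oct(n), hex(n)
print(as_binary, as_octal, as_hex)

# Parsing the digit strings back in their own base recovers the same value.
parsed = [int("110110", 2), int("66", 8), int("36", 16)]
print(parsed)

# Binary prefixes are powers of two, close to (but not equal to) powers of ten.
kilo, mega, giga = 2 ** 10, 2 ** 20, 2 ** 30
print(kilo, mega, giga)
```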

Instructions

Program execution

“The machine cycle”

Instruction composition

Stored program

Fetch step of the machine cycle I

Fetch step of the machine cycle II

Decoding the instruction

Mnemonics

• It is hard to remember commands as numbers

• Use words associated with the numbers
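To make the link between mnemonics, numeric instructions and the fetch/decode/execute cycle concrete, here is a toy sketch in Python. The opcode numbers, the four mnemonics and the single-accumulator machine are all hypothetical and do not correspond to any real CPU. For brevity the program is kept separate from the data memory, unlike a true stored-program machine.

```python
# Hypothetical opcode numbers for a toy accumulator machine (not a real CPU).
MNEMONICS = {0: "HALT", 1: "LOAD", 2: "ADD", 3: "STORE"}
OPCODES = {name: code for code, name in MNEMONICS.items()}

def assemble(lines):
    """Turn lines such as 'ADD 1' into (opcode, operand) number pairs."""
    program = []
    for line in lines:
        parts = line.split()
        operand = int(parts[1]) if len(parts) > 1 else 0
        program.append((OPCODES[parts[0]], operand))
    return program

def run(program, memory):
    """Repeat fetch, decode, execute until HALT; return the final memory."""
    acc, pc = 0, 0
    while True:
        opcode, operand = program[pc]   # fetch the next instruction
        pc += 1
        name = MNEMONICS[opcode]        # decode the number into an operation
        if name == "LOAD":              # execute
            acc = memory[operand]
        elif name == "ADD":
            acc += memory[operand]
        elif name == "STORE":
            memory[operand] = acc
        elif name == "HALT":
            return memory

program = assemble(["LOAD 0", "ADD 1", "STORE 2", "HALT"])
memory = run(program, [2, 3, 0])
print(program, memory)
```

The assembled program is just a list of numbers; the mnemonics exist only for the human reader.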

Some Assembly language

Operating System

OS

• The Operating System (OS)

– Controls everything in the way the computer works.

– Not specific to a CPU type, but some OSs are often associated with specific CPUs

• G3/G4/G5, 68x series: MacOS

• Pentium, x86: DOS (Windows)

• SPARC: Solaris (Unix)

– OS controls I/O and memory management

• Program implementations are dependent on OS

Programming interface to OS

• Depending on language used, OS interface may or may not be important

• For Fortran, C, C++, OS routines are needed when the program is linked

– How to read from keyboard or file?

– How to write to screen or disk?

• In your program you do not need to go into the low-level (OS) details

Storage in memory

• Memory treated as a linear array of bytes, addressed from 0 to <size of memory> - 1

• OS keeps track of used and free memory, for use by programs and data

• Computers differ in byte order: the bytes of a multi-byte value are not always stored in the same sequence

– main (but not only) styles are Big Endian (HP, Sun, Macs) and Little Endian (PC)

– affects ability to transfer binary data; TCP/IP fixes a big-endian “network byte order” for its own headers, but programs must still convert the data they exchange
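The effect of byte order on binary data can be seen with Python's standard `struct` module. This is a minimal sketch; the value `0x01020304` is an arbitrary example chosen so that each byte is recognizable.

```python
import struct
import sys

value = 0x01020304

# The same 4-byte unsigned integer laid out in the two byte orders.
little = struct.pack("<I", value)   # least significant byte first
big = struct.pack(">I", value)      # most significant byte first
print(sys.byteorder, little.hex(), big.hex())

# Reading little-endian bytes as if they were big-endian gives a wrong number.
misread = struct.unpack(">I", little)[0]
print(hex(misread))
```

`sys.byteorder` reports the native order of the machine running the script, so its output depends on the hardware.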

Basics revisited

Hard disks

• Contain the computer “file system”

– allows access through file names

• Directory structure points to file location

– reason for having less space available than the size of disk (plus some calibration tracks)

• Actual content of HD and directories depends on OS

– e.g., FAT16, FAT32, NTFS for Windows, EXT2 for Linux

• In general, an OS can only use its own file system

Accessing RAM vs. HDD

• The highest possible bandwidth (peak bandwidth) for the various types of RAM

Module type            Max. transfer, MB/s
SD RAM, PC100          800
SD RAM, PC133          1064
Rambus, PC800          1600
Rambus, Dual PC800     3200
DDR 266 (PC2100)       2128
DDR 333 (PC2700)       2664
DDR 400 (PC3200)       3200
DUAL DDR PC3200        6400
DUAL DDR2-400          8600
DUAL DDR2-533          10600

– However, RAM also has to match the motherboard, chipset and the CPU system bus

• HDD ~ only 80 MB/s

• In MATLAB: try save, load, pack, clear

RAM and “fast” RAM/cache

• A CPU cache is a small, fast memory used by the central processing unit of a computer to reduce the average time to access memory

– Access time: roughly speaking “CPU speed against the bus speed”

Integers

• Integer numbers can be represented exactly (up to the range allowed by the number of bytes)

• A 2-byte integer: unsigned 0 to 65535, signed -32768 to 32767 (sometimes called short)

• A 4-byte integer: unsigned 0 to 4294967295, signed -2147483648 to 2147483647

– (With a 32-bit address bus, can have 4 Gbytes of memory; this is the reason max memory is limited in computers)
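The integer ranges follow directly from the number of bits. The sketch below assumes the usual two's-complement representation for signed values. The helper `int_range` is only illustrative.

```python
import struct

def int_range(n_bytes, signed):
    """Smallest and largest integer storable in n_bytes (two's complement)."""
    bits = 8 * n_bytes
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

for n_bytes in (2, 4):
    print(n_bytes, int_range(n_bytes, False), int_range(n_bytes, True))

# A value outside the range cannot be stored: packing 32768 into a signed
# 2-byte integer ("h") is rejected by the struct module.
try:
    struct.pack("<h", 32768)
    overflowed = False
except struct.error:
    overflowed = True
print(overflowed)
```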

Floating point

• Representations vary between machines (often the reason binary files cannot be shared)

– Precise layout of bits depends on machine and format; all formats are (mantissa) × $2^{\text{exponent}}$
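The mantissa/exponent split can be inspected with Python's standard `math.frexp`. This is only a sketch of the idea: `frexp` normalizes the mantissa to the interval [0.5, 1), which is a different convention from the stored bit layout of any particular format.

```python
import math

# Split 6.5 into mantissa and exponent such that 6.5 == mantissa * 2**exponent.
mantissa, exponent = math.frexp(6.5)
print(mantissa, exponent)

# math.ldexp performs the inverse operation.
rebuilt = math.ldexp(mantissa, exponent)
print(rebuilt)
```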

The IEEE standard for floating point arithmetic

• Single precision (32 bits = 4 bytes)

S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF

0 1      8 9                    31

The value V represented by the word may be determined as follows:

• If E=255 and F is nonzero, then V=NaN ("Not a number")

• If E=255 and F is zero and S is 1, then V=-Infinity

• If E=255 and F is zero and S is 0, then V=Infinity

• If 0<E<255 then $V = (-1)^S \times 2^{E-127} \times (1.F)$, where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point

• If E=0 and F is nonzero, then $V = (-1)^S \times 2^{-126} \times (0.F)$. These are "unnormalized" (denormalized) values.

• If E=0 and F is zero and S is 1, then V=-0

• If E=0 and F is zero and S is 0, then V=0

Single precision floating point

• In particular

0 00000000 00000000000000000000000 = 0

1 00000000 00000000000000000000000 = -0

0 11111111 00000000000000000000000 = Infinity

1 11111111 00000000000000000000000 = -Infinity

0 11111111 00000100000000000000000 = NaN

1 11111111 00100010001001010101010 = NaN

0 10000000 00000000000000000000000 = $+1 \times 2^{128-127} \times 1.0 = 2$

0 10000001 10100000000000000000000 = $+1 \times 2^{129-127} \times 1.101_2 = 2^2 \times (2^3+2^2+1)/2^3 = 6.5$

1 10000001 10100000000000000000000 = $-1 \times 2^{129-127} \times 1.101_2 = -6.5$

0 00000001 00000000000000000000000 = $+1 \times 2^{1-127} \times 1.0 = 2^{-126}$

0 00000000 10000000000000000000000 = $+1 \times 2^{-126} \times 0.1_2 = 2^{-127}$

0 00000000 00000000000000000000001 = $+1 \times 2^{-126} \times 0.00000000000000000000001_2 = 2^{-149}$ (Smallest positive value)
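The decoding rules for single precision can be written out and compared with what the machine does. The following is a minimal sketch: `decode_single` applies the S/E/F rules by hand, while `hardware_single` lets Python's standard `struct` module interpret the same 32 bits. Both function names are illustrative.

```python
import struct

def decode_single(bits):
    """Decode an 'S EEEEEEEE F...F' bit string by the single-precision rules."""
    s, e, f = bits.split()
    sign = -1.0 if s == "1" else 1.0
    E, F = int(e, 2), int(f, 2)
    if E == 255:                          # all-ones exponent: NaN or infinity
        return float("nan") if F else sign * float("inf")
    if E == 0:                            # zero or unnormalized value: 0.F
        return sign * 2.0 ** -126 * (F / 2 ** 23)
    return sign * 2.0 ** (E - 127) * (1 + F / 2 ** 23)   # normalized: 1.F

def hardware_single(bits):
    """Interpret the same 32 bits as a big-endian IEEE single via struct."""
    raw = int(bits.replace(" ", ""), 2).to_bytes(4, "big")
    return struct.unpack(">f", raw)[0]

six_and_half = "0 10000001 " + "101" + "0" * 20
smallest = "0 00000000 " + "0" * 22 + "1"
for pattern in (six_and_half, smallest):
    print(pattern, decode_single(pattern), hardware_single(pattern))
```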

Double precision floating point

S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

0 1 11 12 63

The value V represented by the word may be determined as follows:

• If E=2047 and F is nonzero, then V=NaN ("Not a number")

• If E=2047 and F is zero and S is 1, then V=-Infinity

• If E=2047 and F is zero and S is 0, then V=Infinity

• If 0<E<2047 then $V = (-1)^S \times 2^{E-1023} \times (1.F)$, where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.

• If E=0 and F is nonzero, then $V = (-1)^S \times 2^{-1022} \times (0.F)$. These are "unnormalized" (denormalized) values.

• If E=0 and F is zero and S is 1, then V=-0

• If E=0 and F is zero and S is 0, then V=0

Is the finite precision an issue?

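• An extended example: condition number of a symmetric matrix.

Consider a system of linear equations

$$\begin{pmatrix} 1000 & 999 \\ 999 & 998 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1999 \\ 1997 \end{pmatrix}$$

and the perturbed system

$$\begin{pmatrix} 1000 & 999 \\ 999 & 998 \end{pmatrix} \begin{pmatrix} \hat{x} \\ \hat{y} \end{pmatrix} = \begin{pmatrix} 1998.99 \\ 1997.01 \end{pmatrix}$$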

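Example

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} \hat{x} \\ \hat{y} \end{pmatrix} = \begin{pmatrix} 20.97 \\ -18.99 \end{pmatrix}$$

Note that the right-hand sides are nearly equal, but the solutions are not.

What went wrong?

Recall an n-by-n matrix A is symmetric if $A = A^T$.

Fact (“spectral decomposition”):

If A is symmetric, it may be written as

$A = UDU^T$,

where D is a diagonal matrix and U is unitary (i.e., $UU^T = I$, the identity matrix).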

Example

Fact: “U is unitary” is “almost the same” as “U is a rotation matrix”

• “almost the same” because U might include reflections

Fact: For $A = UDU^T$ as before, the diagonal elements of D are the eigenvalues of A and the columns of U are the right eigenvectors of A.

Recall t is an eigenvalue of A iff $\det(A - tI) = 0$, and u is the corresponding right eigenvector iff $Au = tu$.

Example

• How does A act on x, step-by-step:

$Ax = UDU^Tx = UD(U^Tx) = U(D(U^Tx))$,

that is, “rotate, scale, rotate back”.

Define the condition number of A as

$\chi(A) = |t|_{\max}(A) / |t|_{\min}(A)$

where $|t|_{\max}$ and $|t|_{\min}$ are the largest and the smallest eigenvalues of A in absolute value.

$\chi(A)$ shows how far off the solution to $Ax = b$ may be from the solution to $Ax = b + E$ (a measure of “relative singularity”).
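For the 2-by-2 class example, the sensitivity and the condition number can be checked numerically. The sketch below is not the posted MATLAB or C code. It assumes the matrix with rows (1000, 999) and (999, 998), solves both systems by Cramer's rule, and uses the closed-form eigenvalues of a symmetric 2-by-2 matrix. The helper names `solve2` and `eigenvalues2` are illustrative.

```python
import math

def solve2(a, b, d, r1, r2):
    """Solve [[a, b], [b, d]] (x, y)^T = (r1, r2)^T by Cramer's rule."""
    det = a * d - b * b
    return (r1 * d - b * r2) / det, (a * r2 - b * r1) / det

def eigenvalues2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], larger |t| first."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    big = mean + radius if mean >= 0 else mean - radius
    small = (a * d - b * b) / big     # product of the eigenvalues is det(A)
    return big, small

a, b, d = 1000, 999, 998
x, y = solve2(a, b, d, 1999, 1997)                # original right-hand side
x_hat, y_hat = solve2(a, b, d, 1998.99, 1997.01)  # perturbed right-hand side
t_max, t_min = eigenvalues2(a, b, d)
cond = abs(t_max) / abs(t_min)
print((x, y), (x_hat, y_hat))
print(t_max, t_min, cond)
```

The small eigenvalue is obtained from the determinant rather than by subtracting two nearly equal numbers, which avoids losing digits to cancellation. The resulting condition number of roughly four million is consistent with a change of 0.01 in the right-hand side moving the solution by about 20.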

Assignment 2

MATLAB and C code are posted on the web.

1. Using the MATLAB script from the class, try to identify the cache size on your machine. Note that usually a number in MATLAB occupies 8 bytes (double precision floating point), and that the function requires roughly 2 times the memory needed to store the x vector (x in the input, x_new in the output). Explain your results. As a sanity check, you might want to use a CPU info tool.

2. Condition number I: for the class example, answer the following:

a) find the matrix spectral decomposition and compute the matrix condition number,

b) find perturbations $E$ of the right-hand side of the equation with (the Euclidean norm) $\|E\| = 1$ that give the largest and the smallest errors in the solution vector; plot each of the two perturbations, together with their image under $U^T$, under $D^{-1}U^T$ and, finally, under $A^{-1}$,

c) given the equation for the curve $\|E\| = 1$, find its image under $A^{-1}$ and plot it,

d) find the explicit expression for the condition number $\chi(A - \gamma I)$ for $\gamma > 0$.

3. Condition number II: given a 2x2 matrix with double-precision entries, what is the worst condition number this matrix might have (as a real number)? Explain.

4. Condition number III: for question 2 of the first assignment, is there any ill-conditioning? (Ill-conditioning refers to a matrix having a large condition number, hence often resulting in numerical instability; if you happen to encounter complex eigenvalues, you are on the wrong track; for the definition in class, the matrix must be symmetric.) Explain your answer.

5. Bonus question: find all the distinct spectral decompositions of the identity matrix.