Assembly Programming (Part II. Assembly Language...
Transcript of Assembly Programming (Part II. Assembly Language...
Assembly Programming (Part II. Assembly Language
Fundamentals) CSE 3030
Prof. Hyuk-Jun Lee
CSE3030
Basic Elements of Assembly Language Microsoft Syntax Notation
[…] : optional. {…|…|…} : a choice of one of the enclosed elements
separated by | character. Elements in italics : items which have known definitions
or descriptions. Integer Constants
[{+|-}] digits [radix] 26, -26, 26d, 11010011b, 42q, 42o, 1Ah, 0A3h
h : hexadecimal, q/o : octal, d : decimal, b : binary, r : encoded real, t : decimal(alt), y : binary(alt).
Assembly Programming 2
CSE3030
Integer Expressions Assembler calculate the expressions. Operators and precedence levels:
Examples:
Assembly Programming 3
CSE3030
Real Number Constants [sign] integer.[integer][exponent] sign : {+|-}, exponent: E[{+|-}]integer Encoded Real : The binary representation of a number.
+1.0 ⇔ 3F800000r
Character and String Constants ‘A’, “x” ‘A Boy’, “A Girl” ‘10010010’ and 10010010b are different data. Embedded quotes
“It isn’t a book” ‘Say “Goodnight,” Kim’
Assembly Programming 4
CSE3030
Identifiers Programmer-chosen names of variables, constants,
procedures and labels. 1-247 characters, including digits. Case insensitive (by default). First character must be a letter, _, @, or $.
Reserved words(App. D) can’t be used as identifiers Instruction mnemonics, directives, type attributes,
operators, predefined symbols. Directives(see pp.669, D.7)
Commands understood by the assembler. Case insensitive. Not part of the Intel instruction set. Used to declare code, data areas, select memory model,
declare procedures, etc. Examples: .data, .code, name PROC, etc. Different assembler may have different directives. Assembly Programming 5
CSE3030
Instructions Assembled into machine code by assembler. Executed at runtime by the CPU. Member of the Intel IA-32 instruction set. Standard instruction format:
Labels
Marks the address (offset) of code and data. Code label : followed by colon. Data label : not followed by colon.
Comments : begin with semicolon (;). Good programs: programs with well written comments. For debugging, revision, documentation, etc.
Assembly Programming 6
CSE3030
Instruction Format Examples No operands
stc ; set Carry flag
A single operand inc eax ; register inc myByte ; memory
Two operands add ebx,ecx ; register, register sub myByte,25 ; memory, constant add eax,36*25 ; register, ; constant-expression
Assembly Programming 7
CSE3030
Example: Adding and Subtracting Integers
Assembly Programming 8
TITLE Add and Subtract (AddSub.asm) ; adds and subtracts 32-bit integers. INCLUDE Irvine32.inc .code main PROC mov eax,10000h ; EAX = 10000h add eax,40000h ; EAX = 50000h sub eax,20000h ; EAX = 30000h call DumpRegs ; display registers exit main ENDP END main
beginning of a procedure beginning of the code segment
add necessary definitions and setup inform.
calles predefined MS-Windows function that halt the proc. the end of procedure ‘main’. the last line of programs to be assembled.
CSE3030
Example Output
Coding Styles : develop your own. Some approaches to capitalization
Capitalize nothing, or everything. Capitalize all reserved words, including instruction
mnemonics and register names. Capitalize only directives and operators
Other suggestions Descriptive identifier names. Blank lines between procedures.
Assembly Programming 9
EAX=00030000 EBX=7FFDF000 ECX=00000101 EDX=FFFFFFFF
ESI=00000000 EDI=00000000 EBP=0012FFF0 ESP=0012FFC4
EIP=00401024 EFL=00000206 CF=0 SF=0 ZF=0 OF=0
CSE3030
Indentation and spacing Code and data labels – no indentation. Executable instructions – indent 4-5 spaces. Comments: begin at column 40-45, aligned vertically. 1-3 spaces between instruction and its operands.
ex: mov ax,bx
1-2 blank lines between procedures.
Assembly Programming 10
CSE3030
Alternative Version of Addsub
Assembly Programming 11
TITLE Add and Subtract (AddSubAlt.asm) ; Adds and subtracts 32-bit integers. .386 .MODEL flat,stdcall .STACK 4096 ExitProcess PROTO, dwExitCode:DWORD DumpRegs PROTO .code main PROC mov eax,10000h ; EAX = 10000h add eax,40000h ; EAX = 50000h sub eax,20000h ; EAX = 30000h call DumpRegs INVOKE ExitProcess,0 main ENDP END main
minimum CPU required for this program flat : generate code for protected mode stdcall : enables the calling of MS Windows functions.
INVOKE : called a procedure or function
procedure from Irvine32 link lib.
CSE3030
A Program Template
Assembly Programming 12
TITLE Program Template (Template.asm) ; Program Description: ; Author: ; Creation Date: ; Revisions: ; Date: Modified by: INCLUDE Irvine32.inc .data ; (insert variables here) .code main PROC ; (insert executable instructions here) exit main ENDP ; (insert additional procedures here) END main
CSE3030
Assembling, Linking, and Running Programs Assemble-Link Execute Cycle
The following diagram describes the steps from creating a source program through executing the compiled program.
If the source code is modified, Steps 2 through 4 must be repeated.
Assembly Programming 13
Source File
Object File
Listing File
Link Library
Executable File
Map File
Output
Step 1: text editor
Step 2: assembler
Step 3: linker
Step 4: OS loader
CSE3030
Listing File
Assembly Programming 14
- Use it to see how your program is compiled - Contains
- source code - Addresses - object code (machine language) - segment names - symbols (variables, procedures, and constants)
- Project Property -> Configuration Properties -> Microsoft Macro Assembler -> Listing File -> Assembled Code Listing File -> $(InputName).lst
CSE3030
Map File (by /MAP when LINK) Information about each program segment:
starting address ending address size segment type
Assembly Programming 15
CSE3030
Debugging Your Program Protected Mode
WinDbg : Very good for Win2000, NT, XP. Visual Studio : Work in any Windows OS.
Real-mode CV (Code View) 화면 모드 설정:
chcp 949(chcp 437) : 한글(영문)모드로 전환
Mouse setting : 빠른편집모드 unclick.
See appendix and related web sites.
Assembly Programming 16
CSE3030
Assembling and Debugging in Visual C++ 2012 Visual C++ 2012
1. Reference Web Site : http://kipirvine.com/asm/
2. VC++ 2012 Express Edition 다운로드 및 설치
http://www.microsoft.com/express/Downloads/#2012-Visual-CPP
3. Express 버전 기간제한 해제
설치 후 Help메뉴 -> Register Product -> Obtain a registration key online을
눌러 사이트에 로그인 후 얻은 등록 키를 입력한다
4. Example programs and link libraries source code for Visual Studio 2012 다운로
드 및 설치
http://kipirvine.com/asm/examples/index.htm
C:\Irvine 폴더 생김.
Assembly Programming 17
CSE3030
Creating New Project in VC++ 2012 1. 파일 Menu에서 새로만들기 -> 프로젝트 -> Visual C++ -> 일발 -> 빈 프로젝트 2. 경로와 프로젝트 이름을 정한 후 확인을 누른다.
Assembly Programming 18
CSE3030
Creating New Project in VC++ 2012 3. 파일 Menu에서 새로만들기 -> 파일 -> 일반 -> 텍스트 파일 선택한 후 코드를 작성한다.
4. 파일저장은 파일 Menu에서 -> 다른 이름으로 저장 ->
(Filename).asm으로 확장자를 변경하여 저장한다.
Assembly Programming 19
CSE3030
5. 작성한 asm파일을 프로젝트에 추가한다.
좌측의 솔루션 탐색기창에서, 소스파일에서 우측버튼 -> 추가 -> 기존항목 -> 작성한 asm 파일 추가한다.
Assembly Programming 20
Creating New Project in VC++ 2012
CSE3030
6. Tools 메뉴에서 Settings -> Expert Settings를 선택한다. (VC++ 2012 Express Edition만 해당)
Assembly Programming 21
Creating New Project in VC++ 2012
CSE3030
1. 좌측의 솔루션 탐색기의 프로젝트에서 마우스 우측 버튼 클릭 사용자 지정 빌드를 선택한다.
Assembly Programming 22
VC++2012 Project Properties Setting
2. VC++ 빌드 사용자 지정파일창에서 masm항목을 체크-> OK
CSE3030
VC++2012 Project Properties Setting 3. 프로젝트에 추가한 asm파일에서 마우스 우측 버튼 클릭 속성을 선택한다. 4. 구성 속성 -> 일반 탭에서 항목 형식을 Microsoft Macro Assembler로 설정해준다.
Assembly Programming 23
CSE3030
VC++2012 Project Properties Setting 5. 다시 좌측의 솔루션 탐색기의 프로젝트에서 마우스 우측 버튼 클릭 -> 속성선택
6. 구성 속성 -> Microsoft Macro Assembler -> 일반 탭에서 Include Paths 를 library를 설치한 경로로 설정해준다. (default로 c:\Irvine으로 설정)
Assembly Programming 24
CSE3030
VC++2012 Project Properties Setting 7. Configuration Properties -> Microsoft Macro Assembler
-> Listing File 탭에서 Assembled Code Listing File을 $(ProjectName).lst 로 설정해준다.
Assembly Programming 25
CSE3030
VC++2012 Project Properties Setting 8. 구성 속성 -> 링커 -> 일반 탭에서 Additional Library
Directories를 library를 설치한 경로로 설정해준다 (default로 c:\Irvine으로 설정)
Assembly Programming 26
CSE3030
VC++2012 Project Properties Setting 9. 구성 속성 -> 링커 -> 입력 탭에서 추가 종속성에
irvine32.lib; 을 추가한다.
Assembly Programming 27
CSE3030
VC++2012 Project Properties Setting 10. 구성속성 -> 링커 -> 디버깅 탭에서 디버그 정보 생성을 Yes(/DEBUG)로 설정해 준다.
Assembly Programming 28
CSE3030
VC++2012 Project Properties Setting 11. 구성 속성 -> 링커 -> 시스템 탭에서 하위시스템을 콘솔 (/SUBSYSTEM:CONSOLE)로 설정해준다.
Assembly Programming 29
CSE3030
Build and Run Program 1. Build Build 메뉴-> Build Solution(F7) 2. Run Debug 메뉴 -> Start without Debugging(ctrl + F5)
Assembly Programming 30
CSE3030
Debugging Assembly Programs in VC++ 2012 Visual C++에서 아래와 같은 function 키를 이용하여 프로그램의 올바른 수행을 검사한다.
F5 – Go, Shift+F5 – 디버그모드 종료 F9 – Insert/Remove Breakpoint F10 – Step Over F11 – Step In Ctrl+F5 – Execute Program
Assembly Programming 31
CSE3030
Debugging Assembly Programs in VC++2012 Debugging시 메뉴에서 Debug – Windows를 선택하면
Register, Memory등을 확인할 수 있다. 자세한 디버깅 방법은 아래 사이트에서 확인할 수 있다. http://kipirvine.com/asm/debug/vstudio2010/index.htm Assembly Programming 32
CSE3030
Building 16-bit Applications 16bit Application을 위한 bat파일을 수정
Visual Studio 2012가 설치된 폴더를 찾는다.
C:\Program Files\Microsoft Visual Studio 11.0\VC\bin (64bit) 또는 C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin 에 설치되어 있을 것이다.
위 폴더에서 ml.exe 파일을 확인하자.
C:\Irvine\make16.bat 파일에서 19번째 줄을 위 경로 중 하나로 수정한다(아래 그림은 19th 줄을 REM으로 comment out하고 20th 줄에 제대로 된 경로를 삽입한 것이다)
Assembly Programming 33
CSE3030
Codeview와 DOSBox 설치 16bit 프로그램은 64bit Win7에서 수행할 수 없다
(32bit Win7에서는 가능할 수도 있음)
Codeview는 16비트 프로그램을 디버깅할 수 있는 프로그램이다. 아래 주소에서 다운 받을 수 있다.
http://pds12.egloos.com/pds/200809/24/87/cv41patch.exe
cv41patch.exe 파일을 수행하고 unzip을 클릭하면 C:\masm615 라는 폴더가 생기는데 이 폴더에서 cv.exe를 찾을 수 있을 것이다.
DOSBox는 윈도우 환경에서 16비트 프로그램을 수행할 수 있는 DOS 창을 생성해 준다. http://www.dosbox.com/ 에서 윈도우 버전을 다운 받아 설치하도록 하자.
Assembly Programming 34
CSE3030
프로젝트 생성 32비트용 프로젝트 생성하는 것과 같은 방법으로 빈 프로젝트 하나를 만든다.
Assembly Programming 35
CSE3030
파일 생성 및 추가 이 역시 32비트용 프로그램 작성과 마찬가지 이다.
다른 에디터를 사용하여도 무방하다.
Assembly Programming 36
CSE3030
파일 생성 및 추가(계속) 테스트용으로 다음과 같은 프로그램을 만든다.
Assembly Programming 37
TITLE Add and Subtract, Version 2, Real Mode (AddSub2r.asm) ; This program adds and subtracts 32-bit integers ; and stores the sum in a variable. INCLUDE Irvine16.inc ; 16비트 버전 .data val1 dword 10000h val2 dword 40000h val3 dword 20000h finalVal dword ? .code main PROC mov ax,@data ; initialize DS mov ds,ax ; new mov eax,val1 ; start with 10000h add eax,val2 ; add 40000h sub eax,val3 ; subtract 20000h mov finalVal,eax ; store the result (30000h) call DumpRegs ; display the registers exit main ENDP END main
CSE3030
파일 생성 및 추가(계속) 생성한 파일을 ~.vcxproj가 있는 폴더에 저장한다.
확장자는 ~.asm이어야 한다.
Assembly Programming 38
CSE3030
파일 생성 및 추가(계속) 저장한 파일을 프로젝트에 추가한다.
Assembly Programming 39
CSE3030
16비트 어플리케이션 빌드를 위한 세팅 도구->외부도구->추가를 클릭한다.
Toos->External Tools->Add
Assembly Programming 40
CSE3030
16비트 어플리케이션 빌드를 위한 세팅(계속) 제목에 Build 16bit Asm 이라고 입력하고 추가 세팅을 한다.
Assembly Programming 41
CSE3030
16비트 어플리케이션 빌드를 위한 세팅(계속) 명령창에는 C:\Irvine 폴더에서 make16.bat를 찾아 클릭하여 입력한다.
Assembly Programming 42
CSE3030
16비트 어플리케이션 빌드를 위한 세팅(계속) 인수에는 “항목파일이름”, 초기 디렉터리에는 “항목 디렉터리”를 클릭한 후 적용, 확인 을 클릭한다.
Assembly Programming 43
CSE3030
16비트 어플리케이션 수행를 위한 세팅 32비트에서는 수행 가능한데 이를 위한 세팅을 추가하자(64비트에서는 세팅은 가능하지만 사용할 수는 없으니 할 필요가 없다).
도구->외부도구->추가를 클릭한 후 아래와 같이 세팅한다.
Assembly Programming 44
직접 입력
항목 파일 이름 선택 후 /C 를 추가 항목 디렉터리 선택
unclick
CSE3030
Building 16bit Applications 도구->Build 16bit ASM 클릭 (아래와 같은 창이 뜬다)
Assembly Programming 45
CSE3030
Run 16bit Applications(32bit Win Only) 도구->Run 16bit ASM 클릭 (아래와 같은 창이 뜬다)
Assembly Programming 46
CSE3030
Run & Debugging 16bit Applications(Win7) Codeview와 DOSBox가 설치되어 있어야 한다.
실행 파일이 있는 폴더를 C:\MASM615 폴더에 복사한다.
Assembly Programming 47
CSE3030
Run & Debugging 16bit Applications(Win7) DOSBox를 실행한다.
창이 두 개 뜨는데 Welcome to DOSBox 어쩌구 라고 파란색으로 나오는 창에서 작업한다.
“mount c c:\” 를 입력하고 “c:”, “cd c:\masm615” 를 입력하여 Codeview 폴더로 간다(아래 그림 참조)
Assembly Programming 48
CSE3030
Run & Debugging 16bit Applications(Win7) 자신이 복사한 폴더로 가서 실행파일을 입력하면 프로그램이 수행된다.
Assembly Programming 49
CSE3030
Debugging 16bit Applications(Win7) Codeview를 수행하기 위해서는 “..\cv 실행파일 이름”을 입력한다.
Codeview가 실행되면 Codeview창 외에는 마우스가 동작하지 않을 수 있으니 이를 미리 파악하고 있도록 하자.
마우스를 동작시키려면 Codeview를 종료하고 exit를 입력하여 DOSBox의 실행을 중지시켜야 한다.
Assembly Programming 50
CSE3030
Debugging 16bit Applications(Win7) Alt-W(또는 마우스로 Windows 클릭)을 누르면 아래와 같은 창이 나타나고 여기서 Register 창 또는 Variable 관찰을 위한 Watch창을 띄울 수 있다(마우스 또는 up/down 화살표로 선택 가능).
Assembly Programming 51
CSE3030
Debugging 16bit Applications(Win7) Register와 Watch 창을 띄운 결과
F10, F5, F9 등의 단축키를 사용하여 디버그 할 수 있다(다음쪽 참조)
16비트용이므로 레지스터의 lsb 16비트만 볼 수 있다.
Assembly Programming 52
Size 조정 가능
CSE3030
Debugging 16-bit Applications Code View에서 Run 메뉴에 있는 function 키를 이용하여 프로그램의 올바른 수행을 검사한다.
F5 – Go F10 – Step F7 – Go To Cursor F8 – Trace F9 – Break Shift – Stop After Return Run 메뉴 -> Restart를 선택하면 디버깅을 재시작한다.
Assembly Programming 53
CSE3030
Defining Data Intrinsic Data Types
BYTE, SBYTE : 8-bit unsigned integer; 8-bit signed integer
WORD, SWORD : 16-bit unsigned & signed integer DWORD, SDWORD : 32-bit unsigned & signed integer QWORD : 64-bit integer TBYTE : 80-bit integer REAL4 : 4-byte IEEE short real REAL8 : 8-byte IEEE long real REAL10 : 10-byte IEEE extended real
Assembly Programming 54
CSE3030
Data Definition Statement A data definition statement sets aside storage in memory
for a variable. May optionally assign a name (label) to the data
Syntax:
[name] directive initializer[,initializer] ...
All initializers become binary data in memory
Assembly Programming 55
data type values
CSE3030
Defining Bytes Defining a single byte of storage:
Defining multiple bytes of storage with initialization:
Assembly Programming 56
value1 BYTE 'A‘ ; character constant value2 BYTE 0 ; smallest unsigned byte value3 BYTE 255 ; largest unsigned byte value4 SBYTE –128 ; smallest signed byte value5 SBYTE +127 ; largest signed byte value6 BYTE ? ; uninitialized byte
list1 BYTE 10,20,30,40
list2 BYTE 10,20,30,40
BYTE 50,60,70,80
BYTE 81,82,83,84
list3 BYTE ?,32,41h,00100010b
list4 BYTE 0Ah,20h,‘A’,22h
CSE3030
Defining Strings A string is implemented as an array of characters
For convenience, it is usually enclosed in quotation marks It usually has a null byte at the end
Examples:
Assembly Programming 57
str1 BYTE "Enter your name",0 str2 BYTE 'Error: halting program',0 str3 BYTE 'A','E','I','O','U' greeting BYTE "Welcome to the Encryption Demo“ BYTE “ program.”, 0 menu BYTE "Checking Account",0dh,0ah,0dh,0ah, "1. Create a new account",0dh,0ah, "2. Open an existing account",0dh,0ah, "3. Credit the account",0dh,0ah, "4. Debit the account",0dh,0ah, "5. Exit",0ah,0ah, "Choice> ",0
0Dh = carriage return 0Ah = line feed
Define all the data in the same area of the data segment.
CSE3030
DUP Operator Use DUP to allocate (create space for) an array or string. Counter and argument must be constants or constant
expressions
Assembly Programming 58
var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zero
var2 BYTE 20 DUP(?) ; 20 bytes, uninitialized
var3 BYTE 2 DUP("STACK") ; 10 bytes: "STACKSTACK"
var4 BYTE 10,3 DUP(0),20
CSE3030
Defining WORD and SWORD Data (16 bit)
Defining DWORD and SDWORD Data (32 bit)
Defining QWORD, TBYTE, Real Data
Assembly Programming 59
quad1 QWORD 1234567812345678h val1 TBYTE 1000000000123456789Ah rVal1 REAL4 -2.1 rVal2 REAL8 3.2E-260 rVal3 REAL10 4.6E+4096 ShortArray REAL4 20 DUP(0.0)
word1 WORD 65535 ; largest unsigned value word2 SWORD –32768 ; smallest signed value word3 WORD ? ; uninitialized, unsigned word4 WORD "AB" ; double characters myList WORD 1,2,3,4,5 ; array of words array WORD 5 DUP(?) ; uninitialized array
val1 DWORD 12345678h ; unsigned val2 SDWORD –2147483648 ; signed val3 DWORD 20 DUP(?) ; unsigned array val4 SDWORD –3,–2,–1,0,1 ; signed array
CSE3030
Little Endian Order All data types larger than a byte store their individual
bytes in reverse order. The least significant byte occurs at the first (lowest) memory address.
Example: val1 DWORD 12345678h
Declaring Unitialized Data
Use the .data? directive to declare an unintialized data segment.
Within the segment, declare variables with "?" initializers: smallArray DWORD 10 DUP(?)
Advantage: the program's EXE file size is reduced. Assembly Programming 60
CSE3030
Adding Variables to AddSub
Assembly Programming 61
TITLE Add and Subtract, Ver. 2 (AddSub2.asm) ; This program adds and subtracts 32-bit unsigned ; integers and stores the sum in a variable. INCLUDE Irvine32.inc .data val1 DWORD 10000h val2 DWORD 40000h val3 DWORD 20000h finalVal DWORD ? .code main PROC mov eax,val1 ; start with 10000h
add eax,val2 ; add 40000h sub eax,val3 ; subtract 20000h mov finalVal,eax ; store the result (30000h) call DumpRegs ; display the registers exit
main ENDP END main
CSE3030
Symbolic Constants Equal-Sign Directive
name = expression (32bit int exp or const) May be redefined and name is called a symbolic constant. good programming style to use symbols.
EQU Directive Define a symbol as either an integer or text expression. Cannot be redefined.
Assembly Programming 62
COUNT = 500 . . . mov al,COUNT
PI EQU <3.1416> pressKey EQU <"Press any key to continue...",0> .data prompt BYTE pressKey
CSE3030
TEXTEQU Directive Define a symbol as either an integer or text expression. Called a text macro. Can be redefined.
Assembly Programming 63
continueMsg TEXTEQU <"Do you wish to continue (Y/N)?"> rowSize = 5 .data prompt1 BYTE continueMsg count TEXTEQU %(rowSize * 2) ; eval. the expression move TEXTEQU <mov> setupAL TEXTEQU <move al,count> .code setupAL ; generates: "mov al,10"
%expression : treat the value of expression in macro argument as text(see D.7).
CSE3030
Current location counter: $ Subtract address of list. Difference is the number of bytes. Calculating the Size of a Byte Array
The Size of a Word Array
The Size of a Doubleword Array
Assembly Programming 64
list BYTE 10,20,30,40 ListSize = ($ - list)
list WORD 1000h,2000h,3000h,4000h ListSize = ($ - list) / 2
list DWORD 1,2,3,4 ListSize = ($ - list) / 4
CSE3030
Data Transfer Instructions Operand Types
Three basic types of operands: Immediate – a constant integer (8, 16, or 32 bits)
value is encoded within the instruction
Register – the name of a register register name is converted to a number and encoded within
the instruction
Memory – reference to a location in memory memory address is encoded within the instruction, or a
register holds the address of a memory location
Assembly Programming 67
CSE3030
Instruction Operand Notation(See Appendix B)
Assembly Programming 68
CSE3030
Direct Memory Operands A direct memory operand is a named reference to storage
in memory. The named reference (label) is automatically dereferenced
by the assembler.
MOV Instruction
Move from source to destination. Syntax: MOV destination,source No more than one memory operand permitted CS, EIP, and IP cannot be the destination No immediate to segment moves
Assembly Programming 69
var1 BYTE 10h . . . mov al,var1 ; AL = 10h mov al,[var1] ; AL = 10h
alternate format
CSE3030
Examples:
Explain why each of the following MOV statements are invalid:
Assembly Programming 70
count BYTE 100 wVal WORD 2 . . . mov bl,count mov ax,wVal mov count,al mov al,wVal ; error mov ax,count ; error mov eax,count ; error
bVal BYTE 100 bVal2 BYTE ? wVal WORD 2 dVal DWORD 5 . . . mov ds,45 ; a. mov esi,wVal ; b. mov eip,dVal ; c. mov 25,bVal ; d. mov bVal2,bVal ; e.
CSE3030
Zero Extension (MOVZX instruction)
Sign Extension (MOVSX instruction)
XCHG Instruction
Assembly Programming 71
1 0 0 0 1 1 1 1
1 0 0 0 1 1 1 1
Source
Destination 0 0 0 0 0 0 0 0
0 mov bl,10001111b movzx ax,bl ; dest = reg
1 0 0 0 1 1 1 1
1 0 0 0 1 1 1 1
Source
Destination 1 1 1 1 1 1 1 1
mov bl,10001111b movsx ax,bl ; dest = reg
var1 WORD 1000h var2 WORD 2000h . . . xchg ax,bx ; exchange 16-bit regs xchg var1,bx ; exchange mem, reg xchg var1,var2 ; error: two memory operands
- At least one operand must be a reg. - No immediate operands are permitted.
CSE3030
Direct-Offset Operands A constant offset is added to a data label to produce an
effective address (EA).
Assembly Programming 72
.data arrayW WORD 1000h,2000h,3000h arrayD DWORD 1,2,3,4 .code mov ax,[arrayW+2] ; AX = 2000h mov ax,[arrayW+4] ; AX = 3000h mov eax,[arrayD+4] ; EAX = 00000002h ; Will the following statements assemble? mov ax,[arrayW-2] ; ?? mov eax,[arrayD+16] ; ?? ; What will happen when the above two run?
Yes
Some unkown values are loaded(usually 0)
Out of range.
CSE3030
Write a program that rearranges the values of three doubleword values in the following array as: 3, 1, 2. Step1: copy the first value into EAX and exchange it
with the value in the second position.
Step 2: Exchange EAX with the third array value and copy the value in EAX to the first array position.
Assembly Programming 73
.data arrayD DWORD 1,2,3
mov eax,arrayD xchg eax,[arrayD+4]
xchg eax,[arrayD+8] mov arrayD,eax
CSE3030
We want to write a program that adds the following three bytes:
What is your evaluation of the following code?
How about the following code?
Correct code? Correct code:
Assembly Programming 74
; assemble ; error ; “
.data myBytes BYTE 80h,66h,0A5h
mov al,myBytes add al,[myBytes+1] add al,[myBytes+2]
mov ax,myBytes add ax,[myBytes+1] add ax,[myBytes+2]
; al= 80h ; al=0E6h ; al= 8Bh
ans 80h 0E6h 18Bh
movzx ax,myBytes mov bl,[myBytes+1] add ax,bx mov bl,[myBytes+2] add ax,bx ; AX = sum
bh may not be 0
No! movzx ax,myBytes movzx bx,[myBytes+1] add ax,bx movzx bx,[myBytes+2] add ax,bx
mov bx,0
CSE3030
Addition and Subtraction INC and DEC Instructions ADD and SUB Instructions NEG Instruction Implementing Arithmetic Expressions Flags Affected by Arithmetic
Zero Sign Carry Overflow
Assembly Programming 75
CSE3030
INC and DEC Instructions Add 1, subtract 1 from destination operand
operand may be register or memory INC dest : dest ← dest + 1 DEC dest : dest ← dest – 1 Example
Assembly Programming 76
.data myWord WORD 1000h myDword DWORD 10000000h .code inc myWord ; 1001h dec myWord ; 1000h inc myDword ; 10000001h mov ax,00FFh inc ax ; AX = 0100h mov ax,00FFh inc al ; AX = 0000h
.data myByte BYTE 0FFh, 0 .code mov al,myByte ; AL = mov ah,[myByte+1] ; AH = dec ah ; AH = inc al ; AL = dec ax ; AX =
FFh 00h
FFh 00h FEFFh
CSE3030
ADD and SUB Instructions ADD dest, src : dest ← dest + src SUB dest, src : dest ← dest – src The same operand rules as for the MOV instruction. Example
Assembly Programming 77
.data var1 DWORD 10000h var2 DWORD 20000h .code ; ---EAX--- mov eax,var1 ; 00010000h add eax,var2 ; 00030000h add ax,0FFFFh ; 0003FFFFh add eax,1 ; 00040000h sub ax,1 ; 0004FFFFh
CSE3030
NEG (negate) Instruction Reverses the sign of an operand. Operand can be a register
or memory operand (Internally, SUB 0,operand) Any nonzero operand causes the Carry flag to be set. Example:
Assembly Programming 78
.data valB BYTE 1,0 valC SBYTE -128 .code neg valB ; CF = 1, OF = 0 neg [valB + 1]; CF = 0, OF = 0 neg valC ; CF = 1, OF = 1
.data valB BYTE -1 valW WORD +32767 .code mov al,valB ; AL = -1 neg al ; AL = +1 neg valW ; valW = -32767 mov ax, -32768 ; AX= neg ax ; AX=
8000h 8000h
CSE3030
Implementing Arithmetic Expressions Example : Rval = -Xval + (Yval – Zval)
Example : Rval = Xval - (-Yval + Zval)
Assembly Programming 79
Rval DWORD ? Xval DWORD 26 Yval DWORD 30 Zval DWORD 40 .code mov eax,Xval neg eax ; EAX = -26 mov ebx,Yval sub ebx,Zval ; EBX = -10 add eax,ebx mov Rval,eax ; -36
mov ebx,Yval neg ebx add ebx,Zval mov eax,Xval sub eax,ebx mov Rval,eax
CSE3030
Flags Affected by Arithmetic The ALU has a number of status flags that reflect the
outcome of arithmetic (and bitwise) operations. based on the contents of the destination operand.
Essential flags: Zero flag – destination equals zero. Sign flag – destination is negative. Carry flag – unsigned value out of range. Overflow flag – signed value out of range.
The MOV instruction never affects the flags.
Assembly Programming 80
CSE3030
Key to flag abbreviations In VS2012, right click mouse on the register window and
click flag.
Assembly Programming 81
CSE3030
Concept Map
Assembly Programming 82
status flags
ALU conditional jumps
branching logic
arithmetic & bitwise operations
part of
used by provide attached to
affect
CPU
executes
executes
CSE3030
Zero Flag (ZF) Whenever the destination operand equals Zero, the Zero
flag is set.
Sign Flag (SF) The Sign flag is set when the destination operand is
negative. The flag is clear when the destination is positive.
Assembly Programming 83
mov cx,1 sub cx,1 ; CX = 0, ZF = 1 (set) mov ax,0FFFFh inc ax ; AX = 0, ZF = 1 inc ax ; AX = 1, ZF = 0 (clear)
; SF is a copy of the dest’s msb. mov cx,0 sub cx,1 ; CX = -1, SF = 1 add cx,2 ; CX = 1, SF = 0 mov al,0 sub al,1 ; AL = 11111111b, SF = 1 add al,2 ; AL = 00000001b, SF = 0
CSE3030
Signed and Unsigned Integers : A Hardware View All CPU instructions operate exactly the same on signed
and unsigned integers. The CPU cannot distinguish between signed and unsigned
integers. YOU, the programmer, are solely responsible for using the
correct data type with each instruction.
Assembly Programming 84
CSE3030
Carry Flag (CF) The Carry flag is set when the result of an operation
generates an unsigned value that is out of range (too big or too small for the destination operand).
Assembly Programming 85
mov al,0FFh add al,1 ; CF = 1, AL = 00 ; Try to go below zero: mov al,0 sub al,1 ; CF = 1, AL = FF
mov ax,00FFh add ax,1 ; AX= SF= ZF= CF= sub ax,1 ; AX= SF= ZF= CF= add al,1 ; AL= SF= ZF= CF= mov bh,6Ch add bh,95h ; BH= SF= ZF= CF= mov al,2 sub al,3 ; AL= SF= ZF= CF=
0100h 00FFh 00h
01h
FFh
0 0 0
0
1
0 0 1
0
0
0 0 1
1
1
CSE3030
Overflow Flag (OF) The Overflow flag is set when the signed result of an
operation is invalid or out of range. Example:
Assembly Programming 86
; Example 1 mov al,+127 add al,1 ; OF = 1, AL = ?? ; Example 2 mov al,7Fh ; OF = 1, AL = 80h add al,1
mov al,-128 neg al ; CF = OF = mov ax,8000h add ax,2 ; CF = OF = mov ax,0 sub ax,2 ; CF = OF = mov al,-5 sub al,+125 ; OF =
1
0
1
1
1
0
0
CSE3030
Data-Related Operators and Directives OFFSET Operator PTR Operator TYPE Operator LENGTHOF Operator SIZEOF Operator LABEL Directive
Assembly Programming 87
CSE3030
OFFSET Operator OFFSET returns the distance in bytes, of a label from
the beginning of its enclosing segment Protected mode: 32 bits Real mode: 16 bits
The Protected-mode programs we write only have a single segment (we use the flat memory model).
Example:
Assembly Programming 88
offset
myByte
data segment:
.data bVal BYTE ? wVal WORD ? dVal DWORD ? dVal2 DWORD ? .code mov esi,OFFSET bVal ; ESI = 00404000 mov esi,OFFSET wVal ; ESI = 00404001 mov esi,OFFSET dVal ; ESI = 00404003 mov esi,OFFSET dVal2 ; ESI = 00404007
Assume that the data segment begins at 00404000h.
CSE3030
Relating to C/C++ The value returned by OFFSET is a pointer. Compare
the following two codes written for both C++ and assembly language:
Assembly Programming 89
// C++ version: char array[1000]; char * p = array;
; ASM version .data array BYTE 1000 DUP(?) .code mov esi,OFFSET array ; ESI is p
CSE3030
PTR Operator Overrides the default type of a label (variable). Provides
the flexibility to access part of a variable.
Little Endian Order Multi-byte integers are stored in reverse order, with the least significant byte stored at the lowest address For example, the doubleword 12345678h would be stored as: Assembly Programming 90
.data myDouble DWORD 12345678h .code mov ax,myDouble ; error – why? mov ax,WORD PTR myDouble ; loads 5678h mov ax,WORD PTR myDouble+2 ; loads 1234h mov WORD PTR myDouble, 9999h ; saves 9999h
000078
56
34
12
0001
0002
0003
offsetbyte
CSE3030
PTR Operator Examples
Assembly Programming 91
5678
1234
12345678
.data myDouble DWORD 12345678h
0000 78 56 34 12
0001 0002 0003
offset doubleword word byte myDouble myDouble + 1 myDouble + 2 myDouble + 3
mov al,BYTE PTR myDouble ; AL = 78h mov al,BYTE PTR [myDouble+1] ; AL = 56h mov al,BYTE PTR [myDouble+2] ; AL = 34h mov ax,WORD PTR [myDouble] ; AX = 5678h mov ax,WORD PTR [myDouble+2] ; AX = 1234h
CSE3030
PTR can also be used to combine elements of a smaller data type and move them into a larger operand. The CPU will automatically reverse the bytes.
Example:
Assembly Programming 92
.data myBytes BYTE 12h,34h,56h,78h .code mov ax,WORD PTR [myBytes] ; AX = 3412h mov ax,WORD PTR [myBytes+2] ; AX = 7856h mov eax,DWORD PTR myBytes ; EAX = 78563412h
.data varB BYTE 65h,31h,02h,05h varW WORD 6543h,1202h varD DWORD 12345678h .code mov ax,WORD PTR [varB+2] ; a. mov bl,BYTE PTR varD ; b. mov bl,BYTE PTR [varW+2] ; c. mov ax,WORD PTR [varD+2] ; d. mov eax,DWORD PTR varW ; e.
0502h 78h 02h 1234h 12026543h
CSE3030
TYPE Operator returns the size, in bytes, of a single element of a data declaration.
LENGTHOF Operator counts the number of elements in a single data declaration.
SIZEOF Operator returns a value that is equivalent to multiplying LENGTHOF by TYPE.
Assembly Programming 93
.data var1 BYTE ? var2 WORD ? array1 WORD 30 DUP(?),0,0 ; 32 var4 QWORD ? .code mov eax, TYPE var1 ; 1 mov eax, TYPE var2 ; 2 mov ecx, LENGTHOF array1 ; 32 mov ecx, SIZEOF array1 ; 64
CSE3030
Spanning Multiple Lines A data declaration spans multiple lines if each line (except
the last) ends with a comma. The LENGTHOF and SIZEOF operators include all lines belonging to the declaration.
Assembly Programming 94
.data array WORD 10,20, 30,40 array1 WORD 50,60 WORD 70, 80 .code mov eax,LENGTHOF array ; 4 mov ebx,SIZEOF array ; 8 mov eax,LENGTHOF array1 ; 2 mov ebx,SIZEOF array1 ; 4
CSE3030
LABEL Directive Assigns an alternate label name and type to an existing
storage location LABEL does not allocate any storage of its own Removes the need for the PTR operator
Assembly Programming 95
.data dwList LABEL DWORD wordList LABEL WORD intList BYTE 00h,10h,00h,20h .code mov eax, dwList ; 20001000h mov cx, wordList ; 1000h mov dl, intList ; 00h
CSE3030
Indirect Addressing Indirect Operands Array Sum Example Indexed Operands Pointers
Assembly Programming 96
CSE3030
Indirect Operands An indirect operand holds the address of a variable,
usually an array or string. It can be dereferenced (just like a pointer).
Use PTR when the size of a memory operand is ambiguous.
Assembly Programming 97
.data val1 BYTE 10h,20h,30h myCount WORD 0 .code mov esi,OFFSET val1 mov al,[esi] ; dereference ESI (AL = 10h) inc esi mov al,[esi] ; AL = 20h inc esi mov al,[esi] ; AL = 30h mov esi,OFFSET myCount inc [esi] ; error: ambiguous inc WORD PTR [esi] ; ok (use PTR)
CSE3030
Array Sum Example Indirect operands are ideal for traversing an array. Note
that the register in brackets must be incremented by a value that matches the array type.
Assembly Programming 98
.data arrayW WORD 1000h,2000h,3000h .code
mov esi, OFFSET arrayW mov ax, [esi] add esi, 2 ; or: add esi, TYPE arrayW add ax, [esi] add esi, 2 add ax, [esi] ; AX = sum of the array
CSE3030
Indexed Operands An indexed operand adds a constant to a register to
generate an effective address. There are two notational forms:
Assembly Programming 99
[label + reg] or label[reg]
.data arrayW WORD 1000h,2000h,3000h .code mov esi,0 mov ax, [arrayW + esi] ; AX = 1000h mov ax, arrayW[esi] ; alternate format add esi, 2 add ax, [arrayW + esi] etc.
CSE3030
Pointers You can declare a pointer variable that contains the offset
of another variable.
Assembly Programming 100
.data arrayW WORD 1000h,2000h,3000h ptrW DWORD arrayW .code mov esi,ptrW mov ax,[esi] ; AX = 1000h
Alternate format: ptrW DWORD OFFSET arrayW
CSE3030
JMP and LOOP Instructions JMP Instruction LOOP Instruction LOOP Example Summing an Integer Array Copying a String
Assembly Programming 101
CSE3030
JMP Instruction JMP is an unconditional jump to a label that is usually
within the same procedure. Syntax: JMP target Logic: EIP ← target
Example:
A jump outside the current procedure must be to a special type of label called a global label (see Section 5.5.2.3 for details).
Assembly Programming 102
top: . . jmp top
CSE3030
LOOP Instruction The LOOP instruction creates a counting loop Syntax: LOOP target Logic: ECX ← ECX – 1
if ECX != 0, jump to target Implementation:
The assembler calculates the distance, in bytes, between the offset of the following instruction and the offset of the target label. It is called the relative offset.
The relative offset is added to EIP.
Assembly Programming 103
CSE3030
LOOP Example The following loop calculates the sum of the integers 5 +
4 + 3 +2 + 1:
Assembly Programming 104
00000000 66 B8 0000 mov ax,0 00000004 B9 00000005 mov ecx,5 00000009 66 03 C1 L1: add ax,cx 0000000C E2 FB loop L1 0000000E
offset machine code source code
When LOOP is assembled, the current location = 0000000E (offset of the next instruction). –5 (FBh) is added to the the current location, causing a jump to location 00000009: 00000009 ← 0000000E + FB
CSE3030
If the relative offset is encoded in a single signed byte, what is the largest possible backward jump? what is the largest possible forward jump?
What will be the final value of AX?
How many times will the loop execute?
Assembly Programming 105
-128
+127
mov ax,6 mov ecx,4 L1: inc ax loop L1
mov ecx,0 X2: inc ax loop X2
10
4,294,967,296
CSE3030
Nested Loop If you need to code a loop within a loop, you must
save the outer loop counter's ECX value. In the following example, the outer loop executes 100 times, and the inner loop 20 times.
Assembly Programming 106
.data count DWORD ? .code mov ecx,100 ; set outer loop count L1: mov count,ecx ; save outer loop count mov ecx,20 ; set inner loop count L2: .
. loop L2 ; repeat the inner loop
mov ecx,count ; restore outer loop count loop L1 ; repeat the outer loop
CSE3030
Summing an Integer Array The following code calculates the sum of an array of 16-
bit integers.
Assembly Programming 107
.data intarray WORD 100h,200h,300h,400h .code
mov edi,OFFSET intarray ; address of intarray mov ecx,LENGTHOF intarray ; loop counter mov ax,0 ; zero the accumulator
L1: add ax,[edi] ; add an integer add edi,TYPE intarray ; point to next integer
loop L1 ; repeat until ECX = 0
CSE3030
Copying a String The following code copies a string from source to target:
Rewrite the program shown in the previous slide, using indirect addressing rather than indexed addressing.
Assembly Programming 108
.data source BYTE "This is the source string",0 target BYTE SIZEOF source DUP(0) .code mov esi,0 ; index register mov ecx,SIZEOF source ; loop counter L1: mov al,source[esi] ; get char from source mov target[esi],al ; store it in the target inc esi ; move to next character loop L1 ; repeat for entire string