Post on 13-Dec-2015
Bits and BytesSpring, 2015
Bits and BytesSpring, 2015
TopicsTopics Why bits? Representing information as bits
Binary / Hexadecimal Byte representations
» Numbers» Characters and strings» Instructions
Bit-level manipulations Boolean algebra Expressing in C
COMP 21000Comp Org & Assembly Lang
– 2 – COMP 21000, S’15
How then do we represent data?How then do we represent data?
Each different Each different ““typetype”” in a programming language has in a programming language has its own representation.its own representation.
Next few slides discuss the representation for the Next few slides discuss the representation for the common types:common types: Ints Pointers Floats Strings
– 3 – COMP 21000, S’15
Representing IntegersRepresenting Integers
int A = 15213;int A = 15213;int B = -15213;int B = -15213;long int C = 15213;long int C = 15213;
Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3 B 6 D
6D3B0000
Linux/Alpha A
3B6D
0000
Sun A
93C4FFFF
Linux/Alpha B
C493
FFFF
Sun B
Two’s complement representation(Covered next lecture)
00000000
6D3B0000
Alpha C
3B6D
0000
Sun C
6D3B0000
Linux C
Linux/Win/OS X: little endianSolaris: big endian
– 4 – COMP 21000, S’15
Representing PointersRepresenting Pointersint B = -15213;int B = -15213;int *P = &B;int *P = &B;
Alpha Address
Hex: 1 F F F F F C A 0
Binary: 0001 1111 1111 1111 1111 1111 1100 1010 0000
01000000
A0FCFFFF
Alpha P
Sun Address
Hex: E F F F F B 2 C Binary: 1110 1111 1111 1111 1111 1011 0010 1100
Different compilers & machines assign different locations to objects
FB2C
EFFF
Sun P
FFBF
D4F8
Linux P
Linux Address
Hex: B F F F F 8 D 4 Binary: 1011 1111 1111 1111 1111 1000 1101 0100
Linux/Win/OS X: little endianSolaris: big endian
– 5 – COMP 21000, S’15
Representing FloatsRepresenting Floats
Float F = 15213.0;Float F = 15213.0;
IEEE Single Precision Floating Point Representation
Hex: 4 6 6 D B 4 0 0 Binary: 0100 0110 0110 1101 1011 0100 0000 0000
15213: 1110 1101 1011 01
Not same as integer representation, but consistent across machines
00B46D46
Linux/Win/OS X F
B400
466D
Solaris F
Can see some relation to integer representation, but not obvious
IEEE Single Precision Floating Point Representation
Hex: 4 6 6 D B 4 0 0 Binary: 0100 0110 0110 1101 1011 0100 0000 0000
15213: 1110 1101 1011 01
IEEE Single Precision Floating Point Representation
Hex: 4 6 6 D B 4 0 0 Binary: 0100 0110 0110 1101 1011 0100 0000 0000
15213 1110 1101 1011 01 (in binary)
– 6 – COMP 21000, S’15
char S[6] = "15213";char S[6] = "15213";
Representing StringsRepresenting Strings
Strings in CStrings in C Represented by array of characters Each character encoded in ASCII format
Standard 7-bit encoding of character setCharacter “0” has code 0x30
» Digit i has code 0x30+i String should be null-terminated
Final character = 0
CompatibilityCompatibility Byte ordering not an issue Text files generally platform independent
Except for different conventions of line termination character(s)!» Unix (‘\n’ = 0x0a = ^J) » Mac (‘\r’ = 0x0d = ^M) » DOS and HTTP (‘\r\n’ = 0x0d0a = ^M^J)
Linux/Win/OS X S Solaris S
3231
3135
3300
3231
3135
3300
– 7 – COMP 21000, S’15
ASCII ChartASCII Chart
ASCII ‘0’ is 0x30
Reference:http://jimprice.com/jim-asc.htm
– 8 – COMP 21000, S’15
CharactersCharacters
Other popular encoding standards: Other popular encoding standards: Unicode
16 bit codeAllows more characters: most languages can be encodedSee http://en.wikipedia.org/wiki/UnicodeThe most commonly used encodings are UTF-8, UTF-16 and the
now-obsolete UCS-2UTF-8
» uses one byte for any ASCII character,
» These bytes have the same code values in both UTF-8 and ASCII encoding,
» and up to four bytes for other characters
ISO Latin-88-bit encoding standard with dotted consonants International standard for Latin letters
– 9 – COMP 21000, S’15
Machine-Level Code RepresentationMachine-Level Code Representation
Encode Program as Sequence of InstructionsEncode Program as Sequence of Instructions Each instruction is a simple operation
Arithmetic operationRead or write memoryConditional branch
Instructions encoded as bytesAlpha’s, Sun’s, Mac’s use 4 byte instructions
» Reduced Instruction Set Computer (RISC)PC’s (and intel Mac’s) use variable length instructions
» Complex Instruction Set Computer (CISC)
Different instruction types and encodings for different machines
Most code is not binary compatible
Programs are Byte Sequences Too!Programs are Byte Sequences Too!
– 10 – COMP 21000, S’15
Representing InstructionsRepresenting Instructions
int sum(int x, int y)int sum(int x, int y){{ return x+y;return x+y;}}
Different machines use totally different instructions and encodings
00003042
Alpha sum
0180FA6B
E008
81C3
Solaris sum
90020009
For this example, Alpha & Sun use two 4-byte instructions
Use differing numbers of instructions in other cases
PC uses 7 instructions with lengths 1, 2, and 3 bytes
Same for NT, OSX (intel) and for Linux
NT / Linux/OSX not fully binary compatible
E58B
5589
PC sum
450C03450889EC5DC3
– 11 – COMP 21000, S’15
What do we know now?What do we know now?
A computer is a processor and RAM. A computer is a processor and RAM. Everything else is peripheral
Everything is 1’s and 0’s. Everything is 1’s and 0’s. A processor executes instructions
Instructions are encoded in 1’s and 0’seach instruction has binary representation
RAM stores 1’s and 0’sWe interpret these 1’s and 0’s based on context
Every type is interpreted from binary integers: binary float: special representation in binarychar: binary (ASCII encoding)
– 12 – COMP 21000, S’15
How do we get to bits?How do we get to bits?
We write programs in an abstract language like CWe write programs in an abstract language like C
How do we get to the binary representation?How do we get to the binary representation?
Compilers & Assemblers!Compilers & Assemblers!
We writeWe write
x = 5.x = 5.
The compiler changes it intoThe compiler changes it into
mov 5, 0x005Fmov 5, 0x005F
The assembler changes it into:The assembler changes it into:
80483c7: 8b 45 5F 80483c7: 8b 45 5F
– 13 – COMP 21000, S’15
Manipulating bitsManipulating bits
Since all information is in the form of bits, we must Since all information is in the form of bits, we must manipulate bits when programming at low levelsmanipulate bits when programming at low levels
We did some of this with the show_bytes program. To understand how to do more, we must understand
Boolean algebra We can manipulate bits from C; this is where C and
hardware meet.
– 14 – COMP 21000, S’15
Boolean AlgebraBoolean AlgebraDeveloped by George Boole in 19th CenturyDeveloped by George Boole in 19th Century
Algebraic representation of logicEncode “True” as 1 and “False” as 0
AndAnd A&B = 1 when both A=1 and
B=1
NotNot ~A = 1 when A=0
OrOr A|B = 1 when either A=1 or
B=1
Exclusive-Or (Xor)Exclusive-Or (Xor) A^B = 1 when either A=1 or
B=1, but not both
– 15 – COMP 21000, S’15
A
~A
~B
B
Connection when A&~B | ~A&B
Application of Boolean AlgebraApplication of Boolean Algebra
Applied to Digital Systems by Claude ShannonApplied to Digital Systems by Claude Shannon 1937 MIT Master’s Thesis Reason about networks of relay switches
Encode closed switch as 1, open switch as 0
A&~B
~A&B = A^B
– 16 – COMP 21000, S’15
General Boolean AlgebrasGeneral Boolean Algebras
Operate on Bit VectorsOperate on Bit Vectors
Operations applied bitwise
All of the Properties of Boolean Algebra ApplyAll of the Properties of Boolean Algebra Apply
01101001& 01010101 01000001
01101001| 01010101 01111101
01101001^ 01010101 00111100
~ 01010101 10101010 01000001 01111101 00111100 10101010
– 17 – COMP 21000, S’15
Representing & Manipulating SetsRepresenting & Manipulating Sets
RepresentationRepresentation Width w bit vector represents subsets of {0, …, w–1} aj = 1 if j A
01101001 { 0, 3, 5, 6 }
76543210
01010101 { 0, 2, 4, 6 }
76543210
OperationsOperations & Intersection 01000001 { 0, 6 } | Union 01111101 { 0, 2, 3, 4, 5, 6
} ^ Symmetric difference 00111100 { 2, 3, 4, 5 } ~ Complement 10101010 { 1, 3, 5, 7 }
– 18 – COMP 21000, S’15
Bit-Level Operations in CBit-Level Operations in C
Operations &, |, ~, ^ Available in COperations &, |, ~, ^ Available in C Apply to any “integral” data type
long, int, short, char, unsigned View arguments as bit vectors Arguments applied bit-wise
Examples (Char data type)Examples (Char data type) ~0x41 --> 0xBE
~010000012 --> 101111102
~0x00 --> 0xFF~000000002 --> 111111112
0x69 & 0x55 --> 0x41011010012 & 010101012 --> 010000012
0x69 | 0x55 --> 0x7D011010012 | 010101012 --> 011111012
– 19 – COMP 21000, S’15
Contrast: Logic Operations in CContrast: Logic Operations in C
Contrast to Logical OperatorsContrast to Logical Operators &&, ||, !
View 0 as “False”Anything nonzero as “True”Always return 0 or 1Early termination
Examples (char data type)Examples (char data type) !0x41 --> 0x00 !0x00 --> 0x01 !!0x41 --> 0x01
0x69 && 0x55 --> 0x01 0x69 || 0x55 --> 0x01 p && *p++ (avoids null pointer access)
Or p && (*p = 10) also protects
Works because of short-circuit evaluation in C
– 20 – COMP 21000, S’15
Shift OperationsShift Operations
Left Shift: Left Shift: x << yx << y Shift bit-vector x left y positions
Throw away extra bits on leftFill with 0’s on right
Right Shift: Right Shift: x >> yx >> y Shift bit-vector x right y
positionsThrow away extra bits on right
Logical shiftFill with 0’s on left
Arithmetic shiftReplicate most significant bit on
rightUseful with two’s complement
integer representation
01100010Argument x
00010000<< 3
00011000Log. >> 2
00011000Arith. >> 2
10100010Argument x
00010000<< 3
00101000Log. >> 2
11101000Arith. >> 2
0001000000010000
0001100000011000
0001100000011000
00010000
00101000
11101000
00010000
00101000
11101000
Which right shift does C do? Depends on the compiler!
Often: int do ASR While unsign does LSR
– 21 – COMP 21000, S’15
Effect of LSLEffect of LSL
Each shift doubles the number.Each shift doubles the number.
If the carry bit is 1 after a shift, then overflow (if the number If the carry bit is 1 after a shift, then overflow (if the number is unsigned).is unsigned).
Decimal: shift multiplies by 10.Decimal: shift multiplies by 10.35 << 350
Binary:
0 1 0 1 0 = 1 x 23 + 0 x 22 + 1 x 21 + 0 x 20
1 0 1 0 0 = 1 x 24 + 0 x 23 + 1 x 22 + 0 x 21+ 0 x 20
= 2 x (1 x 23 + 0 x 22 + 1 x 21 + 0 x 20)
– 22 – COMP 21000, S’15
Effect of LSREffect of LSR
Each shift halves the number.Each shift halves the number.
Round down (because you drop the least significant bit)Round down (because you drop the least significant bit)
Decimal: shift divides by 10.Decimal: shift divides by 10.350 >> 35
Binary:
1 0 1 0 0 = 1 x 24 + 0 x 23 + 1 x 22 + 0 x 21+ 0 x 20
0 1 0 1 0 = 1 x 23 + 0 x 22 + 1 x 21 + 0 x 20
= (1 x 24 + x 23 + 0 x 22 + 1 x 21 + 0 x 20) ÷ 2
– 23 – COMP 21000, S’15
TruncatingTruncating
Casting to a shorter form Casting to a shorter form truncatestruncates the number. the number.int x = 53191;int x = 53191;
short sx = (short) x;short sx = (short) x; /* -12345 *//* -12345 */
int y = sx;int y = sx; /* -12345*//* -12345*/
On a typical 32-bit machine:On a typical 32-bit machine: Casting int to short truncate 32-bit to 16-bits When cast back to 32 bits, do sign extension (see next
lecture) so value doesn’t change.
– 24 – COMP 21000, S’15
MasksMasks
Can use bit patterns as masksCan use bit patterns as masks
ExampleExample
Mask = 0xFFFFFF00Mask = 0xFFFFFF00 when used with & filters out when used with & filters out (i.e., zeros out) the last byte(i.e., zeros out) the last byte
Mask = 0x000000FFMask = 0x000000FF when used with | places 1when used with | places 1’’s s in the last byte, rest is unaffected.in the last byte, rest is unaffected.
– 25 – COMP 21000, S’15
MasksMasks
Can perform a mod by a power of 2 with a maskCan perform a mod by a power of 2 with a mask
Examples:Examples:
x % 2 = 0 (x is even) or 1 (x is odd)x % 2 = 0 (x is even) or 1 (x is odd)
1111 = 1 x 21111 = 1 x 233 + 1 x 2 + 1 x 222 + 1 x 2 + 1 x 211 + 1 x 2 + 1 x 200 : all places : all places except the except the lastlast are divisible by 2. So if the last bit is are divisible by 2. So if the last bit is 0, mod is 0. If last bit is 1, mod is 10, mod is 0. If last bit is 1, mod is 1
x % 4 x % 4
1111 = 1 x 21111 = 1 x 233 + 1 x 2 + 1 x 222 + 1 x 2 + 1 x 211 + 1 x 2 + 1 x 200 : all places : all places except the except the last twolast two are divisible by 4. So if the last are divisible by 4. So if the last two bits are 0, mod is 0. Otherwise the mod is two bits are 0, mod is 0. Otherwise the mod is whatever value the last two bits havewhatever value the last two bits have
– 26 – COMP 21000, S’15
MasksMasks
Can perform a mod by a power of 2 with a maskCan perform a mod by a power of 2 with a mask
Examples:Examples:
x % 8 = 0 x % 8 = 0
1111 = 1 x 21111 = 1 x 233 + 1 x 2 + 1 x 222 + 1 x 2 + 1 x 211 + 1 x 2 + 1 x 200 : all places : all places except the except the last threelast three are divisible by 8. So if the last are divisible by 8. So if the last three bits are 0, mod is 0. . Otherwise the mod is the three bits are 0, mod is 0. . Otherwise the mod is the value in the last three bitsvalue in the last three bits
soso
x % 2 == x & 00000001 (or 0x01)x % 2 == x & 00000001 (or 0x01)
x % 4 == x & 00000011 (or 0x03)x % 4 == x & 00000011 (or 0x03)
x % 8 == x & 00000111 (or 0x07)x % 8 == x & 00000111 (or 0x07)
– 27 – COMP 21000, S’15
Converting (revisited)Converting (revisited)
#include <stdlib.h>#define LOWDIGIT 07#define DIG 20
void octal_print(register int x){ register int i = 0; char s[DIG]; if ( x < 0 ){ /* treat negative x */ putchar ('-'); /* output a minus sign */ x = - x; } do{
/* mod performed with & */ s[i++] = ((x & LOWDIGIT) + '0'); } while ( ( x >>= 3) > 0 ); /* divide x by 8 via shift */ putchar('0'); /* octal number prefix */ do{ putchar(s[--i]); /* output characters on s */ } while ( i > 0 ); putchar ('\n');}
– 28 – COMP 21000, S’15
Converting to binaryConverting to binary
void printBits(unsigned int word, int nBits){void printBits(unsigned int word, int nBits){
for (int i=nBits-1;i>=0;i--){for (int i=nBits-1;i>=0;i--){
printf("%u", !!(word&(1<<i)));printf("%u", !!(word&(1<<i)));
}}
printf("\n");printf("\n");
}}
– 29 – COMP 21000, S’15
Cool Stuff with XorCool Stuff with Xor
void funny(int *x, int *y)void funny(int *x, int *y){{ *x = *x ^ *y; /* #1 */*x = *x ^ *y; /* #1 */ *y = *x ^ *y; /* #2 */*y = *x ^ *y; /* #2 */ *x = *x ^ *y; /* #3 */*x = *x ^ *y; /* #3 */}}
Bitwise Xor is form of addition
With extra property that every value is its own additive inverse
A ^ A = 0
BABegin
BA^B1
(A^B)^B = AA^B2
A(A^B)^A = B3
ABEnd
*y*x
– 30 – COMP 21000, S’15
Bit techniquesBit techniquesINTEGER CODING RULES:INTEGER CODING RULES:
Replace the "return" statement in each function with oneReplace the "return" statement in each function with one
or more lines of C code that implements the function. Your code or more lines of C code that implements the function. Your code
must conform to the following style:must conform to the following style:
int Funct(arg1, arg2, ...) {int Funct(arg1, arg2, ...) {
/* brief description of how your implementation works *//* brief description of how your implementation works */
int var1 = Expr1;int var1 = Expr1;
......
int varM = ExprM;int varM = ExprM;
varJ = ExprJ;varJ = ExprJ;
......
varN = ExprN;varN = ExprN;
return ExprR;return ExprR;
}}
– 31 – COMP 21000, S’15
Bit techniquesBit techniques Each "Expr" is an expression using ONLY the following:Each "Expr" is an expression using ONLY the following:
1. Integer constants 0 through 255 (0xFF), inclusive. You are1. Integer constants 0 through 255 (0xFF), inclusive. You are
not allowed to use big constants such as 0xffffffff.not allowed to use big constants such as 0xffffffff.
2. Function arguments and local variables (no global variables).2. Function arguments and local variables (no global variables).
3. Unary integer operations ! ~3. Unary integer operations ! ~
4. Binary integer operations & ^ | + << >>4. Binary integer operations & ^ | + << >>
Some of the problems restrict the set of allowed operators even further.Some of the problems restrict the set of allowed operators even further.
Each "Expr" may consist of multiple operators. You are not restricted toEach "Expr" may consist of multiple operators. You are not restricted to
one operator per line.one operator per line.
You are expressly You are expressly forbidden forbidden to:to:
1. Use any control constructs such as if, do, while, for, switch, etc.1. Use any control constructs such as if, do, while, for, switch, etc.
2. Define or use any macros.2. Define or use any macros.
3. Define any additional functions in this file.3. Define any additional functions in this file.
4. Call any functions.4. Call any functions.
5. Use any other operations, such as &&, ||, -, or ?:5. Use any other operations, such as &&, ||, -, or ?:
6. Use any form of casting.6. Use any form of casting.
7. Use any data type other than int. This implies that you7. Use any data type other than int. This implies that you
cannot use arrays, structs, or unions.cannot use arrays, structs, or unions.
– 32 – COMP 21000, S’15
Bit techniquesBit techniques/* pow2plus1 - returns 2^x + 1, where 0 <= x <= 31 *//* pow2plus1 - returns 2^x + 1, where 0 <= x <= 31 */
int pow2plus1(int x) {int pow2plus1(int x) {
/* exploit ability of shifts to compute powers of 2 *//* exploit ability of shifts to compute powers of 2 */
return return (1 << x) + 1;(1 << x) + 1;
}}
/* pow2plus4 - returns 2^x + 4, where 0 <= x <= 31 *//* pow2plus4 - returns 2^x + 4, where 0 <= x <= 31 */
int pow2plus4(int x) {int pow2plus4(int x) {
/* exploit ability of shifts to compute powers of 2 *//* exploit ability of shifts to compute powers of 2 */
int result = (1 << x);int result = (1 << x);
result += 4;result += 4;
return result;return result;
}}
– 33 – COMP 21000, S’15
Bit techniquesBit techniques/ * bitNor - ~(x|y) using only ~ and & / * bitNor - ~(x|y) using only ~ and & * Example: bitNor(0x6, 0x5) = 0xFFFFFFF8* Example: bitNor(0x6, 0x5) = 0xFFFFFFF8 * Legal ops: ~ &* Legal ops: ~ & * Max ops: 8* Max ops: 8 * Rating: 1* Rating: 1 */*/
int bitNor(int x, int y) {int bitNor(int x, int y) {
}}/* evenBits - return word with all even-numbered bits set to 1/* evenBits - return word with all even-numbered bits set to 1 * Legal ops: ! ~ & ^ | + << >>* Legal ops: ! ~ & ^ | + << >> * Max ops: 8* Max ops: 8 * Rating: 1* Rating: 1 */*/
int evenBits(void) {int evenBits(void) {
}}
return (~x & ~y)
int byte = 0x55;int word = byte | byte<<8;return word | word<<16;
use DeMorgan’s Law: ~(A & B) = (~A) | (~B)
And~(A | B) = (~A) & (~B)
use DeMorgan’s Law: ~(A & B) = (~A) | (~B)
And~(A | B) = (~A) & (~B)
– 34 – COMP 21000, S’15
Some Other Uses for BitvectorsSome Other Uses for Bitvectors
Representation of small setsRepresentation of small sets
Representation of polynomials:Representation of polynomials: Important for error correcting codes (see later slides) Arithmetic over finite fields, say GF(2^n) Example 0x15213 : x16 + x14 + x12 + x9 + x4 + x + 1
Representation of graphs (intersection of sys & algo)Representation of graphs (intersection of sys & algo) A ‘1’ represents the presence of an edge
Representation of bitmap images, icons, cursors, …Representation of bitmap images, icons, cursors, … Exclusive-or cursor patent (see later slides)
Representation of Boolean expressions and logic Representation of Boolean expressions and logic circuitscircuits
– 35 – COMP 21000, S’15
UDPUDP
The User Datagram Protocol (UDP) is one of the core The User Datagram Protocol (UDP) is one of the core members of the Internet protocol suite, the set of network members of the Internet protocol suite, the set of network protocols used for the Internet. With UDP, computer protocols used for the Internet. With UDP, computer applications can send messages, in this case referred to applications can send messages, in this case referred to as datagrams, to other hosts on an Internet Protocol (IP) as datagrams, to other hosts on an Internet Protocol (IP) network without prior communications to set up special network without prior communications to set up special transmission channels or data paths. The protocol was transmission channels or data paths. The protocol was designed by David P. Reed in 1980 and formally defined designed by David P. Reed in 1980 and formally defined in RFC 768.in RFC 768.
http://en.wikipedia.org/wiki/http://en.wikipedia.org/wiki/User_Datagram_ProtocolUser_Datagram_Protocol
– 36 – COMP 21000, S’15Transport Layer 3-36
UDP: User Datagram Protocol [RFC 768]UDP: User Datagram Protocol [RFC 768]
““no frills,no frills,”” ““bare bonesbare bones”” Internet transport protocolInternet transport protocol
““best effortbest effort”” service, UDP service, UDP segments may be:segments may be:
lost delivered out of order
to app
connectionless:connectionless: no handshaking
between UDP sender, receiver
each UDP segment handled independently of others
Why is there a UDP?Why is there a UDP?
no connection no connection establishment establishment (TCP: 3-(TCP: 3-way handshake)way handshake)
simple: simple: no connection no connection state state at sender, receiver at sender, receiver (TCP keeps buffers, (TCP keeps buffers, sequence no., ack, etc.)sequence no., ack, etc.)
small segment header small segment header (20 (20 bytes for TCP, 8 for UDP)bytes for TCP, 8 for UDP)
no congestion controlno congestion control: UDP : UDP can blast away as fast as can blast away as fast as desireddesired
With UDP applications basically talk directly to IP
– 37 – COMP 21000, S’15Transport Layer
UDP checksumUDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Result:UDP does error detection not error correction
1. may just discard a segment with errors 2. may pass it to the application with a warning.
– 38 – COMP 21000, S’15Transport Layer
UDP checksumUDP checksum
Sender:Sender:
treat segment contents treat segment contents as sequence of 16-bit as sequence of 16-bit integersintegers
checksum: addition (1checksum: addition (1’’s s complement sum) of complement sum) of segment contentssegment contents
sender puts checksum sender puts checksum value into UDP value into UDP checksum fieldchecksum field
Receiver:Receiver:
compute checksum of compute checksum of received segmentreceived segment
check if computed checksum check if computed checksum equals checksum field value:equals checksum field value:
NO - error detected YES - no error detected.
But maybe errors nonetheless? More later ….
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
– 39 – COMP 21000, S’15Transport Layer
Internet Checksum ExampleInternet Checksum ExampleNoteNote
When adding numbers, a carryout from the most significant bit needs to be added to the result
sendersender: add two 16-bit integers: add two 16-bit integers1 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 01 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 01 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
wraparound
sum
Checksum (complement the sum)
– 40 – COMP 21000, S’15Transport Layer
Internet Checksum ExampleInternet Checksum ExampleNoteNote
When adding numbers, a carryout from the most significant bit needs to be added to the result
receiverreceiver: add two 16-bit integers plus the : add two 16-bit integers plus the
checksum: should get all 1checksum: should get all 1’’s (why?)s (why?)1 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 01 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
wraparound
sumChecksum Result
For an explanation of 1’s complement addition see http://mathforum.org/library/drmath/view/54379.html
For an explanation of 1’s complement addition see http://mathforum.org/library/drmath/view/54379.html
Do
XO
R o
f su
m &
ch
ecks
um
; sh
ou
ld g
et a
ll 1’
sD
o X
OR
of
sum
& c
hec
ksu
m;
sho
uld
get
all
1’s
– 41 – COMP 21000, S’15
UDP checksum code (part 1)UDP checksum code (part 1)typedef unsigned short u16;typedef unsigned short u16;
typedef unsigned long u32;typedef unsigned long u32;
u16 udp_sum_calc(u16 len_udp, u16 src_addr[],u16 dest_addr[], BOOL padding, u16 buff[])u16 udp_sum_calc(u16 len_udp, u16 src_addr[],u16 dest_addr[], BOOL padding, u16 buff[])
{ u16 prot_udp=17;{ u16 prot_udp=17;
u16 padd=0;u16 padd=0;
u16 word16;u16 word16;
u32 sum;u32 sum;
// Find out if the length of data is even or odd number. If odd,// Find out if the length of data is even or odd number. If odd,
// add a padding byte = 0 at the end of packet// add a padding byte = 0 at the end of packet
if (padding&1==1){if (padding&1==1){
padd=1;padd=1;
buff[len_udp]=0;buff[len_udp]=0;
}}
//initialize sum to zero//initialize sum to zero
sum=0;sum=0;
See http://www.netfor2.com/udpsum.htm for the full code.See http://www.netfor2.com/udpsum.htm for the full code.
// make 16 bit words out of every two adjacent 8 bit words and // calculate the sum of all 16 vit words
// make 16 bit words out of every two adjacent 8 bit words and // calculate the sum of all 16 vit words
– 42 – COMP 21000, S’15
UDP checksum code (part 2)UDP checksum code (part 2)for (i=0; i < len_udp+padd; i=i+2){for (i=0; i < len_udp+padd; i=i+2){
word16 =((buff[i]<<8)&0xFF00)+(buff[i+1]&0xFF);word16 =((buff[i]<<8)&0xFF00)+(buff[i+1]&0xFF);
sum = sum + (unsigned long)word16;sum = sum + (unsigned long)word16;
}}
// add the UDP pseudo header which contains the IP source and destinationn // add the UDP pseudo header which contains the IP source and destinationn addressesaddresses
for (i=0;i<4;i=i+2){for (i=0;i<4;i=i+2){
word16 =((src_addr[i]<<8)&0xFF00)+(src_addr[i+1]&0xFF);word16 =((src_addr[i]<<8)&0xFF00)+(src_addr[i+1]&0xFF);
sum=sum+word16;sum=sum+word16;
}}
for (i=0;i<4;i=i+2){for (i=0;i<4;i=i+2){
word16 =((dest_addr[i]<<8)&0xFF00)+(dest_addr[i+1]&0xFF);word16 =((dest_addr[i]<<8)&0xFF00)+(dest_addr[i+1]&0xFF);
sum=sum+word16; sum=sum+word16;
}}See http://www.netfor2.com/udpsum.htm for the full code.See http://www.netfor2.com/udpsum.htm for the full code.
// make 16 bit words out of every two adjacent 8 bit words and // calculate the sum of all 16 vit words
// make 16 bit words out of every two adjacent 8 bit words and // calculate the sum of all 16 vit words
// overflow bits are added on the next slide// overflow bits are added on the next slide
– 43 – COMP 21000, S’15
UDP checksum code (part 3)UDP checksum code (part 3)// the protocol number and the length of the UDP packet// the protocol number and the length of the UDP packet
sum = sum + prot_udp + len_udp;sum = sum + prot_udp + len_udp;
// keep only the last 16 bits of the 32 bit calculated sum and add the carries// keep only the last 16 bits of the 32 bit calculated sum and add the carries
while (sum>>16)while (sum>>16)
sum = (sum & 0xFFFF)+(sum >> 16);sum = (sum & 0xFFFF)+(sum >> 16);
// Take the one's complement of sum// Take the one's complement of sum
sum = ~sum;sum = ~sum;
return ((u16) sum);return ((u16) sum);
}}
See http://www.netfor2.com/udpsum.htm for the full code.See http://www.netfor2.com/udpsum.htm for the full code.
/* here’s where the carry’s are added in. Turns out you can save the overflow bits and add them back at the end */
/* here’s where the carry’s are added in. Turns out you can save the overflow bits and add them back at the end */
– 44 – COMP 21000, S’15
XOR patent disputeXOR patent disputefor (y=0; y < y_height; ++y) {for (y=0; y < y_height; ++y) {
int loc = (y+y_off)*stride + x_off;int loc = (y+y_off)*stride + x_off;
vidmem[loc] ^= 255; /* XOR */vidmem[loc] ^= 255; /* XOR */
}}
// another way of doing this:// another way of doing this:
for (y=0; y < y_height; ++y) {for (y=0; y < y_height; ++y) {
int loc = (y+y_off)*stride + x_off;int loc = (y+y_off)*stride + x_off;
for (x=0; x < 8; ++x)for (x=0; x < 8; ++x)
if (vidmem[loc] & (1 << x))if (vidmem[loc] & (1 << x))
vidmem[loc] &= ~(1 << x);vidmem[loc] &= ~(1 << x);
elseelse
vidmem[loc] |= (1 << x);vidmem[loc] |= (1 << x);
}} See http://nothings.org/computer/patents.html for a full discussionSee http://nothings.org/computer/patents.html for a full discussion
Closely related is a patent which apparently covers results, not process. The exclusive-or-cursor patent is a simple example of this.
There are a number of mathematically equivalent ways of expressing the xor algorithm, but it appears to be widely believed that all of them are covered by the patent. The patent apparently is understood to cover the idea of representing an onscreen cursor by complementing each "visible" pixel of the cursor with the contents of whatever is on screen.
This is somewhat surprising, since it is predated by hardware implementations that do the exact same thing for text-only modes--rendering a block or underline cursor so its appearance is the same as above.
Closely related is a patent which apparently covers results, not process. The exclusive-or-cursor patent is a simple example of this.
There are a number of mathematically equivalent ways of expressing the xor algorithm, but it appears to be widely believed that all of them are covered by the patent. The patent apparently is understood to cover the idea of representing an onscreen cursor by complementing each "visible" pixel of the cursor with the contents of whatever is on screen.
This is somewhat surprising, since it is predated by hardware implementations that do the exact same thing for text-only modes--rendering a block or underline cursor so its appearance is the same as above.
– 45 – COMP 21000, S’15
More Bitvector MagicMore Bitvector Magic
Count the number of 1’s in a wordCount the number of 1’s in a wordMIT Hackmem 169:
int bitcount(unsigned int n){ unsigned int tmp;
tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111); return ((tmp + (tmp >> 3)) & 030707070707)%63;}
MIT Hackmem Count. The main idea.1. Consider a 3 bit unsigned number as being 4a+2b+c,
• example: consider the three bit number 101• then the unsigned number represents 1 x 22 + 0 x 21 + 1 x 20 = 1 x 4 + 0 x 2 + 1 • let “a” represent the leftmost bit (the 1), “b” the next bit (the 0) and c the rightmost bit (the rightmost 1).• then we get a x 4 + b x 2 + c or 4a+2b+c
2. If we shift the number right 1 bit, we get 010. If we follow the convention of using “abc”
for the bits, then after the shift (logical) we get 0ab or 0 x 22 + a x 21 + b x 20 = 2a+b. 3. Subtracting this shifted number (2a+b) from the original number (4a+2b+c) gives 2a+b+c. 4. If we right-shift the original 3-bit number by two bits, we go from “abc” to “00a” which is
just a x 20
5. If we subract this new shifted number (a) from the result of 3, we get 2a+b+c – a = a+b+c, 6. Since a, b, and c represent the bits (1, 0, and 1 in our example), a+b+c is just the number
of bits in the original number!
– 46 – COMP 21000, S’15
More Bitvector MagicMore Bitvector Magic
Count the number of 1’s in a wordCount the number of 1’s in a wordMIT Hackmem 169:
int bitcount(unsigned int n){ unsigned int tmp;
tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111); return ((tmp + (tmp >> 3)) & 030707070707)%63;}
MIT Hackmem Count. (continued) How is this insight employed in the code? 1. The key insight is that we’re actually going to consider the parameter n to be composed of
11 groups of 3 (3 x 11 = 33, or one more bit than we actually have in a 32-bit number).2. So we’re going to do the algorithm from the previous page one each three bit group of bits.3. Consider the bits 101 111 010 To apply the previous algorithm on each group we want to:
a. (101 – 010 – 001) (111 – 011 – 001) (010 – 001 – 000)b. If we shift n we get: 010 111 101 Note that the shifted number in each position
has a “wrong” first bit (it’s the shifted bit from the previous group) but the correct last two digits.c. We know that the first bit must be 0 since we’re doing logical shift rightd. So we can take the result of shifting and mask out the first bit in each group.e. Since each group can be represented by an octal number, the mask for each
group will be 3 in octal (011). This explains the line ((n >> 1) & 033333333333). Note that a number beginning in “0” in C is taken as an octal number.
f. A similar explanation can be made for the next part of the expression, i.e., to shift each group twice, shift n right twice and mask with 001 (or octal 1) for each group.
Now each octal digit in the number n represents the number of 1’s the original number in that position.In our example, the 9 digits become
– 47 – COMP 21000, S’15
More Bitvector MagicMore Bitvector Magic
Count the number of 1’s in a wordCount the number of 1’s in a wordMIT Hackmem 169:
int bitcount(unsigned int n){ unsigned int tmp;
tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111); return ((tmp + (tmp >> 3)) & 030707070707)%63;}
MIT Hackmem Count. (continued) The last return statement sums these octal digits to produce the final answer.
The key idea is to add adjacent pairs of octal digits together and then compute the remainder modulus 63.
1. Right-shifting tmp by three bits, 2. adding it to tmp itself and 3. ANDing with a suitable mask. This step saves every other group of 3 bits. We want this
because we’ve added each 3-bit group with its right neighbor group; if we save every group we will have counted each group twice.
4. The result is a number in which groups of six adjacent bits (starting from the LSB) contain the number of 1's among those six positions in n.
5. This number modulo 63 yields the final answer.
For 64-bit numbers, we would have to add triples of octal digits and use modulus 1023. This is HACKMEM 169, as used in X11 sources.
Source: MIT AI Lab memo, late 1970's.
– 48 – COMP 21000, S’15
Summary of the Main PointsSummary of the Main Points
It’s All About Bits & BytesIt’s All About Bits & Bytes Numbers Programs Text
Different Machines Follow Different Conventions forDifferent Machines Follow Different Conventions for Word size Byte ordering Representations
Boolean Algebra is the Mathematical BasisBoolean Algebra is the Mathematical Basis Basic form encodes “false” as 0, “true” as 1 General form like bit-level operations in C
Good for representing & manipulating sets