CSCI 243: The Mechanics of Programming
TVF / RIT 20195
Basics of C
TVF / RIT 20195 CS243: Basics of C
The Plan
• Previous courses:
  • Python
  • Java
• Lots of similarities between languages
• How to learn a new one?
  • Patterns!

Basic Numeric Types
• Variable declaration syntax:
    [ specifier ] type var1, var2, ...;
Type      Description/Size
void      “No type”, no storage
_Bool     Boolean integer holding 0 or 1 (occupies at least one byte of storage)
char      Smallest integer type, at least 8 bits (typically exactly 8)
int       “Natural sized” signed integer, at least 16 bits (typically 32-bit)
float     Single-precision floating point, ≥ 6-digit precision
double    Double-precision floating point, ≥ 10-digit precision

Type Specifiers
• Alter type characteristics
• May alter storage requirements
• May use more than one to get compound effects
• Basic categories:
  • Sign
  • Storage size
  • Access/scope

Type Specifiers: Sign
• signed, unsigned
  • Positive/negative vs. non-negative
• Usable with: int, char
• Defaults:
  • int: signed
  • char: implementation-dependent (may be signed or unsigned)
• To guarantee an unsigned char, you must write it explicitly:
  • unsigned char

Type Specifiers: Storage Size
• short, long, long long
  • Amount of storage used to represent the value
• All applicable to int; long also applicable to double
  • short int: ≥ 16 bits
  • long int: ≥ 32 bits
  • long long int: ≥ 64 bits
  • long double: ≥ 10 digits of precision

Type Specifiers: Access/Scope
• static
  • Global vs. local access
  • Persistence of storage
• Can be applied to all types

Literal Constants
• Boolean: 0 or 1
• Character: 'c'
  • Single character
  • Escape character
• Integer: [sign][base]d[d*][suffix]
  • Optional sign: +, -
  • Optional base prefix: 0 (octal), 0x (hexadecimal)
  • One or more digits (legal according to base)
  • Optional suffix: U, L, UL, LL
• Float/double: [s][d*].[d*][f], [s][d*].[d*][e[s]i]
  • Optional sign
  • Zero or more digits, decimal point, zero or more digits
  • Optional integer exponent

Arithmetic Expressions
• Standard operators, standard precedence/associativity
• Subexpression: ( )
  • Left to right
• Unary: + - ++ --
  • Right to left
• Multiplicative: * / %
  • Left to right
• Additive: + -
  • Left to right

Implicit/Automatic Type Conversion
• A.k.a. coercion
• Legend: f = float, d = double, i = integer, c = character
  • Similarly for int variants (short, long)
Expression        Conversion/Interpretation
i = c             Promote char to integer
c = i             Truncate integer to char
i = f             Truncate float to integer
f = i             Promote integer to float
i op i            Integer operation
i op f, f op i    Promote integer to float, floating pt. operation
f op d, d op f    Promote float to double, double operation

Explicit Type Conversion
• Called type casting (or just casting)
• Unary prefix operator:
    (type) expression
• Causes “conversion” to specified type
  • May or may not involve an actual change in representation
  • If there is a change, may lose information (e.g., float to int)

Full List of C Operators
Operators (highest precedence first)           Associativity
() [] . ->                                     left-to-right
++ -- + - ! ~ (type) * & sizeof                right-to-left
* / %                                          left-to-right
+ -                                            left-to-right
<< >>                                          left-to-right
< <= > >=                                      left-to-right
!= ==                                          left-to-right
&                                              left-to-right
^                                              left-to-right
|                                              left-to-right
&&                                             left-to-right
||                                             left-to-right
?:                                             right-to-left
= += -= *= /= %= &= ^= |= <<= >>=              right-to-left
,                                              left-to-right

Assignment
• Expression:
    lvalue = rvalue
  – That’s a lowercase ‘el’, not a digit ‘1’
  – Evaluates to the value assigned to the LHS
• Statement:
    lvalue = rvalue;
• Augmented:
    lvalue op= rvalue
• Equivalent (except that lvalue is evaluated only once in the augmented form):
    lvalue = lvalue op ( rvalue )

Assignment
• Assignment operator produces a result value
  • → assignment is an expression, not a statement!
  • Result: value that was assigned
• Low precedence (below all arithmetic operators)
• Right-to-left associativity:
    lvalue = lvalue = lvalue = rvalue
  is evaluated as
    lvalue = (lvalue = (lvalue = rvalue))

Initialization
• Follow declaration with “assignment”:
    type var1 = val1;
    type var2 = val2;
• Globals:
  • Initialization done once
  • No initialization → 0 (for numeric values)
• Locals:
  • “Initial assignment”, done on each entry to function
  • No initialization → indeterminate value (whatever happens to be in that storage)

Aggregate Data Types
• Arrays
  • Homogeneous collection of elements of some base type
  • Can be multi-dimensional
  • Constant dimension
    type name [ dimension ];
    type name [d1] [d2] ...;
• Initialization:
    type var[N+1] = { v0, v1, v2, …, vN };
    type var[N+1][M+1] = {
        { v00, v01, v02, …, v0M },
        { v10, v11, v12, …, v1M },
        . . .
        { vN0, vN1, vN2, …, vNM }
    };

Strings
• No such type!
• All manipulations done with arrays of char
  • String literal → literal array of char
  • Critical component: trailing NUL (zero) byte, identifying the end of the “string”
• Classic operations:
  • Length
  • Copy
  • Concatenate

String Manipulation – length.c
    //
    // length(data) calculate the length of the string 'data'
    //
    // length is defined as the number of characters in the string
    // up to but not including the trailing NUL character
    //
    int length( char data[] ) {
        int num;

        // because array indices are based at 0, the length
        // of a string is also the index of the NUL character,
        // so all we have to do is locate the NUL
        num = 0;
        while( data[num] ) {
            ++num;
        }

        return( num );
    }

String Manipulation – copy.c
    //
    // copy(dst,src) copy 'src' into 'dst'
    //
    int copy( char dst[], char src[] ) {
        int num;

        // iterate through all the characters in 'src',
        // copying each into 'dst', including the trailing NUL
        //
        // the assignment is the loop test: its value is the character
        // just copied, so the loop stops after the NUL has been copied
        num = 0;
        while( (dst[num] = src[num]) != '\0' ) {
            ++num;
        }

        // return the new length of 'dst'
        return( num );
    }

String Manipulation – concat.c
    //
    // concat(dst,src) append a copy of 'src' to
    // the existing contents of 'dst'
    //
    int concat( char dst[], char src[] ) {
        int num1, num2;

        // locate the NUL character at the end of 'dst'
        num1 = 0;
        while( dst[num1] ) {
            ++num1;
        }

        // append characters from 'src' to 'dst', including the NUL
        num2 = 0;
        while( (dst[num1] = src[num2]) != '\0' ) {
            ++num1;
            ++num2;
        }

        // return the new length of 'dst'
        return( num1 );
    }

String Manipulation
• The C string library: <string.h>
• Forms of functions:
  • strn*: sequences of non-NUL characters
    • All have a character count argument
  • str*: sequences of characters terminated by a NUL

Copying Functions
    char *strcpy( char dst[], const char src[] );
• Copies string from src into dst including the trailing NUL
• No buffer overrun protection!
    char *strncpy( char dst[], const char src[], size_t n );
• Copies up to n characters from src to dst (stops after a NUL)
• If src is shorter than n characters, pads dst with NUL characters until n characters have been written
• If there is no NUL in the first n characters of src, the result is not NUL-terminated

Concatenation Functions
    char *strcat( char dst[], const char src[] );
• Appends src to dst including the trailing NUL
• First copied character overwrites trailing NUL of dst
• No buffer overrun protection!
    char *strncat( char dst[], const char src[], size_t n );
• Appends up to n characters from src to dst (stops at a NUL in src)
• Always appends a trailing NUL character, so up to n+1 characters may be written

Comparison Functions
    int strcmp( const char s1[], const char s2[] );
    int strncmp( const char s1[], const char s2[], size_t n );
• Compare characters from s1 to characters from s2
• Stops at first difference, at first NUL, or (strncmp only) after n characters have been compared
• Return:
  • < 0 if s1 < s2
  • 0 if s1 == s2
  • > 0 if s1 > s2

Search and Length Functions
    char *strchr( const char s[], int c );
    char *strrchr( const char s[], int c );
• Examines s looking for c
  • strchr(): starts at first character in s
  • strrchr(): starts at last character in s
  • If found, returns a pointer to that character; else, returns 0 (a null pointer)
    size_t strlen( const char s[] );
• Returns the number of characters before the trailing NUL

Command-Line Arguments
• Supplied as parameters to main() function
• Standard prototypes for main():
    int main( void );
    int main( int argc, char *argv[] );
• Argument count (argc), argument vector (argv)
  • Count includes command name
  • Vector contains argc strings argv[0], argv[1], …, argv[argc-1]
  • and a last entry, argv[argc], which is always a null pointer

Command-Line Arguments
• Argument vector is an array of strings
  • I.e., an array of pointers to NUL-terminated arrays of char
• Many ways to access them
    #include <stdio.h>

    int main( int argc, char *argv[] ) {

        for( int i = 0; i < argc; ++i ) {
            printf( "%3d: %s\n", i, argv[i] );
        }

        return( 0 );
    }

Accessing Command-Line Arguments
    #include <stdio.h>

    int main( int argc, char *argv[] ) {
        int i = 0;

        // argv[argc] is a null pointer, which tests as false
        while( argv[i] ) {
            printf( "%3d: %s\n", i, argv[i] );
            ++i;
        }

        return( 0 );
    }

    #include <stdio.h>

    int main( int argc, char *argv[] ) {

        for( int i = 0; argv[i] != NULL; ++i ) {
            printf( "%3d: %s\n", i, argv[i] );
        }

        return( 0 );
    }

Accessing Command-Line Arguments
    #include <stdio.h>

    int main( int argc, char *argv[] ) {
        int i = 0;

        // count argc down to zero; without the decrement the loop never ends
        while( argc-- > 0 ) {
            printf( "%3d: %s\n", i, argv[i] );
            ++i;
        }

        return( 0 );
    }

Accessing Command-Line Arguments
    #include <stdio.h>

    int main( int argc, char *argv[] ) {

        // print each argument one character at a time
        for( int i = 0; i < argc; ++i ) {
            for( int j = 0; argv[i][j]; ++j ) {
                putchar( argv[i][j] );
            }
            putchar( '\n' );
        }

        return( 0 );
    }

Environment Variables
• Third argument to main routine (a common UNIX extension, not part of standard C)
• Similar to argument vector
  • Array of character pointers
  • NULL-terminated
• Environment variable strings have this form:
    NAME=value
• Some examples:
    PATH=/usr/bin:/bin:/usr/sbin:/sbin
    HOME=/home/fac/wrc
    SHELL=/bin/bash
    USER=wrc
    PWD=/home/fac/wrc/Courses/CS243/Homeworks/2
    LOGNAME=wrc

Environment Variables
• Accessing:
    #include <stdio.h>

    int main( int argc, char *argv[], char *env[] ) {

        for( int i = 0; env[i] != NULL; ++i ) {
            printf( "%3d: %s\n", i, env[i] );
        }

        return( 0 );
    }

Control Structures: Decision
    if( conditional ) {
        then-clause
    }

    if( conditional ) {
        then-clause
    } else {
        else-clause
    }

    switch( expression ) {
        case c1: ...
        case c2: ...
        ...
        default: ...
    }

Control Structures: Looping
    for( init ; test ; inc ) {
        body
    }

    init
    while( test ) {
        body
        inc
    }

    init
    do {
        body
        inc
    } while( test );
  (do/while evaluates test after the body, so the body always runs at least once)

Conditional Expressions
• Not the same as Booleans in other languages
• Evaluate to integer result
  • 0 → false
  • Everything else → true
• Equivalent code:
    int x, a;

    x = f();

    if( x != 0 ) {
        a = 5;
    } else {
        a = 17;
    }

    int x, a;

    x = f();

    if( x ) {
        a = 5;
    } else {
        a = 17;
    }

Conditional Operators
• Relational operators: < <= == != >= >
• Produce integer results
  • 0 if condition is false, else 1
    int x, a;

    x = f();

    if( x != 0 ) {
        a = 5;
    } else {
        a = 17;
    }

    int x, a, c;

    x = f();
    c = x != 0;

    if( c ) {
        a = 5;
    } else {
        a = 17;
    }

Conditional Operators
• Equivalent (in terms of operation):
    int x, a;
    x = f();
    if( x != 0 ) {
        a = 5;
    } else {
        a = 17;
    }

    int x, a;
    x = f();
    if( x ) {
        a = 5;
    } else {
        a = 17;
    }

    int x, a, c;
    x = f();
    c = x != 0;
    if( c ) {
        a = 5;
    } else {
        a = 17;
    }

    int a;
    if( f() ) {
        a = 5;
    } else {
        a = 17;
    }

Boolean Connective Operators
• Operands are integer expressions
  • Originally, or coerced
  • 0 is false, non-0 is true
• Produce 0 or 1 result
• Forms:
  • Logical AND: &&
    • 1 iff both operands are non-zero
  • Logical OR: ||
    • 1 if either operand is non-zero
  • Logical NOT: !
    • 1 if operand is 0
    • 0 if operand is anything else
    a = 9;
    b = 12;
    c = 0;
    d = a && b;   /* 1 */
    e = a || b;   /* 1 */
    f = a && c;   /* 0 */
    g = a || c;   /* 1 */
    h = !a;       /* 0 */
    i = !c;       /* 1 */
    j = !!a;      /* 1 */

More Control Structures
    goto label;
• Unconditional transfer
  • Label is an identifier followed by :
  • Usable anywhere within a function
  • Rarely needed
• Equivalent loops:
    init
    while( test ) {
        body
        inc
    }

    init
    Enter:
        if( test == 0 ) goto Exit;
        body
        inc
        goto Enter;
    Exit:
• Dangerous! Why?

More Control Structures
    continue;
• Alter execution of loop body
  • Skips rest of body, but stays in loop
  • Usable with any type of loop
• Equivalent (?) loops:
    init
    while( test ) {
        . . .
        if( cond ) continue;
        . . .
    }

    init
    while( test ) {
        . . .
        if( cond ) goto End;
        . . .
    End: ;    /* a label must be followed by a statement, even an empty one */
    }

Control Structures
• Consider these loops:
    for( init ; test ; inc ) {
        body
    }

    init
    while( test ) {
        body
        inc
    }
• Assume init, test, inc, and body are identical
• Are they equivalent?

More Control Structures
    break;
• Immediate exit from control structure
  • Usable in a loop or a switch statement
  • Transfers to first statement after the body
• Equivalent loops:
    init
    while( test ) {
        . . .
        if( cond ) break;
        . . .
    }

    init
    while( test ) {
        . . .
        if( cond ) goto Exit;
        . . .
    }
    Exit:

The C Preprocessor (CPP)
• Filter run before the compiler sees the code
  • Reads source code as input (from stdin or a specified file)
  • Applies filtering steps as requested
  • Writes resulting code (to stdout or a specified file)
• Can be run as a standalone filter on non-C code

CPP Directives
• Format:
    #directive [ operands ]
• CPP processes any line beginning with # character
  • Early C: must be first character on the line
  • ANSI C: can have leading whitespace
• Directive name
  • May be separated from # by whitespace
• Types of directives:
  • File inclusion
  • Macro definition
  • Conditional compilation
  • Other miscellaneous

File Inclusion
• CPP “splices in” the file’s contents at this point
• Two forms:
    #include "path"
  • Searches working directory, then system directories
    #include <path>
  • Only searches system directories
• Can extend list of system directories with -I
  • Compiler (or cpp) -Idirname option
  • Prepends dirname to list of system directories

Common Header Files (1/2)
<stdlib.h>
  • General-purpose standard C library prototypes
<unistd.h>
  • System call prototypes, standard symbolic constants
<sys/types.h>
  • System type definitions
<math.h>
  • Math library constants and prototypes

Common Header Files (2/2)
<stdint.h>
  • Standard-width integer types
<stdbool.h>
  • More conventional Boolean type & constant definitions
<stdio.h>
  • Standard I/O package (later)
<string.h>
  • C string processing functions (later)

Macros
• CPP contains a table of macros: symbol, value
• Scans input looking for occurrences of symbol
  • Replaces with defined value
• Several predefined
  • Some by standard, some by individual implementations
• Standard predefined symbols:
Symbol      Value
__FILE__    name of file being processed
__LINE__    line number of current line in __FILE__
__DATE__    date the file was processed
__TIME__    time the file was processed
__STDC__    1 if compiler conforms to ANSI C; else, not defined

Macro Definition
• Several ways to define macros:
  • CPP directive
      #define name
      #define name value
  • Command-line option
      -Dname
      -Dname=value
• name is a standard symbolic name
  • Naming convention: use only uppercase alphabetics
  • Optionally, can follow name with parameter list (more later)
• value is a text string, which can be anything
• Cannot redefine an existing macro with a different value (unless it is first removed with #undef)

Macro Expansion Issues
• What value does ‘x’ get?
    #define AAA 5 + 5
    #define BBB AAA * AAA
    ...
    x = BBB;        →  x = 5 + 5 * 5 + 5        (35, not 100)
• Issue: direct textual substitution
  • Occurs before the compiler sees the expanded text
• Solution: parenthesize the macro value
    #define AAA (5 + 5)
    #define BBB AAA * AAA
    ...
    x = BBB;        →  x = (5 + 5) * (5 + 5)    (100)

Standard I/O Library
• Part of the C library
• Built on top of OS I/O routines
  • Logical vs. physical I/O
• Buffered within user space
  • In addition to any OS I/O buffering at system level
• To use, must include <stdio.h>
  • Defines types, prototypes
  • Two important constants: NULL, EOF

Standard I/O Library
• I/O connections called streams
• Two types: binary and text
• Binary streams are byte-oriented
  • Raw bytes of data
  • No interpretation done by library
• Text streams are character-oriented
  • Stream assumed to contain character data

Text Streams
• Text files have structure
  • Zero or more lines
  • Lines contain zero or more characters
• Lines terminated with EOLN sequence
  • UNIX/Linux®: newline
  • Windows: carriage return and newline
  • Older MacOS: carriage return
• User program only sees newline at EOLN
  • I/O library translates native sequence to/from UNIX sequence
  • In theory, at least….
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.

Reading and Writing Text Streams
• Three kinds of routines
  • Character-by-character, unformatted line-by-line, formatted
• Character routine - input:
    int getchar( void );
    int getc( FILE *stream );
    int fgetc( FILE *stream );
  • Return: next character as an integer, or EOF if no more data
• Output:
    int putchar( int ch );
    int putc( int ch, FILE *stream );
    int fputc( int ch, FILE *stream );
  • Low-order eight bits of ch are written
  • Return: value written, or EOF on error

Unformatted Line I/O
• Work with NUL-terminated strings
• Input:
    char *gets( char *buf );
    char *fgets( char *buf, int n, FILE *stream );
  • Note: gets() strips EOLN sequence, fgets() leaves it alone
  • gets() cannot limit how much it reads (no buffer overrun protection) and was removed from the language in C11; use fgets()
  • Return: buf, or 0 (a null pointer) on EOF/error
• Output:
    int puts( const char *buf );
    int fputs( const char *buf, FILE *stream );
  • Note: puts() adds EOLN sequence, fputs() doesn’t
  • Return: a non-negative value on success, or EOF on error

Formatted Output
• printf() family of routines
    int printf( const char *fmt, ... );
    int fprintf( FILE *stream, const char *fmt, ... );
    int sprintf( char *buf, const char *fmt, ... );
• All traverse fmt string:
  • Ordinary characters are printed literally
  • Format control sequences print next argument according to code
• Return: number of bytes transmitted (for sprintf(), not counting the trailing NUL), or a negative value on error

Output Format Codes
• Literal characters printed as-is
  • Exception - escape sequences: \n \t \b \r etc.
• Conversion code syntax: %[f][w][.p]c
  • f = flags, w = minimum field width, p = precision, c = conversion character
• Common conversion characters:
Code             Argument type
d, i             int
u, o, x, X       unsigned int
e, E, f, g, G    double
s                string
%                single % character

Checking for Errors
• After a failed system call, errno contains a result code
  • Global integer variable (declared in <errno.h>)
  • Only meaningful after a failure; successful calls need not reset it
• Can print interpretation of errno contents
    void perror( const char *message );
• Example:
    FILE *fp;
    char buf[512];
    . . .
    if( fgets(buf,512,fp) == NULL ) {
        perror( "fgets() on fp" );
    }
• Output:
    fgets() on fp: Bad file descriptor

Checking for Errors
• Related global variables:
    const char *sys_errlist[];
    int sys_nerr;
    int errno;
  • sys_errlist and sys_nerr are non-standard; prefer strerror() in new code
• Related functions:
    char *strerror( int errnum );