7/29/2019 Linux DM Lab Manual
1/68
Contents

S.No  Topic
1.  List of Linux Programs
2.  List of Data Mining Programs
3.  Week 1 (Programs 1-4)
4.  Week 2 (Programs 5-7)
5.  Week 3 (Programs 8-10)
6.  Week 4 (Programs 11-12)
7.  Week 5 (Programs 13-15)
8.  Week 6 (Programs 16-18)
9.  Week 7 (Programs 19-22)
10. Week 8 (Programs 23-24)
11. Week 9 (Programs 25-26)
12. Week 10 (Programs 27-28)
13. Listing the categorical attributes and the real-valued attributes separately
14. Rules for identifying attributes
15. Training a decision tree
16. Test on classification of a decision tree
17. Testing on the training set
18. Using cross-validation for training
19. Significance of attributes in a decision tree
20. Trying generation of decision trees with various numbers of nodes
21. Differences in results using a decision tree and cross-validation on a data set
22. Decision trees
23. Reduced-error pruning for training decision trees using cross-validation
24. Converting a decision tree into "if-then-else" rules
List of Linux Programs
1. Write a shell script that accepts a file name and starting and ending line numbers as arguments, and displays all the lines between the given line numbers.
2. Write a shell script that deletes all lines containing a specified word in one or more files supplied as arguments to it.
3. Write a shell script that displays a list of all the files in the current directory to which the user has read, write and execute permissions.
4. Write a shell script that receives any number of file names as arguments, checks whether every argument supplied is a file or a directory, and reports accordingly. Whenever the argument is a file, the number of lines in it is also reported.
5. Write a shell script that accepts a list of file names as its arguments, and counts and reports the occurrence of each word that is present in the first argument file in the other argument files.
6. Write a shell script to list all of the directory files in a directory.
7. Write a shell script to find the factorial of a given integer.
8. Write an awk script to count the number of lines in a file that do not contain vowels.
9. Write an awk script to find the number of characters, words and lines in a file.
10. Write a C program that makes a copy of a file using standard I/O and system calls.
11. Implement in C the following UNIX commands using system calls: A. cat B. ls C. mv
12. Write a program that takes one or more file/directory names as command-line input and reports the following information on each file: A. file type, B. number of links, C. time of last access, D. read, write and execute permissions.
13. Write a C program to emulate the UNIX ls -l command.
14. Write a C program to list, for every file in a directory, its inode number and file name.
15. Write a C program that demonstrates redirection of standard output to a file. Ex: ls > f1.
16. Write a C program to create a child process and allow the parent to display "parent" and the child to display "child" on the screen.
17. Write a C program to create a zombie process.
18. Write a C program that illustrates how an orphan is created.
19. Write a C program that illustrates how to execute two commands concurrently with a command pipe. Ex: ls -l | sort
20. Write C programs that illustrate communication between two unrelated processes using a named pipe.
21. Write a C program to create a message queue with read and write permissions and write 3 messages to it with different priority numbers.
22. Write a C program that receives the messages (from the message queue specified in (21)) and displays them.
23. Write a C program to allow cooperating processes to lock a resource for exclusive use, using a) semaphores, b) the flock or lockf system calls.
24. Write a C program that illustrates suspending and resuming processes using signals.
25. Write a C program that implements a producer-consumer system with two processes (using semaphores).
26. Write client and server programs (in C) for interaction between server and client processes using Unix domain sockets.
27. Write client and server programs (in C) for interaction between server and client processes using Internet domain sockets.
28. Write a C program that illustrates two processes communicating using shared memory.
List of Data Mining Programs
S.No. Task Description
1. List all the categorical (or nominal) attributes and the real-valued attributes separately.
2. What attributes do you think might be crucial in making the credit assessment? Come up with some simple rules in plain English using your selected attributes.
3. One type of model that you can create is a decision tree. Train a decision tree using the complete dataset as the training data. Report the model obtained after training.
4. Suppose you use your above model trained on the complete dataset, and classify credit good/bad for each of the examples in the dataset. What % of examples can you classify correctly? (This is also called testing on the training set.) Why do you think you cannot get 100% training accuracy?
5. Is testing on the training set as you did above a good idea? Why or why not?
6. One approach for solving the problem encountered in the previous question is using cross-validation. Describe briefly what cross-validation is. Train a decision tree again using cross-validation and report your results. Does your accuracy increase/decrease? Why? (10 marks)
7. Check to see if the data shows a bias against "foreign workers" (attribute 20) or "personal-status" (attribute 9). One way to do this (perhaps rather simple-minded) is to remove these attributes from the dataset and see if the decision tree created in those cases is significantly different from the full-dataset case which you have already done. To remove an attribute you can use the Preprocess tab in Weka's GUI Explorer. Did removing these attributes have any significant effect? Discuss.
8. Another question might be, do you really need to input so many attributes to get good results? Maybe only a few would do. For example, you could try just having attributes 2, 3, 5, 7, 10, 17 (and 21, the class attribute, naturally). Try out some combinations. (You had removed two attributes in problem 7. Remember to reload the arff data file to get all the attributes initially before you start selecting the ones you want.)
9. Sometimes, the cost of rejecting an applicant who actually has good credit (case 1) might be higher than accepting an applicant who has bad credit (case 2). Instead of counting the misclassifications equally in both cases, give a higher cost to the first case (say cost 5) and a lower cost to the second case. You can do this by using a cost matrix in Weka. Train your decision tree again and report the decision tree and cross-validation results. Are they significantly different from the results obtained in problem 6 (using equal cost)?
10. Do you think it is a good idea to prefer simple decision trees instead of long, complex decision trees? How does the complexity of a decision tree relate to the bias of the model?
11. You can make your decision trees simpler by pruning the nodes. One approach is to use reduced-error pruning. Explain this idea briefly. Try reduced-error pruning for training your decision trees using cross-validation (you can do this in Weka) and report the decision tree you obtain. Also report your accuracy using the pruned model. Does your accuracy increase?
12. (Extra Credit): How can you convert a decision tree into "if-then-else" rules? Make up your own small decision tree consisting of 2-3 levels and convert it into a set of rules. There also exist different classifiers that output the model in the form of rules; one such classifier in Weka is rules.PART. Train this model and report the set of rules obtained. Sometimes just one attribute can be good enough in making the decision, yes, just one! Can you predict what attribute that might be in this dataset? The OneR classifier uses a single attribute to make decisions (it chooses the attribute based on minimum error). Report the rule obtained by training a OneR classifier.
Week 1
1. Write a shell script that accepts a file name, starting and ending line numbers as
arguments and displays all the lines between the given line numbers.
Aim: To write a shell script that accepts a file name, starting and ending line numbers as
arguments and displays all the lines between the given line numbers.
Script:
if [ $# -ne 3 ]
then
echo "Error : Invalid number of arguments."
exit
fi
if [ $2 -gt $3 ]
then
echo "Error : Invalid range value."
exit
fi
l=`expr $3 - $2 + 1`
cat $1 | tail +$2 | head -$l
Output:
$sh 11b.sh test 5 7
abc 1234
def 5678
ghi 91011
Description :
head command : This command displays lines from the beginning of one or more
files. By default it displays the first 10 lines of a file.
head [ -count ] filename
tail command : This command displays the last few lines of a file. By default it
displays the last 10 lines of a file.
tail [ +/-start ] filename
start is a starting line number.
tail -5 filename : displays the last 5 lines of the file.
tail +5 filename : displays all the lines, beginning from line number 5 to the end of the file.
2. Write a shell script that deletes all lines containing a specified word in one or more
files supplied as arguments to it.
Aim: To write a shell script that deletes all lines containing a specified word in one or
more files supplied as arguments to it.
Script:
clear
if [ $# -eq 0 ]
then
echo no arguments passed
exit
fi
echo the contents before deleting
for i in $*
do
echo $i
cat $i
done
echo enter the word to be deleted
read word
for i in $*
do
grep -vi "$word" $i > temp
mv temp $i
echo after deleting
cat $i
done
Output:
$ sh 8b.sh test1
the contents before deleting
test1
hello
hello
bangalore
mysore city
enter the word to be deleted
city
after deleting
hello
hello
Bangalore
$ sh 8b.sh
no arguments passed
3. Write a shell script that displays a list of all the files in the current directory to
which the user has read, write and execute permissions.
Aim: To write a shell script that displays a list of all the files in the current directory to
which the user has read, write and execute permissions.
Script:
echo "enter the directory name"
read dir
if [ -d $dir ]
then
cd $dir
ls > f
exec < f
while read line
do
if [ -f $line ]
then
if [ -r $line -a -w $line -a -x $line ]
then
echo "$line has all permissions"
else
echo "$line does not have all permissions"
fi
fi
done
fi
4. Write a shell script that receives any number of file names as arguments, checks whether
every argument supplied is a file or a directory, and reports accordingly. Whenever the
argument is a file, the number of lines in it is also reported.
Aim: To write a shell script that receives any number of file names as arguments, checks
whether every argument supplied is a file or a directory, and reports accordingly.
Script:
for x in $*
do
if [ -f $x ]
then
echo " $x is a file "
echo " no of lines in the file are "
wc -l $x
elif [ -d $x ]
then
echo " $x is a directory "
else
echo " enter valid filename or directory name "
fi
done
Week 2
5. Write a shell script that accepts a list of file names as its arguments, counts and
reports the occurrence of each word that is present in the first argument file on
other argument files.
Aim : To write a shell script that accepts a list of file names as its arguments, counts
and reports the occurrence of each word that is present in the first argument file on
other argument files.
Script:
if [ $# -ne 2 ]
then
echo "Error : Invalid number of arguments."
exit
fi
str=`cat $1 | tr '\n' ' '`
for a in $str
do
echo "Word = $a, Count = `grep -c "$a" $2`"
done
Output :
$ cat test
hello AEC
$ cat test1
hello AEC
hello AEC
hello
$ sh 1.sh test test1
Word = hello, Count = 3
Word = AEC, Count = 2
6. Write a shell script to list all of the directory files in a directory.
Script:
# !/bin/bash
echo "enter directory name"
read dir
if [ -d $dir ]
then
echo "list of files in the directory"
ls $dir
else
echo "enter proper directory name"
fi
Output:
Enter directory name
AEC
List of files in the directory
CSE.txt
ECE.txt
7. Write a shell script to find factorial of a given integer.
Script:
# !/bin/bash
echo "enter a number"
read num
n=$num
fact=1
while [ $num -ge 1 ]
do
fact=`echo $fact \* $num | bc`
let num--
done
echo "factorial of $n is $fact"
Output:
Enter a number
5
Factorial of 5 is 120
Week 3
8. Write an awk script to count the number of lines in a file that do not contain
vowels.
9. Write an awk script to find the number of characters, words and lines in a file.
10. Write a C program that makes a copy of a file using standard I/O and system calls.
Aim : To write an awk script to find the number of characters, words and lines in a file.
Script:
BEGIN{ print "record \t characters \t words" }
#BODY section
{
len=length($0)
total_len+=len
print(NR,":\t",len,":\t",NF,$0)
words+=NF
}
END{
print("\n total")
print("characters :\t" total_len)
print("words :\t" words)
print("lines :\t" NR)
}
Week 4
11. Implement in C the following UNIX commands using System calls
A. cat B. ls C. mv
12. Write a program that takes one or more file/directory names as command-line input and reports the following information on the file.
A. File type. B. Number of links.
C. Time of last access. D. Read, Write and Execute permissions.
AIM: Implement in C the cat Unix command using system calls
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#define BUFSIZE 1
int main(int argc, char **argv)
{
    int fd1;
    int n;
    char buf;
    fd1 = open(argv[1], O_RDONLY);
    printf("Welcome to AEC\n");
    while ((n = read(fd1, &buf, BUFSIZE)) > 0)
    {
        printf("%c", buf); /* or write(1, &buf, 1); */
    }
    return (0);
}
AIM: Implement in C the following ls Unix command using system calls
Algorithm:
1. Start.
2. open directory using opendir( ) system call.
3. read the directory using readdir( ) system call.
4. print d->d_name and d->d_ino.
5. repeat above step until end of directory.
6. End
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/param.h>
#include <dirent.h>
#define FALSE 0
#define TRUE 1
char pathname[MAXPATHLEN];
int file_select(const struct dirent *entry);
main()
{
    int count, i;
    struct dirent **files;
    if (getwd(pathname) == NULL)
    {
        printf("Error getting path\n");
        exit(0);
    }
    printf("Current Working Directory = %s\n", pathname);
    count = scandir(pathname, &files, file_select, alphasort);
    if (count <= 0)
    {
        printf("No files in this directory\n");
        exit(0);
    }
    printf("Number of files = %d\n", count);
    for (i = 0; i < count; ++i)
        printf("%s\n", files[i]->d_name);
}
int file_select(const struct dirent *entry)
{
    if ((strcmp(entry->d_name, ".") == 0) || (strcmp(entry->d_name, "..") == 0))
        return (FALSE);
    else
        return (TRUE);
}
AIM: Implement in C the Unix command mv using system calls
Algorithm:
1. Start.
2. Open the existing file, and create a new file, using the open() system call.
3. Read the contents from the existing file using the read() system call.
4. Write these contents into the new file using the write() system call.
5. Repeat the above two steps until end of file.
6. Close both files using the close() system call.
7. Delete the existing file using the unlink() system call.
8. End.
Program:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
int main(int argc, char **argv)
{
    int fd1, fd2;
    int n;
    char buf[1024];
    fd1 = open(argv[1], O_RDONLY);
    fd2 = creat(argv[2], S_IRUSR | S_IWUSR);
    while ((n = read(fd1, buf, sizeof(buf))) > 0)
        write(fd2, buf, n);
    close(fd1);
    close(fd2);
    unlink(argv[1]);
    printf("file is moved\n");
    return (0);
}
Week 5
13. Write a C program to emulate the UNIX ls -l command.
ALGORITHM :
Step 1: Include the header files necessary for manipulating directories.
Step 2: Declare and initialize the required objects.
Step 3: Read the directory name from the user.
Step 4: Open the directory using the opendir() system call and report an error if the directory is not
available.
Step 5: Read an entry from the directory.
Step 6: Display the directory entry, i.e., the name of the file or subdirectory.
Step 7: Repeat steps 5 and 6 until all the entries have been read.
/* 1. Simulation of ls command */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <dirent.h>
main()
{
    char dirname[10];
    DIR *p;
    struct dirent *d;
    printf("Enter directory name ");
    scanf("%s", dirname);
    p = opendir(dirname);
    if (p == NULL)
    {
        perror("Cannot find dir.");
        exit(-1);
    }
    while ((d = readdir(p)) != NULL)
        printf("%s\n", d->d_name);
}
SAMPLE OUTPUT:
enter directory name iii
...
f2
14. Write a C program to list, for every file in a directory, its inode number and file
name.
15. Write a C program that demonstrates redirection of standard output to a file.
Ex: ls > f1.
Description:
An inode number points to an inode. An inode is a data structure that stores
the following information about a file:
Size of file
Device ID
User ID of the file
Group ID of the file
The file mode information and access privileges for owner, group and others
File protection flags
The timestamps for file creation, modification etc.
Link counter to determine the number of hard links
Pointers to the blocks storing the file's contents
Week 6
16. Write a C program to create a child process and allow the parent to display
parent and the child to display child on the screen.
#include <stdio.h>
#include <unistd.h>
main()
{
    int childpid;
    if ((childpid = fork()) > 0)
    {
        printf("parent process\n");
    }
    else
        printf("child process\n");
}
17. Write a C program to create a Zombie process.
If the child terminates before the parent process and the parent has not yet called
wait(), the terminated child is called a zombie process.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
main()
{
    int childpid;
    if ((childpid = fork()) == 0)
    {
        printf("child process\n");
        exit(0);
    }
    else
    {
        sleep(100); /* the child has exited but has not been waited for: it is a zombie */
        printf("parent process\n");
    }
}
18. Write a C program that illustrates how an orphan is created.
#include <stdio.h>
#include <unistd.h>
main()
{
    int id;
    printf("Before fork()\n");
    id = fork();
    if (id == 0)
    {
        printf("Child has started: %d\n", getpid());
        printf("Parent of this child : %d\n", getppid());
        printf("child prints 1 item :\n");
        sleep(25);
        printf("child prints 2 item :\n");
    }
    else
    {
        printf("Parent has started: %d\n", getpid());
        printf("Parent of the parent proc : %d\n", getppid());
    }
    printf("After fork()\n");
}
Week 7
19. Write a C program that illustrates how to execute two commands concurrently with
a command pipe.
Ex: ls -l | sort
AIM: Implementing Pipes
DESCRIPTION:
A pipe is created by calling a pipe() function.
int pipe(int filedesc[2]);
It returns a pair of file descriptors filedesc[0] is open for reading and filedesc[1] is
open for writing. This function returns a 0 if ok & -1 on error.
ALGORITHM:
The following is the simple algorithm for creating, writing to and reading from a
pipe.
1) Create a pipe through a pipe() function call.
2) Use write() function to write the data into the pipe. The syntax is as follows
write(int [],ip_string,size);
int [] filedescriptor variable, in this case if int filedesc[2] is the variable, then
use the filedesc[1] as the first parameter.
ip_string The string to be written in the pipe.
Size buffer size for storing the input
3) Use read() function to read the data that has been written to the pipe.
The syntax is as follows
read(int [], char,size);
PROGRAM:
#include <stdio.h>
#include <string.h>
main()
{
    int pipe1[2], pipe2[2], childpid;
    if (pipe(pipe1) < 0 || pipe(pipe2) < 0)
        printf("pipe creation error");
    if ((childpid = fork()) == -1)
    {
        printf("cannot fork");
    }
    else if (childpid > 0)
    {
        close(pipe1[0]);
        close(pipe2[1]);
        client(pipe2[0], pipe1[1]);
        while (wait((int *)0) != childpid);
        close(pipe1[1]);
        close(pipe2[0]);
        exit(0);
    }
    else
    {
        close(pipe1[1]);
        close(pipe2[0]);
        server(pipe1[0], pipe2[1]);
        close(pipe1[0]);
        close(pipe2[1]);
        exit(0);
    }
}
client(int readfd, int writefd)
{
    int n;
    char buff[1024];
    if (fgets(buff, 1024, stdin) == NULL)
        printf("file name read error");
    n = strlen(buff);
    if (buff[n - 1] == '\n')
        n--;
    if (write(writefd, buff, n) != n)
        printf("file name write error");
    while ((n = read(readfd, buff, 1024)) > 0)
        if (write(1, buff, n) != n)
            printf("data write error");
    if (n < 0)
        printf("data read error");
}
server(int readfd, int writefd)
{
    int n, fd;
    char buff[1024];
    n = read(readfd, buff, 1024); /* read the file name from the client */
    buff[n] = '\0';
    if ((fd = open(buff, 0)) < 0)
        printf("file open error");
    else
        while ((n = read(fd, buff, 1024)) > 0)
            write(writefd, buff, n);
}
20. Write C programs that illustrate communication between two unrelated processes
using named pipe.
AIM: Implementing IPC using a FIFO (or) named pipe.
DESCRIPTION:
Another kind of IPC is the FIFO (First In First Out), sometimes also called a
named pipe. It is like a pipe, except that it has a name. Here the name is that of a file
that multiple processes can open(), read and write to. A FIFO is created using the
mknod() system call. The syntax is as follows:
int mknod(char *pathname, int mode, int dev);
The pathname is a normal Unix pathname, and this is the name of the FIFO.
The mode argument specifies the file access mode. The dev value is ignored for
a FIFO.
Once a FIFO is created, it must be opened for reading (or) writing using either the
open system call, or one of the standard I/O open functions - fopen, or freopen.
ALGORITHM:
The following is the simple algorithm for creating, writing to and reading from
a
FIFO.
1) Create a fifo through mknod() function call.
2) Use write() function to write the data into the fifo. The syntax is as follows
write(int [],ip_string,size);
int [] filedescriptor variable, in this case if int filedesc[2] is the variable, then
use the filedesc[1] as the first parameter.
ip_string The string to be written in the fifo.
Size buffer size for storing the input
3) Use read() function to read the data that has been written to the fifo.
The syntax is as follows
read(int [], char,size);
PROGRAM:
#define FIFO1 "Fifo1"
#define FIFO2 "Fifo2"
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
main()
{
    int childpid, wfd, rfd;
    mknod(FIFO1, 0666 | S_IFIFO, 0);
    mknod(FIFO2, 0666 | S_IFIFO, 0);
    if ((childpid = fork()) == -1)
    {
        printf("cannot fork");
    }
    else if (childpid > 0)
    {
        wfd = open(FIFO1, 1);
        rfd = open(FIFO2, 0);
        client(rfd, wfd);
        while (wait((int *)0) != childpid);
        close(rfd);
        close(wfd);
        unlink(FIFO1);
        unlink(FIFO2);
    }
    else
    {
        rfd = open(FIFO1, 0);
        wfd = open(FIFO2, 1);
        server(rfd, wfd);
        close(rfd);
        close(wfd);
    }
}
client(int readfd, int writefd)
{
    int n;
    char buff[1024];
    printf("enter a file name ");
    if (fgets(buff, 1024, stdin) == NULL)
        printf("file name read error");
    n = strlen(buff);
    if (buff[n - 1] == '\n')
        n--;
    if (write(writefd, buff, n) != n)
        printf("file name write error");
    while ((n = read(readfd, buff, 1024)) > 0)
        if (write(1, buff, n) != n)
            printf("data write error");
    if (n < 0)
        printf("data read error");
}
server(int readfd, int writefd)
{
    int n, fd;
    char buff[1024];
    n = read(readfd, buff, 1024); /* file name sent by the client */
    buff[n] = '\0';
    if ((fd = open(buff, 0)) < 0)
        printf("file open error");
    else
        while ((n = read(fd, buff, 1024)) > 0)
            write(writefd, buff, n);
}
21. Write a C program to create a message queue with read and write permissions to
write 3 messages to it with different priority numbers.
22. Write a C program that receives the messages (from the above message queue as
specified in (21)) and displays them.
Aim: To create a message queue
DESCRIPTION:
Message passing between processes is part of the operating system and is done through a
message queue, where messages are stored in the kernel and associated with a message queue
identifier (msqid). Processes read and write messages to an arbitrary queue in such a way
that a process writes a message to a queue and exits, and another process reads it at a later time.
ALGORITHM:
Before defining a structure, the ipc_perm structure should be defined, which is done by
including the following files:
#include <sys/types.h>
#include <sys/ipc.h>
A structure of information is maintained by the kernel; it should contain the following:
struct msqid_ds {
    struct ipc_perm msg_perm; /* operation permission */
    struct msg *msg_first;    /* ptr to first msg on queue */
    struct msg *msg_last;     /* ptr to last msg on queue */
    ushort msg_cbytes;        /* current bytes on queue */
    ushort msg_qnum;          /* current no of msgs on queue */
    ushort msg_qbytes;        /* max no of bytes on queue */
    ushort msg_lspid;         /* pid of last msgsnd */
    ushort msg_lrpid;         /* pid of last msgrcv */
    time_t msg_stime;         /* time of last msg snd */
    time_t msg_rtime;         /* time of last msg rcv */
    time_t msg_ctime;         /* time of last msg ctl */
};
To create a new message queue, or to access an existing one, the msgget() function is used.
Syntax:
int msgget(key_t key, int msgflag);
Msg flag values:
Num val   Symb value   Description
0400      MSG_R        Read by owner
0200      MSG_W        Write by owner
0040      MSG_R>>3     Read by group
0020      MSG_W>>3     Write by group
msgget returns the msqid, or -1 on error.
1. To put a message on the queue the msgsnd() function is used.
Syntax: int msgsnd(int msqid, struct msgbuf *ptr, int length, int flag);
msqid is the message queue id, a unique id.
msgbuf is the actual content to send, a pointer to a structure which contains the following:
struct msgbuf
{
    long mtype;    /* message type > 0 */
    char mtext[1]; /* data */
};
length is the size of the message in bytes.
flag can be IPC_NOWAIT, which allows the system call to return immediately when there
is no room on the queue; when this is specified, msgsnd returns -1 if there is no room on
the queue. Otherwise the flag can be specified as 0.
2. To receive a message the msgrcv() function is used.
Syntax:
int msgrcv(int msqid, struct msgbuf *ptr, int length, long msgtype, int flag);
*ptr is a pointer to the structure where the received message is to be stored.
length is the size to be received and stored in the pointer area.
flag can be MSG_NOERROR: without it, an error is returned if length is not large enough
to receive the message; with it, a data portion greater than length is truncated and returned.
3. A variety of control operations on the message queue can be done through the msgctl() function:
int msgctl(int msqid, int cmd, struct msqid_ds *buff);
IPC_RMID in cmd is given to remove a message queue from the system.
Let us create a header file msgq.h with the following in it:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <errno.h>
extern int errno;
#define MKEY1 1234L
#define MKEY2 2345L
#define PERMS 0666
Server operation algorithm:
#include "msgq.h"
main()
{
    int readid, writeid;
    if ((readid = msgget(MKEY1, PERMS | IPC_CREAT)) < 0)
        printf("server: cannot get message queue 1");
    if ((writeid = msgget(MKEY2, PERMS | IPC_CREAT)) < 0)
        printf("server: cannot get message queue 2");
}
Week 8
23. Write a C program to allow cooperating processes to lock a resource for exclusive use,
using a) semaphores, b) the flock or lockf system calls.
24. Write a C program that illustrates suspending and resuming processes using
signals.
23. a) AIM: C program that illustrate file locking using semaphores
PROGRAM:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};
int main(void)
{
    key_t key;
    int semid;
    union semun arg;
    if ((key = ftok("semdemo.c", 'j')) == -1)
    {
        perror("ftok");
        exit(1);
    }
    if ((semid = semget(key, 1, 0666 | IPC_CREAT)) == -1)
    {
        perror("semget");
        exit(1);
    }
    arg.val = 1; /* initialise the semaphore to 1: resource free */
    if (semctl(semid, 0, SETVAL, arg) == -1)
    {
        perror("semctl");
        exit(1);
    }
    return 0;
}
OUTPUT:
(no output on success; on failure the corresponding perror message,
e.g. "semget: ..." or "semctl: ...", is printed)
Week 9
25. Write a C program that implements a producer-consumer system with two
processes. (using Semaphores).
26. Write client and server programs (in C) for interaction between server and client
processes using Unix domain sockets.
Algorithm:
1. Start
2. create semaphore using semget( ) system call
3. if successful it returns positive value
4. create two new processes
5. first process will produce
6. until first process produces second process cannot consume
7. End.
Source code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#define NUM_LOOPS 2
int main(int argc, char *argv[])
{
    int sem_set_id;
    int child_pid, i;
    struct sembuf sem_op;
    sem_set_id = semget(IPC_PRIVATE, 2, 0600);
    if (sem_set_id == -1)
    {
        perror("main: semget");
        exit(1);
    }
    printf("semaphore set created, semaphore set id %d\n", sem_set_id);
    child_pid = fork();
    switch (child_pid)
    {
    case -1:
        perror("fork");
        exit(1);
    case 0: /* child: consumer */
        for (i = 0; i < NUM_LOOPS; i++)
        {
            sem_op.sem_num = 0;
            sem_op.sem_op = -1; /* wait until the producer raises the semaphore */
            sem_op.sem_flg = 0;
            semop(sem_set_id, &sem_op, 1);
            printf("consumer: %d\n", i);
        }
        break;
    default: /* parent: producer */
        for (i = 0; i < NUM_LOOPS; i++)
        {
            printf("producer: %d\n", i);
            sem_op.sem_num = 0;
            sem_op.sem_op = 1; /* signal the consumer */
            sem_op.sem_flg = 0;
            semop(sem_set_id, &sem_op, 1);
            sleep(1);
        }
    }
    return 0;
}
Week 10
27. Write client and server programs (in C) for interaction between server and client
processes using Internet domain sockets.
28. Write a C program that illustrates two processes communicating using shared
memory.
DESCRIPTION:
Shared memory is an efficient means of passing data between programs. One
program will create a memory portion which other processes (if permitted) can access.
The problem with the pipes, FIFOs and message queues is that for two processes to
exchange information, the information has to go through the kernel. Shared memory provides
a way around this by letting two or more processes share a memory segment.
In shared memory concept if one process is reading into some shared memory, for
example, other processes must wait for the read to finish before processing the data.
A process creates a shared memory segment using shmget(). The original owner of a
shared memory segment can assign ownership to another user with shmctl(). It can also
revoke this assignment. Other processes with proper permission can perform various control
functions on the shared memory segment using shmctl(). Once created, a shared segment can
be attached to a process address space using shmat(). It can be detached using shmdt() (see
shmop()). The attaching process must have the appropriate permissions for shmat(). Once
attached, the process can read or write to the segment, as allowed by the permission requested
in the attach operation. A shared segment can be attached multiple times by the same process.
A shared memory segment is described by a control structure with a unique ID that points to
an area of physical memory. The identifier of the segment is called the shmid. The structure
definition for the shared memory segment control structures and prototypes can be found in
<sys/shm.h>.
shmget() is used to obtain access to a shared memory segment. It is prototyped by:
int shmget(key_t key, size_t size, int shmflg);
The key argument is an access value associated with the shared memory segment ID. The size
argument is the size in bytes of the requested shared memory. The shmflg argument specifies
the initial access permissions and creation control flags.
When the call succeeds, it returns the shared memory segment ID. This call is also used to get
the ID of an existing shared segment (from a process requesting sharing of some existing
memory portion).
The following code illustrates shmget():
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
...
key_t key;    /* key to be passed to shmget() */
int shmflg;   /* shmflg to be passed to shmget() */
int shmid;    /* return value from shmget() */
int size;     /* size to be passed to shmget() */
...
key = ...
size = ...
shmflg = ...
if ((shmid = shmget(key, size, shmflg)) == -1) {
    perror("shmget: shmget failed");
    exit(1);
} else {
    (void) fprintf(stderr, "shmget: shmget returned %d\n", shmid);
    exit(0);
}
...
Controlling a Shared Memory Segment
shmctl() is used to alter the permissions and other characteristics of a shared memory
segment. It is prototyped as follows:
int shmctl(int shmid, int cmd, struct shmid_ds *buf);
The process must have an effective user ID of owner, creator or superuser to perform this
command. The cmd argument is one of the following control commands:
SHM_LOCK
-- Lock the specified shared memory segment in memory. The
process must have the effective ID of superuser to perform this
command.
SHM_UNLOCK
-- Unlock the shared memory segment. The process must have the
effective ID of superuser to perform this command.
IPC_STAT
-- Return the status information contained in the control structure
and place it in the buffer pointed to by buf. The process must have
read permission on the segment to perform this command.
IPC_SET
-- Set the effective user and group identification and access
permissions. The process must have an effective ID of owner,
creator or superuser to perform this command.
IPC_RMID
-- Remove the shared memory segment.
The buf argument is a structure of type struct shmid_ds, which is defined in
<sys/shm.h>.
The following code illustrates shmctl():
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
...
int cmd;      /* command code for shmctl() */
int shmid;    /* segment ID */
int rtrn;     /* return value from shmctl() */
struct shmid_ds shmid_ds;  /* shared memory data structure to hold results */
...
shmid = ...
cmd = ...
if ((rtrn = shmctl(shmid, cmd, &shmid_ds)) == -1) {
    perror("shmctl: shmctl failed");
    exit(1);
}
...
Attaching and Detaching a Shared Memory Segment
shmat() and shmdt() are used to attach and detach shared memory segments. They are
prototyped as follows:
void *shmat(int shmid, const void *shmaddr, int shmflg);
int shmdt(const void *shmaddr);
shmat() returns a pointer, shmaddr, to the head of the shared segment associated with a valid
shmid. shmdt() detaches the shared memory segment located at the address indicated by
shmaddr. The following code illustrates calls to shmat() and shmdt():
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#define MAXnap 4       /* maximum number of concurrent attaches */
static struct state {  /* Internal record of attached segments. */
    int shmid;         /* shmid of attached segment */
    char *shmaddr;     /* attach point */
    int shmflg;        /* flags used on attach */
} ap[MAXnap];          /* State of currently attached segments. */
int nap;               /* Number of currently attached segments. */
...
char *addr;            /* address work variable */
register int i;        /* work area */
register struct state *p;  /* ptr to current state entry */
...
p = &ap[nap++];
p->shmid = ...
p->shmaddr = ...
p->shmflg = ...
p->shmaddr = shmat(p->shmid, p->shmaddr, p->shmflg);
if (p->shmaddr == (char *)-1) {
    perror("shmop: shmat failed");
    nap--;
} else
    (void) fprintf(stderr, "shmop: shmat returned %#8.8x\n",
        p->shmaddr);
...
i = shmdt(addr);
if (i == -1) {
    perror("shmop: shmdt failed");
} else {
    (void) fprintf(stderr, "shmop: shmdt returned %d\n", i);
    for (p = ap, i = nap; i--; p++)
        if (p->shmaddr == addr) *p = ap[--nap];
}
...
Algorithm:
1. Start.
2. Create shared memory using the shmget() system call.
3. If successful, it returns a non-negative segment ID.
4. Attach the created shared memory using the shmat() system call.
5. Write to the shared memory through the attached address.
6. Read the contents back from the shared memory through the attached address.
7. Detach the segment with shmdt() and End.
Source Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#define SHM_SIZE 1024

int main(int argc, char *argv[])
{
    key_t key;
    int shmid;
    char *data;

    if (argc > 2) {
        fprintf(stderr, "usage: shmdemo [data_to_write]\n");
        exit(1);
    }
    /* derive a key; the path given to ftok() is arbitrary here */
    if ((key = ftok(".", 'R')) == -1) {
        perror("ftok");
        exit(1);
    }
    if ((shmid = shmget(key, SHM_SIZE, 0644 | IPC_CREAT)) == -1) {
        perror("shmget");
        exit(1);
    }
    /* attach to the segment to get a pointer to it */
    data = shmat(shmid, (void *)0, 0);
    if (data == (char *)(-1)) {
        perror("shmat");
        exit(1);
    }
    /* run once with an argument to write, then without to read it back */
    if (argc == 2) {
        printf("writing to segment: \"%s\"\n", argv[1]);
        strncpy(data, argv[1], SHM_SIZE);
    } else
        printf("segment contains: \"%s\"\n", data);
    if (shmdt(data) == -1) {
        perror("shmdt");
        exit(1);
    }
    return 0;
}
Input: # ./a.out swarupa
Output:
writing to segment: "swarupa"
Data Mining Lab
Credit Risk Assessment
Description: The business of banks is making loans. Assessing the creditworthiness
of an applicant is of crucial importance. You have to develop a system to help a loan
officer decide whether the credit of a customer is good or bad. A bank's business
rules regarding loans must consider two opposing factors. On the one hand, a bank
wants to make as many loans as possible. Interest on these loans is the bank's profit
source. On the other hand, a bank cannot afford to make too many bad loans. Too
many bad loans could lead to the collapse of the bank. The bank's loan policy must
involve a compromise: not too strict, and not too lenient.
To do the assignment, you first and foremost need some knowledge about the world
of credit. You can acquire such knowledge in a number of ways.
1. Knowledge Engineering. Find a loan officer who is willing to talk. Interview her and
try to represent her knowledge in the form of production rules.
2. Books. Find some training manuals for loan officers or perhaps a suitable textbook on
finance. Translate this knowledge from text form to production rule form.
3. Common sense. Imagine yourself as a loan officer and make up reasonable rules
which can be used to judge the creditworthiness of a loan applicant.
4. Case histories. Find records of actual cases where competent loan officers correctly
judged when, and when not, to approve a loan application.
The German Credit Data:
Actual historical credit data is not always easy to come by because of confidentiality
rules. Here is one such dataset: the (original) Excel spreadsheet version of the German credit
data (download from the web).
In spite of the fact that the data is German, you should probably make use of it for this
assignment (unless you really can consult a real loan officer!).
A few notes on the German dataset:
DM stands for Deutsche Mark, the unit of currency, worth about 90 cents
Canadian (but looks and acts like a quarter).
Owns_telephone. German phone rates are much higher than in Canada, so fewer
people own telephones.
Foreign_worker. There are millions of these in Germany (many from Turkey). It
is very hard to get German citizenship if you were not born of German parents.
There are 20 attributes used in judging a loan applicant. The goal is to classify
the applicant into one of two categories, good or bad.
Subtasks : (Turn in your answers to the following tasks)
Laboratory Manual For Data Mining
EXPERIMENT-1
Aim: To list all the categorical (or nominal) attributes and the real-valued attributes using the Weka
mining tool.
Tools/ Apparatus: Weka mining tool.
Procedure:
1) Open the Weka GUI Chooser.
2) Select EXPLORER present in Applications.
3) Select Preprocess Tab.
4) Go to OPEN file and browse the file that is already stored in the system bank.csv.
5) Clicking on any attribute in the left panel will show the basic statistics on that selected attribute.
Sample Output:
EXPERIMENT-2
Aim: To identify the rules with some of the important attributes (a) manually and (b) using Weka.
Tools/ Apparatus: Weka mining tool.
Theory:
Association rule mining is defined as follows: Let I = {i1, i2, ..., in} be a set of n binary
attributes called items. Let D be a set of transactions called the database. Each transaction
in D has a unique transaction ID and contains a subset of the items in I. A rule is defined as
an implication of the form X => Y where X, Y ⊆ I and X ∩ Y = ∅. The sets of items (for short,
itemsets) X and Y are called the antecedent (left-hand side or LHS) and the consequent
(right-hand side or RHS) of the rule respectively.
To illustrate the concepts, we use a small example from the supermarket domain.
The set of items is I = {milk, bread, butter, beer} and a small database containing the items
(1 codes presence and 0 absence of an item in a transaction) is shown in the table to the right.
An example rule for the supermarket could be {milk, bread} => {butter}, meaning that if milk
and bread are bought, customers also buy butter.
Note: this example is extremely small. In practical applications, a rule needs a support of
several hundred transactions before it can be considered statistically significant, and datasets
often contain thousands or millions of transactions.
To select interesting rules from the set of all possible rules, constraints on various measures of
significance and interest can be used. The best-known constraints are minimum thresholds on
support and confidence. The support supp(X) of an itemset X is defined as the proportion of
transactions in the data set which contain the itemset. In the example database, the itemset
{milk, bread} has a support of 2 / 5 = 0.4 since it occurs in 40% of all transactions (2 out of 5
transactions).
The confidence of a rule is defined as conf(X => Y) = supp(X ∪ Y) / supp(X). For example,
the rule {milk, bread} => {butter} has a confidence of 0.2 / 0.4 = 0.5 in the database, which
means that for 50% of the transactions containing milk and bread the rule is correct.
Confidence can be interpreted as an estimate of the probability P(Y|X), the probability of
finding the RHS of the rule in transactions under the condition that these transactions also
contain the LHS.
ALGORITHM:
Association rule mining is to find out association rules that satisfy the predefined minimum
support and confidence from a given database. The problem is usually decomposed into two
subproblems. One is to find those itemsets whose occurrences exceed a predefined threshold in
the database; those itemsets are called frequent or large itemsets. The second problem is to
generate association rules from those large itemsets with the constraint of minimal confidence.
Suppose one of the large itemsets is Lk = {I1, I2, ..., Ik}. Association rules with this itemset
are generated in the following way: the first rule is {I1, I2, ..., Ik-1} => {Ik}; by checking the
confidence this rule can be determined as interesting or not. Then other rules are generated by
deleting the last item in the antecedent and inserting it into the consequent; the confidences of
the new rules are then checked to determine their interestingness. This process iterates until
the antecedent becomes empty. Since the second subproblem is quite straightforward, most of
the research focuses on the first subproblem. The Apriori algorithm finds the frequent itemsets
Lk in database D.
Find frequent itemset Lk-1.
Join Step.
o Ck is generated by joining Lk-1 with itself.
Prune Step.
o Any (k-1)-itemset that is not frequent cannot be a subset of a
frequent k-itemset, hence should be removed.
Where (Ck: candidate itemset of size k)
(Lk: frequent itemset of size k)
Apriori Pseudocode
Apriori(T, ε)
    L1 <- {large 1-itemsets}
    k <- 2
    while Lk-1 is not empty
        Ck <- candidates generated from Lk-1
        for each transaction t in database T do
            increment the count of all candidates in Ck that are contained in t
        Lk <- candidates in Ck with support >= ε
        k <- k + 1
    return the union of all Lk
EXPERIMENT-3
Aim: To create a decision tree for the given data set using the J48 (C4.5) algorithm in the Weka
mining tool.
Tools/ Apparatus: Weka mining tool.
Procedure:
1) Open the Weka GUI Chooser.
2) Select EXPLORER present in Applications.
3) Select Preprocess Tab.
4) Go to OPEN file and browse the file that is already stored in the system bank.csv.
5) Go to Classify tab.
6) Here the C4.5 algorithm has been chosen (implemented as J48 in Weka); it can be selected by
clicking the Choose button.
7) Select trees > J48.
8) Select Test options: Use training set.
9) If needed, select attributes.
10) Click Start.
11) Now we can see the output details in the Classifier output.
12) Right-click on the result list and select the visualize tree option.
Sample output:
The decision tree constructed by using the implemented C4.5 algorithm.
http://en.wikipedia.org/wiki/C4.5_algorithm
EXPERIMENT-4
Aim: To find the percentage of examples that are classified correctly by using the above created
decision tree model, i.e., testing on the training set.
Tools/ Apparatus: Weka mining tool.
Theory:
A naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class is
unrelated to the presence (or absence) of any other feature. For example, a fruit may be considered to
be an apple if it is red, round, and about 4" in diameter. Even though these features depend on the
existence of the other features, a naive Bayes classifier considers all of these properties to
independently contribute to the probability that this fruit is an apple.
An advantage of the naive Bayes classifier is that it requires a small amount of training data to
estimate the parameters (means and variances of the variables) necessary for classification. Because
independent variables are assumed, only the variances of the variables for each class need to be
determined and not the entire covariance matrix.
The naive Bayes probabilistic model:
The probability model for a classifier is a conditional model
P(C | F1, ..., Fn)
over a dependent class variable C with a small number of outcomes or classes, conditional on
several feature variables F1 through Fn. The problem is that if the number of features n is large
or when a feature can take on a large number of values, then basing such a model on probability
tables is infeasible. We therefore reformulate the model to make it more tractable.
Using Bayes' theorem, we write
P(C | F1, ..., Fn) = [ p(C) p(F1, ..., Fn | C) ] / p(F1, ..., Fn)
In plain English the above equation can be written as
posterior = (prior × likelihood) / evidence
In practice we are only interested in the numerator of that fraction, since the denominator does not
depend on C and the values of the features Fi are given, so that the denominator is effectively
constant. The numerator is equivalent to the joint probability model p(C, F1, ..., Fn), which can be
rewritten as follows, using repeated applications of the definition of conditional probability:
p(C, F1, ..., Fn) = p(C) p(F1, ..., Fn | C)
= p(C) p(F1 | C) p(F2, ..., Fn | C, F1)
= p(C) p(F1 | C) p(F2 | C, F1) p(F3, ..., Fn | C, F1, F2)
= p(C) p(F1 | C) p(F2 | C, F1) p(F3 | C, F1, F2) ... p(Fn | C, F1, ..., Fn-1)
Now the "naive" conditional independence assumptions come into play: assume that each feature Fi
is conditionally independent of every other feature Fj for j ≠ i.
This means that p(Fi | C, Fj) = p(Fi | C),
and so the joint model can be expressed as
p(C, F1, ..., Fn) = p(C) p(F1 | C) p(F2 | C) ... p(Fn | C) = p(C) ∏i p(Fi | C)
This means that under the above independence assumptions, the conditional distribution over the
class variable C can be expressed like this:
p(C | F1, ..., Fn) = (1/Z) p(C) ∏i p(Fi | C)
where Z is a scaling factor dependent only on F1, ..., Fn, i.e., a constant if the values of the feature
variables are known.
Models of this form are much more manageable, since they factor into a so-called class prior p(C)
and independent probability distributions p(Fi | C). If there are k classes and if a model for each
p(Fi | C = c) can be expressed in terms of r parameters, then the corresponding naive Bayes model
has (k - 1) + n r k parameters. In practice, often k = 2 (binary classification) and r = 1 (Bernoulli
variables as features) are common, and so the total number of parameters of the naive Bayes model
is 2n + 1, where n is the number of binary features used for prediction.
Bayes' theorem: P(h | D) = [ P(D | h) P(h) ] / P(D)
P(h): prior probability of hypothesis h
P(D): prior probability of training data D
P(h | D): probability of h given D
P(D | h): probability of D given h
EXPERIMENT-5
Aim: To check whether testing on a separate test set is a good idea.
Tools/ Apparatus: Weka mining tool.
Procedure:
1) In Test options, select the Supplied test set radio button.
2) Click Set.
3) Choose the file which contains records that were not in the training set we used to create the
model.
4) Click Start. (WEKA will run this test data set through the model we already created.)
5) Compare the output results with that of the 4th experiment.
Sample output:
This can be experienced through the different problem solutions while doing practice.
The important numbers to focus on here are the numbers next to the "Correctly Classified Instances"
(92.3 percent) and the "Incorrectly Classified Instances" (7.6 percent). Other important numbers are
in the "ROC Area" column, in the first row (the 0.936). Finally, the "Confusion Matrix" shows the
number of false positives and false negatives. The false positives are 29, and the false negatives are
17 in this matrix.
Based on our accuracy rate of 92.3 percent, we say that upon initial analysis, this is a good model.
One final step in validating our classification tree is to run our test set through the model and
ensure that the accuracy of the model does not drop.
Comparing the "Correctly Classified Instances" from this test set with the "Correctly Classified
Instances" from the training set, we see the accuracy of the model, which indicates that the model
will not break down with unknown data, or when future data is applied to it.
EXPERIMENT-6
Aim: To create a Decision tree by cross validation training data set using Weka mining tool.
Tools/ Apparatus: Weka mining tool.
Theory:
Decision tree learning, used in data mining and machine learning, uses a decision tree as a predictive
model which maps observations about an item to conclusions about the item's target value. In these
tree structures, leaves represent classifications and branches represent conjunctions of features that
lead to those classifications. In decision analysis, a decision tree can be used to visually and explicitly
represent decisions and decision making. In data mining, a decision tree describes data but not
decisions; rather, the resulting classification tree can be an input for decision making. This page deals
with decision trees in data mining.
Decision tree learning is a common method used in data mining. The goal is to create a model that
predicts the value of a target variable based on several input variables. Each interior node corresponds
to one of the input variables; there are edges to children for each of the possible values of that input
variable. Each leaf represents a value of the target variable given the values of the input variables
represented by the path from the root to the leaf.
A tree can be "learned" by splitting the source set into subsets based on an attribute value test. This
process is repeated on each derived subset in a recursive manner called recursive partitioning. The
recursion is completed when the subset at a node all has the same value of the target variable, or when
splitting no longer adds value to the predictions.
In data mining, trees can be described also as the combination of mathematical and computational
techniques to aid the description, categorisation and generalization of a given set of data.
Data comes in records of the form:
(x, y) = (x1, x2, x3..., xk, y)
The dependent variable, Y, is the target variable that we are trying to understand, classify or
generalise. The vector x is composed of the input variables, x1, x2, x3 etc., that are used for that task.
Procedure:
1) Given the Bank database for mining.
2) Use the Weka GUI Chooser.
3) Select EXPLORER present in Applications.
4) Select Preprocess Tab.
5) Go to OPEN file and browse the file that is already stored in the system bank.csv.
6) Go to Classify tab.
7) Choose Classifier > trees.
8) Select J48.
9) Select Test options: Cross-validation.
10) Set Folds, e.g., 10.
11) If needed, select attributes.
12) Now click Start.
13) Now we can see the output details in the Classifier output.
14) Compare the output results with that of the 4th experiment.
15) Check whether the accuracy increased or decreased.
Sample output:
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances 539 89.8333 %
Incorrectly Classified Instances 61 10.1667 %
Kappa statistic 0.7942
Mean absolute error 0.167
Root mean squared error 0.305
Relative absolute error 33.6511 %
Root relative squared error 61.2344 %
Total Number of Instances 600
=== Detailed Accuracy By Class ===
EXPERIMENT-8
Aim: To select some attributes from the GUI Explorer, perform classification and see the effect using
Weka mining tool.
Tools/ Apparatus: Weka mining tool.
Procedure:
1) Given the Bank database for mining.
2) Use the Weka GUI Chooser.
3) Select EXPLORER present in Applications.
4) Select Preprocess Tab.
5) Go to OPEN file and browse the file that is already stored in the system bank.csv.
6) Select from the attributes list the attributes which are to be removed, and remove them. With this
step only the attributes necessary for classification are left in the attributes panel.
7) Then go to the Classify tab.
8) Choose Classifier > trees.
9) Select J48.
10) Select Test options: Use training set.
11) If needed, select attributes.
12) Now click Start.
13) Now we can see the output details in the Classifier output.
14) Right-click on the result list and select the visualize tree option.
15) Compare the output results with that of the 4th experiment.
16) Check whether the accuracy increased or decreased.
17) Check whether removing these attributes has any significant effect.
Sample output:
EXPERIMENT-9
Aim: To create a decision tree by cross-validating the training data set, after changing the cost matrix,
in the Weka mining tool.
Tools/ Apparatus: Weka mining tool.
Procedure:
1) Given the Bank database for mining.
2) Use the Weka GUI Chooser.
3) Select EXPLORER present in Applications.
4) Select Preprocess Tab.
5) Go to OPEN file and browse the file that is already stored in the system bank.csv.
6) Go to Classify tab.
7) Choose Classifier > trees.
8) Select J48.
9) Select Test options: Use training set.
10) Click on More options.
11) Select cost-sensitive evaluation and click on the Set button.
12) Set the matrix values and click on Resize. Then close the window.
13) Click OK.
14) Click Start.
15) We can see the output details in the Classifier output.
16) Select Test options: Cross-validation.
17) Set Folds, e.g., 10.
18) If needed, select attributes.
19) Now click Start.
20) Now we can see the output details in the Classifier output.
21) Compare the results of the 15th and 20th steps.
22) Compare the results with that of experiment 6.
Sample output:
Tools/ Apparatus: Weka mining tool.
Procedure:
This will be based on the attribute set, and the requirement of relationship among attribute we want to
study. This can be viewed based on the database and user requirement.
EXPERIMENT-11
Aim: To create a decision tree using the pruned mode and reduced-error pruning, and to show the
accuracy for the cross-validation trained data set, using the Weka mining tool.
Tools/ Apparatus: Weka mining tool.
Theory :
Reduced-error pruning
Each node of the (over-fit) tree is examined for pruning
A node is pruned (removed) only if the resulting pruned tree
performs no worse than the original over the validation set
Pruning a node consists of
Removing the sub-tree rooted at the pruned node
Making the pruned node a leaf node
Assigning the pruned node the most common classification of the training instances attached to that
node
Pruning nodes iteratively
Always select a node whose removal most increases the DT accuracy over the validation set
Stop when further pruning decreases the DT accuracy over the validation set
IF (children = yes) AND (income > 30000)
THEN (car = yes)
Procedure:
1) Given the Bank database for mining.
2) Use the Weka GUI Chooser.
3) Select EXPLORER present in Applications.
4) Select Preprocess Tab.
5) Go to OPEN file and browse the file that is already stored in the system bank.csv.
6) Select some of the attributes from the attributes list.
7) Go to the Classify tab.
8) Choose Classifier > trees.
9) Select NBTree, i.e., the Naive Bayesian tree.
10) Select Test options Use training set
11) Right-click on the text box beside the Choose button and select Show properties.
12) Now change the unpruned mode from false to true.
13) Change the reduced-error pruning percentage as needed.
14) If needed, select attributes.
15) Now click Start.
16) Now we can see the output details in the Classifier output.
17) Right-click on the result list and select the visualize tree option.
Sample output:
OneR
PART