Lempel-Ziv-Welch Compression

11
Lempel-Ziv-Welch Compression CS 1501

description

Lempel-Ziv-Welch Compression. CS 1501. Assignment Overview. Read command line options -c for compression / -d for decompression For compression –n / -r / -m Name of input file Name of output file lzwmod -c –r bogus.exe bogus.lzw lzwmod -d bogus.lzw bogus.orig - PowerPoint PPT Presentation

Transcript of Lempel-Ziv-Welch Compression

Page 1: Lempel-Ziv-Welch Compression

Lempel-Ziv-Welch Compression

CS 1501

Page 2: Lempel-Ziv-Welch Compression

2023.04.20 2/11

Assignment Overview

– Read command line options• -c for compression / -d for decompression• For compression –n / -r / -m• Name of input file• Name of output file• lzwmod -c –r bogus.exe bogus.lzw• lzwmod -d bogus.lzw bogus.orig

– Test compression algorithm performance and write a short report.

Page 3: Lempel-Ziv-Welch Compression

2023.04.20 3/11

Assignment Overview

• The original LZW implementation is written in C.– You MUST use UNIX account and GCC compiler for this assignment.– gcc source.c –o binary_file

• Thoroughly understand the implementation and modify the code to lzwmod.c:– Codesize vary from 9 bits to 16bits.– Change the table size to 99991 (65536/99991 = 0.655).– Verify the compression/decompression works correctly.

• Test against a large .exe file.• Check the byte counts of the .exe file.

– Codewords overflow handling:• Do nothing and use the existing codes.• Reset the distionary back to empty.• Monitoring compression ratio.

Page 4: Lempel-Ziv-Welch Compression

2023.04.20 4/11

LZW.c – main()

main(int argc, char *argv[]) {FILE *input_file;FILE *output_file;FILE *lzw_file;char input_file_name[81];

/* * The three buffers are needed for the compression phase. */code_value = (int*) malloc (TABLE_SIZE*sizeof(int));prefix_code = (unsigned int *) malloc (TABLE_SIZE*sizeof(unsigned int));append_character = (unsigned char *) malloc (TABLE_SIZE*sizeof(unsigned char));if (code_value==NULL || prefix_code==NULL || append_character==NULL) {

printf("Fatal error allocating table space!\n");exit(-1);

}

Page 5: Lempel-Ziv-Welch Compression

2023.04.20 5/11

LZW.c – main() cont’d

/* * Get the file name, open it up, and open up the lzw output file. */if (argc>1)

strcpy(input_file_name,argv[1]);else {

printf("Input file name? ");scanf("%s",input_file_name);

}input_file=fopen(input_file_name,"rb");lzw_file=fopen("test.lzw","wb");if (input_file==NULL || lzw_file==NULL) {

printf("Fatal error opening files.\n");exit(-1);

};

Page 6: Lempel-Ziv-Welch Compression

2023.04.20 6/11

LZW.c – main() cont’d

/* * Compress the file. */compress(input_file,lzw_file);fclose(input_file);fclose(lzw_file);free(code_value);……

/* * Expand the file. */expand(lzw_file,output_file);fclose(lzw_file);fclose(output_file);

free(prefix_code);free(append_character);

}

Page 7: Lempel-Ziv-Welch Compression

2023.04.20 7/11

LZW.c – find_match()

int find_match(int hash_prefix,unsigned int hash_character) {int index;int offset;

index = (hash_character << HASHING_SHIFT) ^ hash_prefix;if (index == 0)

offset = 1;else

offset = TABLE_SIZE - index;

while (1) {if (code_value[index] == -1)

return(index);if (prefix_code[index] == hash_prefix && append_character[index] == hash_character)

return(index);index -= offset;if (index < 0)

index += TABLE_SIZE;}

}

Page 8: Lempel-Ziv-Welch Compression

2023.04.20 8/11

LZW.c – compress()void compress(FILE *input,FILE *output) {

……next_code=256; /* Next code is the next available string code*/for (i=0;i<TABLE_SIZE;i++) /* Clear out the string table before starting */

code_value[i]=-1;

i=0;printf("Compressing...\n");string_code=getc(input); /* Get the first code */

while ((character=getc(input)) != (unsigned)EOF) {index=find_match(string_code,character); /* See if the string is in */if (code_value[index] != -1) /* the table. If it is, */

string_code=code_value[index]; /* get the code value. If */else { /* the string is not in the*/

if (next_code <= MAX_CODE) { /* table, try to add it. */code_value[index]=next_code++;prefix_code[index]=string_code;append_character[index]=character;

}output_code(output,string_code); /* When a string is found */string_code=character; /* that is not in the table*/

} /* I output the last string*/} /* after adding the new one*/

}

Page 9: Lempel-Ziv-Welch Compression

2023.04.20 9/11

Example

• Compress the string:– This is an apple.– Is this compressed correctly?

Page 10: Lempel-Ziv-Welch Compression

2023.04.20 10/11

Page 11: Lempel-Ziv-Welch Compression

2023.04.20 11/11