Python Programming - XII. File Processing

XII. FILE PROCESSING Engr. Ranel O. Padon


Page 1: Python Programming - XII. File Processing

XII. FILE PROCESSING Engr. Ranel O. Padon

Page 2: Python Programming - XII. File Processing

PYTHON PROGRAMMING TOPICS

I. Introduction to Python Programming

II. Python Basics

III. Controlling the Program Flow

IV. Program Components: Functions, Classes, Packages, and Modules

V. Sequences (Lists and Tuples), and Dictionaries

VI. Object-Based Programming: Classes and Objects

VII. Customizing Classes and Operator Overloading

VIII. Object-Oriented Programming: Inheritance and Polymorphism

IX. Randomization Algorithms

X. Exception Handling and Assertions

XI. String Manipulation and Regular Expressions

XII. File Handling and Processing

XIII. GUI Programming Using Tkinter

Page 3: Python Programming - XII. File Processing

File Processing

Data Hierarchy

File-Open Modes

Dissecting Files

The Power of Buffering

Page 4: Python Programming - XII. File Processing
Page 5: Python Programming - XII. File Processing

FILE HANDLING

Variables offer only temporary storage of data:

they are lost when they go out of scope or

when the program terminates.

Page 6: Python Programming - XII. File Processing

FILE HANDLING

files are used for long-term retention of

large amounts of data, even after the program

that created the data terminates.

data maintained in files is called persistent data

Page 7: Python Programming - XII. File Processing

FILE HANDLING | Data Hierarchy

Bit (“binary digit”) => the smallest computer data item

A bit is a digit that can assume one of two values: 0 or 1

Page 8: Python Programming - XII. File Processing

FILE HANDLING | Data Hierarchy

Programming with low-level bit formats is tedious and boring.

Use decimal digits, letters, and symbols instead.

Page 9: Python Programming - XII. File Processing

FILE HANDLING | Data Hierarchy

Characters are made up of digits, letters, and special symbols.

Characters are represented as combinations of bits (bytes).
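As a small illustration of how a character maps to a bit pattern, the sketch below uses the built-ins ord(), format(), and str.encode(). The helper name char_to_bits is made up for this example, and only ASCII characters are assumed.

```python
def char_to_bits(ch):
    """Return the 8-bit pattern of a single ASCII character as a string."""
    code_point = ord(ch)              # numeric code of the character
    return format(code_point, "08b")  # zero-padded binary, 8 digits wide


print(char_to_bits("A"))       # 01000001
print(list(b"Hi"))             # [72, 105]  (one byte per character)
print("Hi".encode("ascii"))    # b'Hi'
```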

Page 10: Python Programming - XII. File Processing

FILE HANDLING | Data Hierarchy

Page 11: Python Programming - XII. File Processing

FILE HANDLING | Data Hierarchy

Field (Column) is a collection of characters,

represented as words.

Record (Row) is a collection of related fields,

represented as a tuple, a dictionary, or an instance of a class.

File (Table) is a collection of records,

implemented as a sequential-access or random-access file.

Database (Folder) is a collection of files,

handled by DBMS software.
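The three ways of representing a record can be sketched as follows. The Student class, the field names, and the sample values are all hypothetical; the last lines show one possible way a list of records could become the lines of a text file.

```python
# One record, three possible representations (all names are illustrative).
record_as_tuple = ("Juan", "dela Cruz", 21)
record_as_dict = {"first_name": "Juan", "last_name": "dela Cruz", "age": 21}


class Student:
    """A record as an instance of a class: one attribute per field."""

    def __init__(self, first_name, last_name, age):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age


record_as_object = Student("Juan", "dela Cruz", 21)

# A file is a collection of records: here, one comma-separated line each.
records = [record_as_tuple, ("Maria", "Santos", 22)]
lines = [",".join(str(field) for field in record) for record in records]
print(lines)   # ['Juan,dela Cruz,21', 'Maria,Santos,22']
```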

Page 12: Python Programming - XII. File Processing

FILE HANDLING | Data Hierarchy

Page 13: Python Programming - XII. File Processing

FILE HANDLING | Data Hierarchy

Page 14: Python Programming - XII. File Processing

FILE HANDLING | open() & close()

magical_file = open("file_name.txt" [, "r"|"r+"|"w"|"w+"|"a"|"a+"] [, buffer_mode])

magical_file.close()
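A minimal open/close round trip, written for Python 3. The file lives in a temporary directory so the sketch can run anywhere; the path and the text written are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "file_name.txt")

magical_file = open(path, "w")     # "w" creates the file if it is missing
magical_file.write("hello\n")
magical_file.close()               # flushes pending data and releases the file

magical_file = open(path)          # no mode given, so "r" is used
content = magical_file.read()
magical_file.close()

print(repr(content))               # 'hello\n'
print(magical_file.closed)         # True
```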

Page 15: Python Programming - XII. File Processing

FILE HANDLING | Other Functions

Page 16: Python Programming - XII. File Processing

FILE HANDLING | open()

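Open Mode   Read   Write   Appends   Overwrites   Creates   Cursor @ Start   Cursor @ EOF

r           yes    no      no        no           no        yes              no

r+          yes    yes     no        no           no        yes              no

w           no     yes     no        yes          yes       yes              yes

w+          yes    yes     no        yes          yes       yes              yes

a           no     yes     yes       no           yes       no               yes

a+          yes    yes     yes       no           yes       no               yes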
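The behaviour of each open mode can be checked empirically. The sketch below is one way to do that in Python 3; the helper names probe and creates_missing_file are invented here, and the files are opened in binary mode (mode + "b") only so that tell() reports plain byte offsets.

```python
import os
import tempfile


def probe(mode):
    """Open an existing 5-byte file with `mode` and report what happens."""
    path = os.path.join(tempfile.mkdtemp(), "probe.txt")
    with open(path, "wb") as setup:
        setup.write(b"12345")
    with open(path, mode + "b") as f:
        return {
            "readable": f.readable(),
            "writable": f.writable(),
            "start_position": f.tell(),              # 0 = start, 5 = end of old data
            "size_after_open": os.path.getsize(path),  # 0 means truncated
        }


def creates_missing_file(mode):
    """Return True if opening a missing file with `mode` creates it."""
    path = os.path.join(tempfile.mkdtemp(), "missing.txt")
    try:
        open(path, mode + "b").close()
    except FileNotFoundError:
        return False
    return os.path.exists(path)


for mode in ("r", "r+", "w", "w+", "a", "a+"):
    print(mode, probe(mode), creates_missing_file(mode))
```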

Page 17: Python Programming - XII. File Processing

FILE HANDLING | Common Modes

Open Mode   Read   Write   Appends   Overwrites   Creates   Cursor @ Start   Cursor @ EOF

r           yes    no      no        no           no        yes              no

w           no     yes     no        yes          yes       yes              yes

Page 18: Python Programming - XII. File Processing

FILE HANDLING | open()

"r" is the default file-open mode

open("input.dat") is equivalent to open("input.dat", "r")

Page 19: Python Programming - XII. File Processing

FILE HANDLING | r

Page 20: Python Programming - XII. File Processing

FILE HANDLING | r

Page 21: Python Programming - XII. File Processing

FILE HANDLING | r
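A minimal sketch of reading in "r" mode with read(), readline(), and readlines(), written for Python 3. The three-line input.dat is a hypothetical file created here only so the example runs on its own.

```python
import os
import tempfile

# Create a small sample file (contents are made up for illustration).
path = os.path.join(tempfile.mkdtemp(), "input.dat")
setup = open(path, "w")
setup.write("first\nsecond\nthird\n")
setup.close()

reader = open(path, "r")
whole_text = reader.read()        # the entire file as one string
reader.seek(0)                    # move the cursor back to the start
first_line = reader.readline()    # one line, newline included
remaining = reader.readlines()    # list of the lines not yet read
reader.close()

print(repr(whole_text))   # 'first\nsecond\nthird\n'
print(repr(first_line))   # 'first\n'
print(remaining)          # ['second\n', 'third\n']
```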

Page 22: Python Programming - XII. File Processing

FILE HANDLING | w

try removing line #6

try removing "\n" in lines #3 and #4
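A minimal sketch of writing in "w" mode, written for Python 3. It is not the listing that the hints refer to, and the file name and text are made up. It shows two things worth trying: write() adds no line break on its own, and reopening a file in "w" mode discards its previous contents.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.txt")


def read_back(file_path):
    """Return the full contents of the file at file_path."""
    source = open(file_path)
    text = source.read()
    source.close()
    return text


writer = open(path, "w")       # creates the file, or empties an existing one
writer.write("line one\n")     # the "\n" is what ends the line
writer.write("line two\n")
writer.close()                 # flushes buffered text to disk
with_newlines = read_back(path)

writer = open(path, "w")       # old contents are discarded here
writer.write("line one")
writer.write("line two")
writer.close()
without_newlines = read_back(path)

print(repr(with_newlines))     # 'line one\nline two\n'
print(repr(without_newlines))  # 'line oneline two'
```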

Page 23: Python Programming - XII. File Processing

FILE HANDLING | w

Page 24: Python Programming - XII. File Processing

FILE HANDLING | with-as Keyword
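A minimal sketch of the with ... as statement in Python 3, assuming a throwaway file in a temporary directory. The file is closed automatically when the block ends, even if an exception is raised inside it.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w") as out_file:
    out_file.write("saved\n")
# No explicit close() is needed: leaving the block closes the file.
print(out_file.closed)    # True

try:
    with open(path) as in_file:
        text = in_file.read()
        raise ValueError("simulated failure while processing")
except ValueError:
    pass
print(in_file.closed)     # True, closed despite the exception
```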

Page 25: Python Programming - XII. File Processing

FILE HANDLING | Parsing

Paninda.txt

Page 26: Python Programming - XII. File Processing

FILE HANDLING | Parsing | split

Page 27: Python Programming - XII. File Processing

FILE HANDLING | Parsing | split
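A minimal sketch of parsing whitespace-separated fields with str.split(), written for Python 3. The contents of Paninda.txt are not given in the slides, so the sample file, its three-field layout (name, price, quantity), and its values are all assumptions.

```python
import os
import tempfile

# Hypothetical sample file: name, unit price, quantity on each line.
path = os.path.join(tempfile.mkdtemp(), "paninda_sample.txt")
with open(path, "w") as target:
    target.write("bigas 45.00 2\nasukal 60.50 1\n")

items = []
with open(path) as source:
    for line in source:
        # split() with no argument splits on any whitespace
        # and discards the trailing newline.
        name, price, quantity = line.split()
        items.append((name, float(price), int(quantity)))

total = sum(price * quantity for _, price, quantity in items)
print(items)   # [('bigas', 45.0, 2), ('asukal', 60.5, 1)]
print(total)   # 150.5
```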

Page 28: Python Programming - XII. File Processing

FILE HANDLING | Parsing | csv

Paranormal_Sightings.csv
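A minimal sketch using the standard csv module in Python 3. The column names and rows are invented, since the contents of Paranormal_Sightings.csv are not shown. One row contains a comma inside a field, which is the case where csv.reader is safer than a plain split(",").

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Paranormal_Sightings.csv")

# Write hypothetical sample data; newline="" is what the csv docs recommend.
with open(path, "w", newline="") as target:
    writer = csv.writer(target)
    writer.writerow(["creature", "location", "count"])
    writer.writerow(["kapre", "Balete Drive, Quezon City", 3])
    writer.writerow(["tikbalang", "Mt. Makiling", 1])

with open(path, newline="") as source:
    reader = csv.reader(source)
    header = next(reader)      # first row holds the column names
    rows = list(reader)        # every value comes back as a string

print(header)   # ['creature', 'location', 'count']
print(rows[0])  # ['kapre', 'Balete Drive, Quezon City', '3']
```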

Page 29: Python Programming - XII. File Processing

FILE HANDLING | Parsing | strip

Page 30: Python Programming - XII. File Processing

FILE HANDLING | Parsing | strip
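A minimal sketch of cleaning up fields with strip(), lstrip(), and rstrip(). The sample line is made up for illustration.

```python
raw_line = "   Mang Jose , 42 \n"

# strip() removes leading and trailing whitespace, including the newline.
fields = [field.strip() for field in raw_line.split(",")]
print(fields)                            # ['Mang Jose', '42']

# rstrip("\n") removes only a trailing newline and keeps other spaces.
print(repr(raw_line.rstrip("\n")))       # '   Mang Jose , 42 '

# Passing characters strips those instead of whitespace.
print("***hello***".strip("*"))          # hello
print(repr("  padded  ".lstrip()))       # 'padded  '
```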

Page 31: Python Programming - XII. File Processing

FILE HANDLING | Parsing & Classes

Page 32: Python Programming - XII. File Processing

FILE HANDLING | Parsing & Classes

Page 33: Python Programming - XII. File Processing

FILE HANDLING | Parsing & Classes
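One way to combine parsing with classes is to turn each line of a file into an object. The sketch below assumes Python 3; the Product class, the load_products helper, and the two-field "name, price" layout are hypothetical.

```python
import os
import tempfile


class Product:
    """One record: a name field and a price field."""

    def __init__(self, name, price):
        self.name = name
        self.price = price

    @classmethod
    def from_line(cls, line):
        """Build a Product from a line such as 'bigas, 45.00'."""
        name, price = line.strip().split(",")
        return cls(name.strip(), float(price))


def load_products(path):
    """Read one Product per non-blank line of the file."""
    with open(path) as source:
        return [Product.from_line(line) for line in source if line.strip()]


path = os.path.join(tempfile.mkdtemp(), "products.txt")
with open(path, "w") as target:
    target.write("bigas, 45.00\nasukal, 60.50\n\n")

products = load_products(path)
total = sum(product.price for product in products)
print([product.name for product in products])   # ['bigas', 'asukal']
print(total)                                    # 105.5
```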

Page 34: Python Programming - XII. File Processing

FILE HANDLING | Parsing & Classes 2

Page 35: Python Programming - XII. File Processing

FILE HANDLING | Parsing & Classes 2

Page 36: Python Programming - XII. File Processing

FILE HANDLING | Parsing & Classes 2

Page 37: Python Programming - XII. File Processing

FILE HANDLING | Parsing & Classes 2

Page 38: Python Programming - XII. File Processing

FILE HANDLING | HTML Parsing

MangJose.html

Page 39: Python Programming - XII. File Processing

FILE HANDLING | HTML Parsing

MangJose.html

Page 40: Python Programming - XII. File Processing

FILE HANDLING | HTML Parsing

Page 41: Python Programming - XII. File Processing

FILE HANDLING | HTML Parsing

Page 42: Python Programming - XII. File Processing

FILE HANDLING | HTML Parsing

Page 43: Python Programming - XII. File Processing

FILE HANDLING | HTML Parsing

Page 44: Python Programming - XII. File Processing

FILE HANDLING | HTML Parsing

Page 45: Python Programming - XII. File Processing

FILE HANDLING | HTML Parsing
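A minimal sketch of HTML parsing using html.parser from the Python 3 standard library. This is a swapped-in technique chosen for illustration, not necessarily the approach used in the slides. The HTML string stands in for MangJose.html, whose real contents are not shown, and LinkCollector is a made-up class name.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag and the text of every <h1>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.headings = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)
        elif tag == "h1":
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.headings.append(data)


# Hypothetical page content.
sample_html = (
    "<html><body><h1>Mang Jose</h1>"
    '<a href="menu.html">Menu</a><a href="about.html">About</a>'
    "</body></html>"
)

collector = LinkCollector()
collector.feed(sample_html)
print(collector.headings)   # ['Mang Jose']
print(collector.links)      # ['menu.html', 'about.html']
```

To parse a file instead of a string, the same collector could be fed the result of reading the file, for example collector.feed(open_file.read()).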

Page 46: Python Programming - XII. File Processing

FILE HANDLING | r+, w+, a+

All of the "plus" modes allow both reading and writing;

the main difference between them is where

we are positioned in the file when it is opened.

"r+" puts us at the beginning; existing contents are kept.

"w+" puts us at the beginning, which is also the end,

because the file is truncated on opening.

"a+" puts us at the end.
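The effect of the starting position can be seen by writing through each "plus" mode in turn. This is a sketch for Python 3 using a hypothetical five-character file; read_all is a helper defined only for this example.

```python
import os
import tempfile


def read_all(path):
    """Return the full contents of the file at path."""
    with open(path) as source:
        return source.read()


path = os.path.join(tempfile.mkdtemp(), "modes.txt")
with open(path, "w") as target:
    target.write("12345")

with open(path, "r+") as f:       # cursor starts at the beginning
    f.write("ab")                 # replaces the first two characters in place
after_r_plus = read_all(path)

with open(path, "a+") as f:       # cursor starts at the end
    f.write("cd")                 # added after the existing text
    f.seek(0)                     # go back to the start before reading
    after_a_plus = f.read()

with open(path, "w+") as f:       # file is emptied on opening
    size_after_open = os.path.getsize(path)
    f.write("new")
    f.seek(0)
    after_w_plus = f.read()

print(after_r_plus)      # ab345
print(after_a_plus)      # ab345cd
print(size_after_open)   # 0
print(after_w_plus)      # new
```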

Page 47: Python Programming - XII. File Processing

FILE HANDLING | w+

Page 48: Python Programming - XII. File Processing

FILE HANDLING | Buffering

Page 49: Python Programming - XII. File Processing

FILE HANDLING | Buffering

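-1 is the default file-open buffering mode

open("input.dat") is equivalent to open("input.dat", "r", -1)

Flag   Meaning

0      unbuffered (in Python 3, allowed only in binary mode)

1      line buffered (in Python 3, meaningful only in text mode)

n      buffered with a buffer of size n (n > 1)

-1     system default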
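A minimal sketch of passing each buffering flag to open() in Python 3. The sample file is hypothetical. The type names printed are those of the objects that Python 3 returns; the 4096-byte buffer size is an arbitrary choice.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "input.dat")
with open(path, "w") as target:
    target.write("one\ntwo\n")

with open(path, "rb", 0) as unbuffered:        # 0: no buffer (binary only)
    unbuffered_type = type(unbuffered).__name__

with open(path, "r", 1) as line_buffered:      # 1: line buffered (text)
    is_line_buffered = line_buffered.line_buffering

with open(path, "rb", 4096) as sized:          # n: buffer of about n bytes
    sized_type = type(sized).__name__

with open(path, "r", -1) as default:           # -1: system default
    default_line_buffered = default.line_buffering

print(unbuffered_type)         # FileIO
print(is_line_buffered)        # True
print(sized_type)              # BufferedReader
print(default_line_buffered)   # False
```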

Page 50: Python Programming - XII. File Processing

FILE HANDLING | Creating A Big File!

Page 51: Python Programming - XII. File Processing

FILE HANDLING | Unbuffered r

Then, let’s read that big file.

Page 52: Python Programming - XII. File Processing

FILE HANDLING | Buffered r

Now, with the help of buffering.
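A rough way to compare unbuffered and buffered reads is to time a loop that reads one byte at a time. This is a sketch for Python 3 under stated assumptions: the 200,000-byte file size and the count_bytes helper are arbitrary choices, and the timings depend on the machine, so no expected numbers are shown.

```python
import os
import tempfile
import time


def count_bytes(path, buffering):
    """Read the file one byte at a time; return (byte_count, seconds)."""
    start = time.perf_counter()
    count = 0
    with open(path, "rb", buffering) as f:
        while f.read(1):          # empty bytes object signals end of file
            count += 1
    return count, time.perf_counter() - start


# Create a moderately large sample file.
path = os.path.join(tempfile.mkdtemp(), "big_file.dat")
with open(path, "wb") as target:
    target.write(b"x" * 200000)

unbuffered_count, unbuffered_seconds = count_bytes(path, 0)
buffered_count, buffered_seconds = count_bytes(path, -1)

print("unbuffered:", unbuffered_count, "bytes in", unbuffered_seconds, "s")
print("buffered:  ", buffered_count, "bytes in", buffered_seconds, "s")
```

With buffering=0, every read(1) goes to the operating system; with the default buffer, most calls are served from memory.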

Page 53: Python Programming - XII. File Processing

FILE HANDLING | Buffered By Default

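In some other languages, buffering must be requested explicitly:

in Java, a FileInputStream is unbuffered until it is wrapped in a BufferedInputStream.

In C, stdio streams opened with fopen() are buffered by default, but the low-level read() and write() calls are not.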

Page 54: Python Programming - XII. File Processing

FILE HANDLING | What else?

1. Random-Access Files: for fast searching/editing of individual records

* use the shelve module, which stores records in a dictionary-like object

* shelve.open()

2. Serialization: converting objects into a byte stream that can be stored and restored later;

useful for transferring data (objects, sequences, etc.)

across a network connection or saving the state of a game

* use the pickle module (or cPickle in Python 2; Python 3 has only pickle)

* cPickle.dump(stringList_to_be_written, serialized_file)

* records = cPickle.load(serialized_file)
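A minimal sketch of both ideas in Python 3, where pickle replaces cPickle. The file names, keys, and record contents are made up for illustration.

```python
import os
import pickle
import shelve
import tempfile

folder = tempfile.mkdtemp()

# Serialization: dump a list to a binary file, then load it back.
records = ["Juan", "Maria", "Jose"]
pickle_path = os.path.join(folder, "records.pkl")
with open(pickle_path, "wb") as serialized_file:
    pickle.dump(records, serialized_file)
with open(pickle_path, "rb") as serialized_file:
    restored = pickle.load(serialized_file)

# Random access: a shelf behaves like a persistent dictionary with string keys.
shelf_path = os.path.join(folder, "accounts")
with shelve.open(shelf_path) as shelf:
    shelf["1001"] = {"name": "Juan", "balance": 250.0}
    shelf["1002"] = {"name": "Maria", "balance": 900.0}

with shelve.open(shelf_path) as shelf:
    maria = shelf["1002"]          # fetch one record directly by key

print(restored)           # ['Juan', 'Maria', 'Jose']
print(maria["balance"])   # 900.0
```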

Page 55: Python Programming - XII. File Processing
Page 56: Python Programming - XII. File Processing

PRACTICE EXERCISE| MORSE CODE

Page 57: Python Programming - XII. File Processing

PRACTICE EXERCISE| MC CHART

Page 58: Python Programming - XII. File Processing

PRACTICE EXERCISE| MC CHART

Page 59: Python Programming - XII. File Processing

PRACTICE EXERCISE| MORSE CODE

A. Read a file containing Filipino/English-language

phrases and encode them into Morse code.

B. Read a Morse code file and convert it into the

Filipino/English-language equivalent.

Use one blank between Morse-coded letters and three blanks between Morse-coded words.
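One possible starting point for the conversion step is sketched below in Python 3. It works on strings only, so reading and writing the files is still left to do. The function names encode and decode are made up, the chart covers only the letters A to Z, and characters outside the chart are silently dropped, which is an assumption rather than a requirement of the exercise.

```python
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".",
    "F": "..-.", "G": "--.", "H": "....", "I": "..", "J": ".---",
    "K": "-.-", "L": ".-..", "M": "--", "N": "-.", "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.", "S": "...", "T": "-",
    "U": "..-", "V": "...-", "W": ".--", "X": "-..-", "Y": "-.--",
    "Z": "--..",
}
LETTER = {code: letter for letter, code in MORSE.items()}


def encode(text):
    """One blank between letters, three blanks between words."""
    words = text.upper().split()
    return "   ".join(
        " ".join(MORSE[ch] for ch in word if ch in MORSE) for word in words
    )


def decode(code):
    """Inverse of encode(); the result is in upper case."""
    words = code.strip().split("   ")
    return " ".join(
        "".join(LETTER[symbol] for symbol in word.split()) for word in words
    )


print(encode("SOS"))                  # ... --- ...
print(encode("Hi po"))                # .... ..   .--. ---
print(decode(".... ..   .--. ---"))   # HI PO
```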

Page 60: Python Programming - XII. File Processing

REFERENCES

Deitel, Deitel, Liperi, and Wiedermann - Python: How to Program (2001).

Disclaimer: Most of the images/information used here have no proper source citation, and I do not claim ownership of these either. I don’t want to reinvent the wheel, and I just want to reuse and reintegrate materials that I think are useful or cool, then present them in another light, form, or perspective. Moreover, the images/information here are mainly used for illustration/educational purposes only, in the spirit of openness of data, spreading light, and empowering people with knowledge.