Lecture 4 - Overview
• Iterators
• Regular expressions
• Various things around Python
Iterators
• One of the most common tasks in a program is to repeat code.
• In many languages loops are done with numerical indices:
for i in range(len(a))
• Another solution is to use iterators.
• An iterator says: "I can go through all objects in the collection I'm associated with, one at a time."
• An iterator must have the methods __iter__() and next() (next() is named __next__() in Python 3).
A new range function

    class my_range:
        def __init__(self, last=10):
            self.last = last

        def __iter__(self):
            self.current_number = -1
            return self

        def next(self):
            self.current_number += 1
            if self.current_number == self.last:
                raise StopIteration
            return self.current_number

    for n in my_range(10):
        print n

    $ python my_range.py
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9
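For readers on Python 3, the same iterator could be sketched roughly as follows. This is an adaptation rather than the lecture's own code: Python 3 looks for __next__() instead of next(), and print is a function. The class name MyRange is chosen here only for illustration.

```python
class MyRange:
    """Iterate over 0, 1, ..., last - 1 using the Python 3 iterator protocol."""

    def __init__(self, last=10):
        self.last = last

    def __iter__(self):
        # Reset the position each time a new loop starts.
        self.current_number = -1
        return self

    def __next__(self):
        self.current_number += 1
        if self.current_number == self.last:
            raise StopIteration
        return self.current_number


numbers = list(MyRange(5))
print(numbers)  # [0, 1, 2, 3, 4]
```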
Another iterator
    # -*- coding: utf-8 -*-
    # iterator_01.py
    class ColorIterator:
        colors = ["Red", "Green", "Blue", "Yellow", "Black", "Brown"]

        def __iter__(self):
            self.current_color = -1
            return self

        def next(self):
            self.current_color += 1
            if self.current_color == len(self.__class__.colors):
                raise StopIteration
            return self.__class__.colors[self.current_color]

    if __name__ == "__main__":
        ci = ColorIterator()
        for color in ci:
            print color

    $> python iterator_01.py
    Red
    Green
    Blue
    Yellow
    Black
    Brown
Implementing using generators
• The yield statement stores a function's state.
• The next time a value is requested from the generator, execution continues after the yield statement, with the local variables kept.
    # -*- coding: utf-8 -*-
    # iterator_02.py
    def colors(available = ["Red", "Green", "Blue",
                            "Yellow", "Black", "Brown"]):
        for color in available:
            yield color

    for color in colors():
        print color

    $> python iterator_02.py
    Red
    Green
    Blue
    Yellow
    Black
    Brown
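To make the "state is kept" point concrete, here is a small sketch (Python 3 syntax) that pulls values from a generator by hand with the built-in next(). The function name countdown is hypothetical and not part of the lecture.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    current = start
    while current > 0:
        yield current   # execution pauses here and hands back a value
        current -= 1    # ...and resumes here when the next value is requested


gen = countdown(3)
first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes after the yield; 'current' was remembered
rest = list(gen)     # drains whatever is left

print(first, second, rest)  # 3 2 [1]
```

Once the generator is exhausted, a further next(gen) raises StopIteration, which is the same signal the hand-written iterator classes raise explicitly.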
Fibonacci-series with a generator iterator
• The Fibonacci series is a number series that begins with 0 and 1. Thereafter each number is the sum of the two preceding numbers: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144…
• The series occurs in many different contexts, for example in biology.
• The series is never-ending and is therefore well suited to being implemented with an iterator.
    # -*- coding: utf-8 -*-
    # iterator_03.py
    def fib(limit=10):
        x, y, count = 0, 1, 0
        while count < limit:
            yield x
            x, y = y, x + y
            count += 1

    if __name__ == "__main__":
        for num in fib(15):
            print num,
Create a list from an iterator
• If we want to save the elements generated by an iterator in a list, we can use functions from the module itertools.
    # -*- coding: utf-8 -*-
    # iterator_04.py
    def fib():
        x, y = 0, 1
        while True:
            yield x
            x, y = y, x + y

    if __name__ == "__main__":
        import itertools
        print list(itertools.islice(fib(), 10))

    $> python iterator_04.py
    [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
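A Python 3 sketch of the same idea, with one addition that is not in the lecture: itertools.takewhile, which cuts an endless iterator off by a condition on the values instead of by a count.

```python
import itertools


def fib():
    """Yield the Fibonacci series forever."""
    x, y = 0, 1
    while True:
        yield x
        x, y = y, x + y


# Take a fixed number of elements from the endless generator.
first_ten = list(itertools.islice(fib(), 10))

# Take elements for as long as a condition holds.
below_100 = list(itertools.takewhile(lambda n: n < 100, fib()))

print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(below_100)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

Calling list(fib()) directly would never finish, which is why some form of slicing or stop condition is needed.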
Parallel iterators
• To iterate over two collections in parallel, the function izip is used, also from itertools.
• The module itertools contains several functions for iterators; see the documentation for details.
    # -*- coding: utf-8 -*-
    # iterator_05.py
    import itertools

    a = ["a1", "a2", "a3", "a4"]
    b = ["b1", "b2", "b3"]

    if __name__ == "__main__":
        for x, y in itertools.izip(a, b):
            print x, y

    $> python iterator_05.py
    a1 b1
    a2 b2
    a3 b3
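In Python 3 there is no itertools.izip; the built-in zip is already lazy and plays the same role. The sketch below shows that, plus itertools.zip_longest for the case where the leftover element ("a4" above) should not be dropped. The fill value "-" is an arbitrary choice for illustration.

```python
import itertools

a = ["a1", "a2", "a3", "a4"]
b = ["b1", "b2", "b3"]

# zip stops at the end of the shortest input, like izip in the example above.
pairs = list(zip(a, b))

# zip_longest continues to the end of the longest input and pads the gaps.
padded = list(itertools.zip_longest(a, b, fillvalue="-"))

print(pairs)   # [('a1', 'b1'), ('a2', 'b2'), ('a3', 'b3')]
print(padded)  # [('a1', 'b1'), ('a2', 'b2'), ('a3', 'b3'), ('a4', '-')]
```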
Processing sequences
● Often you process sequences in similar ways. For some common operations you don't have to use for-loops:
● To select a subset of a sequence:
    filtered_sequence = filter(function, orig_seq)
● To apply any elementwise function:
    new_sequence = map(function, orig_seq)
Filter and lambda
● Filters are used to remove elements: short_sequence=filter(function, sequence)
● Lambda is used to write anonymous functions
    >>> g = lambda x: x*x
    >>> g(3)
    9
    >>>
    >>> nums = range(2, 50)
    >>> for i in range(2, 7):
    ...     nums = filter(lambda x: x == i or x % i, nums)
    ...
    >>> print nums # gives what?
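The session above is Python 2, where filter and map return lists. A hedged Python 3 sketch of the same operations follows; there, filter and map return lazy iterators, so each result is wrapped in list(). Without that, the lambda would only be evaluated after the loop has finished, when i already has its final value. The list comprehension at the end is an alternative spelling, not something the lecture uses.

```python
square = lambda x: x * x

nums = list(range(2, 50))
for i in range(2, 7):
    # Keep x if it equals i, or if x % i is non-zero (x is not a multiple of i).
    # list() forces evaluation now, while i still has its current value.
    nums = list(filter(lambda x: x == i or x % i, nums))

squares = list(map(square, [1, 2, 3, 4]))

# The same kind of selection written as a list comprehension.
evens = [x for x in range(10) if x % 2 == 0]

print(nums)
print(squares)  # [1, 4, 9, 16]
print(evens)    # [0, 2, 4, 6, 8]
```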
Regular expressions
● Used when searching or matching for string patterns.
● "You can think of regular expressions as wildcards on steroids." - http://www.regular-expressions.info/
● Using wildcard notation, you specify *.txt to find .txt-files. The regex equivalent is:
    .*\.txt$
● A regular expression string defines a set of normal strings: the strings that it matches.
Regular expressions
● Regular expressions aren't specific to Python but are used in many other languages as well.
● Often abbreviated 'regex'
● The module in Python is 're'
● A more complicated example:
    \b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b
  which matches most e-mail addresses.
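A minimal sketch of the e-mail pattern in use (Python 3 syntax). One assumption is made here: the final character class is written as [a-zA-Z] so that lower-case top-level domains also match without a case-insensitive flag. The addresses in the sample text are invented for illustration.

```python
import re

# E-mail pattern from the text, with the top-level-domain class widened
# to [a-zA-Z] (an assumption, see the note above).
EMAIL = re.compile(r'\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b')

text = "Contact anna@example.com or bob.smith@mail.example.org today."
found = EMAIL.findall(text)

print(found)  # ['anna@example.com', 'bob.smith@mail.example.org']
```

Because {2,4} limits the last part to four letters, addresses with longer top-level domains are not matched in full, which is one reason the pattern covers "most" rather than all addresses.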
Using regular expressions
1. Create a pattern.
2. Match the pattern against one or more texts.
3. Use the result.

    # -*- coding: utf-8 -*-

    # Import regex support, more about this later
    import re

    # 1. Create a pattern
    p = re.compile(r'[a-z]+')

    # 2. Match the pattern
    result = p.match('test word')

    # 3. Use the result
    if result:
        print "Match"
    else:
        print "No match"
Searching in the whole string
Create a pattern in the same way. Use the method search instead of match.
    # -*- coding: utf-8 -*-
    import re

    # Create a pattern
    p = re.compile(r'[a-z]+')

    # Test string
    s = '12345 testword 12345'

    # Use a match
    if p.match(s):
        print "Match"
    else:
        print "No match"

    # Use search instead
    if p.search(s):
        print "Found"
    else:
        print "Not found"
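The difference between match and search, and what "using the result" can look like, is sketched below in Python 3 syntax. match only tries at the start of the string, while search scans forward; a successful call returns a match object whose group() and span() methods give the matched text and its position.

```python
import re

p = re.compile(r'[a-z]+')
s = '12345 testword 12345'

at_start = p.match(s)    # None: the string starts with digits
anywhere = p.search(s)   # match object for the first run of lower-case letters

print(at_start)          # None
print(anywhere.group())  # testword
print(anywhere.span())   # (6, 14)
```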
Finding all occurrences
To find all substrings in the text that match the pattern, the method findall is used.

    >>> import re
    >>> p = re.compile(r'[a-z]+')
    >>> print p.findall("a text with several words")
    ['a', 'text', 'with', 'several', 'words']
Operators in regular expressions
    Character(s)   Matches
    .              Any character except newline
    ^              The start of the string
    $              The end of the string
    *              Zero or more repetitions of the preceding token
    +              One or more repetitions
    {m}            Exactly m repetitions
    {m,n}          At least m and at most n repetitions
    [...]          Any character inside the set
    A|B            A or B
More operators in regular expressions
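    \d   Digits between 0 and 9. Equivalent to [0-9]
    \D   Everything except digits. Equivalent to [^0-9]
    \s   Whitespace. Equivalent to [ \t\n\r\f\v]
    \S   Everything except whitespace
    \w   Alphanumeric characters and underscore. Equivalent to [a-zA-Z0-9_]
    \W   Everything except alphanumeric characters and underscore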
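A few of these operators in action, as a short Python 3 sketch. The input strings are made up for illustration.

```python
import re

digits = re.findall(r'\d+', 'room 12, floor 3')    # runs of digits
words = re.findall(r'\w+', 'hej, A_1!')            # runs of word characters
parts = re.split(r'\s+', 'a  b\tc')                # split on any whitespace
runs = re.findall(r'a{2,3}', 'a aa aaa aaaa')      # two or three a's in a row
either = re.findall(r'cat|dog', 'cat, dog, cow')   # alternation

print(digits)  # ['12', '3']
print(words)   # ['hej', 'A_1']
print(parts)   # ['a', 'b', 'c']
print(runs)    # ['aa', 'aaa', 'aaa']
print(either)  # ['cat', 'dog']
```

In the {2,3} case the lone 'a' is skipped, and 'aaaa' yields a single 'aaa' because the repetition takes as many characters as it is allowed to and one leftover 'a' is not enough for another match.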
Examples
    Pattern       At least one hit              No hit
    r'ain'        'main', 'rain', 'Maine'       'MAIN', 'ai'
    r'ab+'        'abc', 'abbbbbbc'             'acc', 'bbc'
    r'ab*'        'abc', 'abbc', 'acc'          'bbc'
    r'[abc]'      'a', 'b', 'c'                 'hej', 'A'
    r'[a-h]'      'a', 'f', 'ehiusc'            'urtyx'
    r'(ab){2}'    'abab', 'ababababc'           'ab', 'abcdab'
    r'^ab'
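The complete rows of the table can be checked mechanically. The sketch below (Python 3 syntax) assumes that a "hit" means re.search finds the pattern somewhere in the string; the helper name check is invented for this example.

```python
import re

# (pattern, strings with at least one hit, strings with no hit)
examples = [
    (r'ain',     ['main', 'rain', 'Maine'], ['MAIN', 'ai']),
    (r'ab+',     ['abc', 'abbbbbbc'],       ['acc', 'bbc']),
    (r'ab*',     ['abc', 'abbc', 'acc'],    ['bbc']),
    (r'[abc]',   ['a', 'b', 'c'],           ['hej', 'A']),
    (r'[a-h]',   ['a', 'f', 'ehiusc'],      ['urtyx']),
    (r'(ab){2}', ['abab', 'ababababc'],     ['ab', 'abcdab']),
]


def check(pattern, hits, misses):
    """True if the pattern is found in every hit and in none of the misses."""
    return (all(re.search(pattern, s) for s in hits)
            and not any(re.search(pattern, s) for s in misses))


results = [check(*row) for row in examples]
print(all(results))  # True
```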
Some things around Python
● Debugging
● Profiling
● Python implementations
● Other libraries
● Automated testing
Debugger
● A program that runs your program in a controlled way. Enables you to step through your code, line-by-line, and see how local variables change value.
● Helps you track down bugs and can also be used to understand how code works.
● Classic use case: some unknown error is occurring and the cause isn't evident from the program output. You run the program in debug mode and try to reproduce the error, i.e. to recreate the error scenario. When the error occurs, the debugger halts and you can see where it happens and inspect the local variables.
● Seldom used in academia, very common in the industry.
Debugger in Python
● The standard debugger for Python is pdb. You definitely want to use some GUI frontend to this debugger, such as the one provided in Wing.
● Error messages are quite good in Python, so debuggers are not as useful as when programming C++ for example.
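As a rough sketch of how pdb can be brought into a script without a GUI (Python 3 syntax): the function average is hypothetical and invented for illustration, while pdb.set_trace() is the standard-library call that pauses execution and opens the debugger prompt. The call is left commented out here so the script runs straight through.

```python
import pdb  # standard-library debugger


def average(values):
    """Return the arithmetic mean of a non-empty sequence."""
    total = 0
    for value in values:
        # Uncomment the next line to pause here on every iteration and
        # inspect 'total' and 'value' at the (Pdb) prompt. Useful commands:
        # n (next line), p total (print a variable), c (continue).
        # pdb.set_trace()
        total += value
    return total / len(values)


result = average([2, 4, 6])
print(result)  # 4.0
```

Alternatively, a whole script can be started under the debugger from the command line with python -m pdb followed by the script name.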
Profiling
● A profiler gathers information during execution about some aspect of performance (usually execution time).
Two types:
● Exact (deterministic) profilers - follow the execution and record, for example, function call counts.
● Statistical profilers - use random sampling to determine where the program spends its time. They only provide an approximation, but the program runs with less intrusion.
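As an illustration of the deterministic kind, the sketch below uses cProfile and pstats from the Python standard library (Python 3 syntax). The lecture does not name a specific profiler, so this choice is an assumption; the functions slow_sum and run are invented workloads.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately simple workload: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run():
    return [slow_sum(10000) for _ in range(50)]


# Record every function call made while run() executes.
profiler = cProfile.Profile()
profiler.enable()
run()
profiler.disable()

# Format the collected statistics, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats()
report = stream.getvalue()
print(report)
```

The report lists, per function, the number of calls and the time spent, which is the "function call count" style of information an exact profiler collects.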
Python implementations
● CPython, the official version. Stands for Classic Python. Implemented in C.
● IronPython. Runs on Microsoft's .NET framework. Makes the .NET libraries available in Python. Implemented in C#.
● Jython. For integration with Java-applications. Can import any Java-object.
● Unladen Swallow, CPython modification at Google to increase speed
Other libraries
● Django, Pylons, web2py, TurboGears - web frameworks
● SQLAlchemy - database toolkit
● PIL - Imaging library
Automated testing
For small programs it's often simple to determine whether the program does what it's supposed to do.
When the program feature set grows, it's not practical to do manual testing any longer.
When you make a small change, you want to verify that the existing functionality is still intact.
This requires some form of systematic code testing.
Automated testing
● Especially important when using a rapid-prototyping style of development, which Python is partly aimed at.
● So how can we test that the program produces the correct output/behaviour without implementing exactly the same functions again?
Automated testing
● ... you could compare with previous output of the program, if your program is deterministic and no internal or external random variables affect the output. Random variables could be clearly defined as such in an algorithm, or could be introduced by letting your output depend on things such as execution time.
● Or you can check that program output is within some reasonable limits that you set yourself.
● If developing an algorithm, it's good to build a test suite of known problematic cases, and preferably have some automatic quality measurements for each case.
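The three approaches above can be sketched with the standard-library unittest module (Python 3 syntax). The lecture does not prescribe a testing tool, so unittest is an assumption; the function under test is the limited fib generator from earlier, and the "known output" is the first ten Fibonacci numbers.

```python
import unittest


def fib(limit=10):
    """Yield the first 'limit' Fibonacci numbers."""
    x, y, count = 0, 1, 0
    while count < limit:
        yield x
        x, y = y, x + y
        count += 1


class FibTests(unittest.TestCase):
    def test_against_known_output(self):
        # Regression check: compare with output recorded earlier.
        self.assertEqual(list(fib(10)), [0, 1, 1, 2, 3, 5, 8, 13, 21, 34])

    def test_within_reasonable_limits(self):
        # Sanity check: values never decrease and never go negative.
        values = list(fib(30))
        self.assertTrue(all(a <= b for a, b in zip(values, values[1:])))
        self.assertGreaterEqual(values[0], 0)

    def test_problematic_cases(self):
        # Edge cases that are easy to get wrong.
        self.assertEqual(list(fib(0)), [])
        self.assertEqual(list(fib(1)), [0])


# Run the suite programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FibTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

After a small change to fib, rerunning the suite shows right away whether the existing behaviour is still intact.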
Knowing a programming language
● Not only syntax and function matter.
● The underlying implementation of different constructs is essential for all kinds of performance.
● Having an overview of libraries and being able to quickly start using them if necessary.
● Knowing about available development tools and roughly how to use them.
Knowledge - skill
● Being a good programmer is a combination of having both the right knowledge and skill.
● Knowledge - knowing concepts, principles and information (the theory) regarding a particular subject
● Skill - ability to produce solutions in a problem domain ("a well-trained piano player", "a skilled carpenter"). Often a combination of using knowledge and experience.
Training
• Training is something natural in many areas: music, sports and mathematics, for example.
• In programming, training often gets less focus.
• Programming is a huge area, but many things are repeated in different contexts.
• If you have good command of a language, you lower the threshold when entering new areas of the language, tackling a new problem or starting to use a new module.
• It's easier to learn new languages when you already know some.
What is good training?
• Time without interruptions.
• The possibility to try many times until success is reached.
• Being able to explore and try out things, without negative consequences if things don't work.
• A task that challenges you, but is within reach.
• Quick feedback.