Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays,...

64
Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures

Transcript of Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays,...

Page 1: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python:From First Programs Through Data

Structures

Chapter 13

Collections, Arrays, and Linked Structures

Page 2: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 2

Objectives

After completing this chapter, you will be able to:

• Recognize different categories of collections and the operations on them

• Understand the difference between an abstract data type and the concrete data structures used to implement it

• Perform basic operations on arrays, such as insertions and removals of items

Page 3: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 3

Objectives (continued)

• Resize an array when it becomes too small or too large

• Describe the space/time trade-offs for users of arrays

• Perform basic operations, such as traversals, insertions, and removals, on linked structures

• Explain the space/time trade-offs of arrays and linked structures in terms of the memory models that underlie these data structures

Page 4: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 4

Overview of Collections

• Collection: Group of items that we want to treat as conceptual unit

• Examples:– Lists, strings, stacks, queues, binary search trees,

heaps, graphs, dictionaries, sets, and bags

• Can be homogeneous or heterogeneous– Lists are heterogeneous in Python

• Four main categories:– Linear, hierarchical, graph, and unordered

Page 5: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 5

Linear Collections

• Ordered by position

• Everyday examples: – Grocery lists– Stacks of dinner plates– A line of customers waiting at a bank

Page 6: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 6

Hierarchical Collections

• Structure reminiscent of an upside-down tree

– D3’s parent is D1; its children are D4, D5, and D6

• Examples: a file directory system, a company’s organizational tree, a book’s table of contents

Page 7: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 7

• Graph: Collection in which each data item can have many predecessors and many successors

– D3’s neighbors are its predecessors and successors

• Examples: Maps of airline routes between cities; electrical wiring diagrams for buildings

Graph Collections

Page 8: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 8

Unordered Collections

• Items are not in any particular order– One cannot meaningfully speak of an item’s

predecessor or successor

• Example: Bag of marbles

Page 9: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 9

Operations on Collections

Page 10: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 10

Operations on Collections (continued)

Page 11: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 11

Abstraction and Abstract Data Types

• To a user, a collection is an abstraction

• In CS, collections are abstract data types (ADTs)– ADT users are concerned with learning its interface– Developers are concerned with implementing their

behavior in the most efficient manner possible

• In Python, methods are the smallest unit of abstraction, classes are the next in size, and modules are the largest

• We will implement ADTs as classes or sets of related classes in modules

Page 12: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 12

Data Structures for Implementing Collections: Arrays

• “Data structure” and “concrete data type” refer to the internal representation of an ADT’s data

• The two data structures most often used to implement collections in most programming languages are arrays and linked structures– Different approaches to storing and accessing data

in the computer’s memory– Different space/time trade-offs in the algorithms that

manipulate the collections

Page 13: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 13

• Array: Underlying data structure of a Python list – More restrictive than Python lists

• We’ll define an Array class

The Array Data Structure

Page 14: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 14

Random Access and Contiguous Memory

• Array indexing is a random access operation

• Address of an item: base address + offset Index operation has two steps:

Page 15: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 15

Static Memory and Dynamic Memory

• Arrays in older languages were static

• Modern languages support dynamic arrays

• To readjust length of an array at run time:– Create an array with a reasonable default size at

start-up– When it cannot hold more data, create a new, larger

array and transfer the data items from the old array– When the array seems to be wasting memory,

decrease its length in a similar manner

• These adjustments are automatic with Python lists

Page 16: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 16

Physical Size and Logical Size

• The physical size of an array is its total number of array cells

• The logical size of an array is the number of items currently in it

• To avoid reading garbage, must track both sizes

Page 17: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 17

Physical Size and Logical Size (continued)

• In general, the logical and physical size tell us important things about the state of the array:– If the logical size is 0, the array is empty– Otherwise, at any given time, the index of the last

item in the array is the logical size minus 1.– If the logical size equals the physical size, there is no

more room for data in the array

Page 18: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 18

Operations on Arrays

• We now discuss the implementation of several operations on arrays

• In our examples, we assume the following data settings:

• These operations would be used to define methods for collections that contain arrays

Page 19: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 19

Increasing the Size of an Array

• The resizing process consists of three steps:– Create a new, larger array– Copy the data from the old array to the new array– Reset the old array variable to the new array object

• To achieve more reasonable time performance, double array size each time you increase its size:

Page 20: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 20

Decreasing the Size of an Array

• This operation occurs in Python’s list when a pop results in memory wasted beyond a threshold

• Steps:– Create a new, smaller array– Copy the data from the old array to the new array– Reset the old array variable to the new array object

Page 21: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 21

Inserting an Item into an Array That Grows

• Programmer must do four things:– Check for available space and increase the physical

size of the array, if necessary– Shift items from logical end of array to target index

position down by one• To open hole for new item at target index

– Assign new item to target index position– Increment logical size by one

Page 22: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 22

Inserting an Item into an Array That Grows (continued)

Page 23: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 23

Removing an Item from an Array

• Steps:– Shift items from target index position to logical end of

array up by one• To close hole left by removed item at target index

– Decrement logical size by one– Check for wasted space and decrease physical size

of the array, if necessary

• Time performance for shifting items is linear on average; time performance for removal is linear

Page 24: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 24

Removing an Item from an Array (continued)

Page 25: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 25

Complexity Trade-Off: Time, Space, and Arrays

• Average-case use of memory for is O(1)

• Memory cost of using an array is its load factor

Page 26: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 26

Two-Dimensional Arrays (Grids)

• Sometimes, two-dimensional arrays or grids are more useful than one-dimensional arrays

• Suppose we call this grid table; to access an item:

Page 27: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 27

Processing a Grid

• In addition to the double subscript, a grid must recognize two methods that return the number of rows and the number of columns– Examples: getHeight and getWidth

• The techniques for manipulating one-dimensional arrays are easily extended to grids:

Page 28: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 28

Creating and Initializing a Grid

• Let’s assume that there exists a Grid class:

• After a grid has been created, we can reset its cells to any values:

heightwidth

initial fill value

Page 29: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 29

Defining a Grid Class

Page 30: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 30

Defining a Grid Class (continued)

• The method __getitem__ is all that is needed to support the client’s use of the double subscript:

Page 31: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 31

Ragged Grids and Multidimensional Arrays

• In a ragged grid, there are a fixed number of rows, but the number of columns in each row can vary

• An array of lists or arrays provides a suitable structure for implementing a ragged grid

• Dimensions can be added to grid definition if needed; the only limit is computer’s memory– A three-dimensional array can be visualized as a box

filled with smaller boxes stacked in rows and columns

• Depth, height, and width; three indexes; three loops

Page 32: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 32

Linked Structures

• After arrays, linked structures are probably the most commonly used data structures in programs

• Like an array, a linked structure is a concrete data type that is used to implement many types of collections, including lists

• We discuss in detail several characteristics that programmers must keep in mind when using linked structures to implement any type of collection

Page 33: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 33

Singly Linked Structures and Doubly Linked Structures

• No random access; must traverse list

• No shifting of items needed for insertion/removal

• Resize at insertion/removal with no memory cost

An emptylink

Page 34: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 34

Noncontiguous Memory and Nodes

• A linked structure decouples logical sequence of items in the structure from any ordering in memory– Noncontiguous memory representation scheme

• The basic unit of representation in a linked structure is a node:

Page 35: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 35

Noncontiguous Memory and Nodes (continued)

• Depending on the language, you can set up nodes to use noncontiguous memory in several ways:– Using two parallel arrays

Page 36: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 36

Noncontiguous Memory and Nodes (continued)

• Ways to set up nodes to use noncontiguous memory (continued):– Using pointers (a null or nil represents the

empty link as a pointer value)• Memory allocated from the object heap

– Using references to objects (e.g., Python)• In Python, None can mean an empty link

• Automatic garbage collection frees programmer from managing the object heap

• In the discussion that follows, we use the terms link, pointer, and reference interchangeably

Page 37: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 37

Defining a Singly Linked Node Class

• Node classes are fairly simple

• Flexibility and ease of use are critical– Node instance variables are usually referenced

without method calls, and constructors allow the user to set a node’s link(s) when the node is created

• A singly linked node contains just a data item and a reference to the next node:

Page 38: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 38

Using the Singly Linked Node Class

• Node variables are initialized to None or a new Node object

Page 39: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 39

Using the Singly Linked Node Class (continued)

• node1.next = node3 raises AttributeError– Solution: node1 = Node("C", node3), or

Node1 = Node("C", None)

node1.next = node3

• To guard against exceptions:if nodeVariable != None:

<access a field in nodeVariable>

• Like arrays, linked structures are processed with loops

Page 40: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 40

Using the Singly Linked Node Class (continued)

– Note that when the data are displayed, they appear in the reverse order of their insertion

Page 41: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 41

Operations on Singly Linked Structures

• Almost all of the operations on arrays are index based– Indexes are an integral part of the array structure

• Emulate index-based operations on a linked structure by manipulating links within the structure

• We explore how these manipulations are performed in common operations, such as:– Traversals– Insertions– Removals

Page 42: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 42

Traversal

• Traversal: Visit each node without deleting it– Uses a temporary pointer variable

• Example:probe = headwhile probe != None: <use or modify probe.data> probe = probe.next– None serves as a sentinel that stops the process

• Traversals are linear in time and require no extra memory

Page 43: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 43

Traversal (continued)

Page 44: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 44

Searching

• Resembles a traversal, but two possible sentinels:– Empty link– Data item that equals the target item

• Example:probe = headwhile probe != None and targetItem != probe.data: probe = probe.nextif probe != None: <targetItem has been found>else: <targetItem is not in the linked structure>

• On average, it is linear for singly linked structures

Page 45: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 45

Searching (continued)

• Unfortunately, accessing the ith item of a linked structure is also a sequential search operation– We start at the first node and count the number of

links until the ith node is reached

• Linked structures do not support random access– Can’t use a binary search on a singly linked structure– Solution: Use other types of linked structures

Page 46: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 46

Replacement

• Replacement operations employ traversal pattern

• Replacing the ith item assumes 0 <= i < n

• Both replacement operations are linear on average

Page 47: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 47

Inserting at the Beginning

• Uses constant time and memory

Page 48: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 48

Inserting at the End

• Inserting an item at the end of an array (append in a Python list) requires constant time and memory– Unless the array must be resized

• Inserting at the end of a singly linked structure must consider two cases:

– Linear in time and constant in memory

Page 49: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 49

Inserting at the End (continued)

Page 50: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 50

Removing at the Beginning

Page 51: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 51

Removing at the End

• Removing an item at the end of an array (pop in a Python list) requires constant time and memory– Unless the array must be resized

• Removing at the end of a singly linked structure must consider two cases:

Page 52: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 52

Removing at the End (continued)

Page 53: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 53

Inserting at Any Position

• Insertion at beginning uses code presented earlier

• In other position i, first find the node at position i - 1 (if i < n) or node at position n - 1 (if i >= n)

• Linear time performance; constant use of memory

Page 54: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 54

Inserting at Any Position (continued)

Page 55: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 55

Removing at Any Position

• The removal of the ith item from a linked structure has three cases:

Page 56: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 56

Complexity Trade-Off: Time, Space, and Singly Linked Structures

• The main advantage of singly linked structure over array is not time performance but memory performance

Page 57: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 57

Variations on a Link

• A Circular Linked Structure with a Dummy Header Node

• Doubly Linked Structures

Page 58: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 58

A Circular Linked Structure with a Dummy Header Node

• A circular linked structure contains a link from the last node back to the first node in the structure– Dummy header node serves as a marker for the

beginning and the end of the linked structure

Page 59: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 59

A Circular Linked Structure with a Dummy Header Node (continued)

• To initialize the empty linked structure:

• To insert at the ith position:

Page 60: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 60

Doubly Linked Structures

Page 61: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 61

Doubly Linked Structures (continued)

To insert a new item at the end of the linked structure

Page 62: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 62

Doubly Linked Structures (continued)

Page 63: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 63

Summary

• Collections are objects that hold 0+ other objects– Main categories: Linear, hierarchical, graph, and

unordered– Collections are iterable– Collections are thus abstract data types

• A data structure is an object used to represent the data contained in a collection

• The array is a data structure that supports random access, in constant time, to an item by position– Can be two-dimensional (grid)

Page 64: Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures.

Fundamentals of Python: From First Programs Through Data Structures 64

Summary (continued)

• A linked structure is a data structure that consists of 0+ nodes– A singly linked structure’s nodes contain a data item

and a link to the next node– Insertions or removals in linked structures require no

shifting of data elements– Using a header node can simplify some of the

operations, such as adding or removing items