Post on 06-May-2015
04/11/2023 1
Working with Functional Data Structures
Practical F# Application
acster.com jackfoxy.com @foxyjackfox
Jack Fox
Bibliography
http://jackfoxy.com/fsharp-user-group-working-with-functional-data-structures-bibliography
tl;dr
Singly-linked list: the fundamental purely functional data structure
Time complexity overview
Garbage collection and real-world performance
Reasons to use Purely Functional Data Structures
When not to use Purely Functional Data Structures
Choices and shapes
Build your own Purely Functional Data Structure
What is purely functional?
Immutable
Persistent
Thread safe
Recursive
Incremental
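A minimal sketch of what "immutable" and "persistent" mean for the built-in F# list. The names `original`, `extended`, and `sharesTail` are illustrative only.

```fsharp
// Immutable and persistent: "changing" a list builds a new value
// and leaves the original untouched.
let original = [2; 3; 4]
let extended = 1 :: original      // one new head cell; the old cells are reused

// Structural sharing: the tail of `extended` is the same object as
// `original`, so cons copies nothing.
let sharesTail = obj.ReferenceEquals(List.tail extended, original)

printfn "%A %A %b" original extended sharesTail
```

Because no cell is ever mutated, both values can be read from several threads without locking.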
Theoretical Performance (most common)
O(1)
O(log* n) practically O(1)
O(log log n)
O(log n)
O(n) linear time
O(n²) gets real bad from here on out
O(i) variables other than n require explanation
Actual Performance
Processor architecture (instruction look-ahead, cache, etc.)
.NET Garbage Collection
O(n) behavior starts for “large enough size”
Recursive Benchmarks over different Structure Sizes: 10², 10³, 10⁴, 10⁵, 10⁶
often looks like << O(n)
usually settles down to O(n), sometimes looks like > O(n)
List as a recursive structure
[Diagram: the list 1 :: 2 :: 3 :: 4 :: [ ], labeled Empty List ([ ]), Adding Element (::), Head, and Tail]
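The recursive shape of the list (an element consed onto another list, ending in the empty list) can be sketched with the built-in F# list. The names `xs` and `sum` are illustrative.

```fsharp
let empty : int list = []
let xs = 1 :: 2 :: 3 :: 4 :: empty   // same value as [1; 2; 3; 4]

let head = List.head xs              // first element
let tail = List.tail xs              // everything after the first element

// Recursive processing mirrors the recursive shape of the data:
// one case for the empty list, one for head :: tail.
let rec sum l =
    match l with
    | [] -> 0
    | x :: rest -> x + sum rest

printfn "%d %A %d" head tail (sum xs)
```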
So what the heck would you do with a list?
Demo 1
“Getting” the recursive thing
SICP, a.k.a. Abelson & Sussman, a.k.a. The Wizard Book
Why no update or remove in List?
Graphics: unattributed, all over the internet
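Okasaki’s Pseudo-Canonical List Update
    // Replace the element at index i, rebuilding the cells in front of it.
    let rec loop i updateElem (l: list<'a>) =
        match (i, l) with
        | _, [] -> raise (System.Exception("subscript"))    // index past the end
        | 0, _ :: xs -> updateElem :: xs                     // found it: swap the head
        | i', x :: xs -> x :: (loop (i' - 1) updateElem xs)  // keep x, recurse on the tail
[Diagram: walking the list 1 :: 2 :: 3 :: 4 :: [ ] one cons cell at a time until the index is reached: “found it!”]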
Do you see a problem?
We could just punt
    // Copy to an array, mutate the copy, and build a new list from it.
    let punt i updateElem (l: list<'a>) =
        let a = List.toArray l
        a.[i] <- updateElem
        List.ofArray a
…or try a Hybrid approach
    let hybrid i updateElem (l: list<'a>) =
        if i = 0 then updateElem :: List.tail l
        else
            // Copy the first i elements into an array, filling it back to front.
            let rec loop i' (front: 'a array) back =
                match i' with
                | x when x < 0 -> front, (List.tail back)   // also drops the old element
                | x ->
                    Array.set front x (List.head back)
                    loop (x - 1) front (List.tail back)
            let front, back = loop (i - 1) (Array.create i updateElem) l
            // Cons the saved elements back on, in front of the updated element.
            let rec loop2 i' frontLen (front': 'a array) back' =
                match i' with
                | x when x > frontLen -> back'
                | x -> loop2 (x + 1) frontLen front' (front'.[x] :: back')
            loop2 0 ((Seq.length front) - 1) front (updateElem :: back)
Time complexity of update options
Pseudo-Canonical: O(i)
Punt: O(n)
Hybrid: O(i)
Place your bets!
Graphics: unattributed, all over the internet
Actual Performance
Size 10² (ratio to fastest; times in ms)

10k Random Updates          One-time Worst Case
PC       -      2.9         Punt     -      0.2
Hybrid   1.4X   4.0         PC       1.1X   0.2
Punt     1.5X   4.5         Hybrid   4.1X   0.8

PC looks perfect!
Graphics: http://www.freebievectors.com/es/material-de-antemano/51738/material-vector-dinamico-estilo-comic-femenino/
Actual Performance
(ratio to fastest at each size; times in ms unless noted)

Size   10k Random Updates            One-time Worst Case
10²    PC       -      2.9           Punt     -      0.2
       Hybrid   1.4X   4.0           PC       1.1X   0.2
       Punt     1.5X   4.5           Hybrid   4.1X   0.8
10³    Hybrid   -      29.6          Punt     -      0.2
       Punt     1.6X   47.6          PC       1.1X   0.2
       PC       1.7X   50.3          Hybrid   4.1X   0.8
10⁴    Hybrid   -      320.3         Punt     -      0.3
       Punt     1.7X   534.9         PC       1.3X   0.4
       PC       2.9X   920.2         Hybrid   3.2X   0.9
10⁵    Hybrid   -      4.67 sec      Punt     -      1.0
       Punt     2.0X   9.34 sec      Hybrid   1.5X   1.5
       PC       stack overflow!
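The stack overflow reported for the pseudo-canonical update at 10⁵ elements comes from its recursion not being in tail position: every visited element adds a stack frame. One possible way around it, shown here as a sketch and not as the talk's own solution, is to carry the visited elements in an accumulator. The name `updateTailRec` is hypothetical.

```fsharp
let updateTailRec i updateElem (l: 'a list) =
    // Walk to index i, pushing visited elements onto `front` (in reverse order).
    let rec loop i' front back =
        match i', back with
        | _, [] -> raise (System.Exception("subscript"))
        | 0, _ :: xs ->
            // Put the visited elements back in front of the updated element.
            List.fold (fun acc x -> x :: acc) (updateElem :: xs) front
        | _, x :: xs -> loop (i' - 1) (x :: front) xs
    loop i [] l

printfn "%A" (updateTailRec 2 99 [0; 1; 2; 3])
```

This stays O(i) in time but allocates the front of the list twice, so it trades stack depth for extra garbage.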
Benchmarking performance
Hard to reason about actual performance
DS_Benchmark
◦ Open source on GitHub
◦ Discards outliers
◦ Fully isolates code to benchmark
◦ Fully documented
◦ “how to extend” documented
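To make the "discards outliers" idea concrete, here is a rough timing sketch. It is not DS_Benchmark's API or method; `timeTrimmed` and its parameters are hypothetical, and it assumes `runs` is larger than twice `trim`.

```fsharp
open System.Diagnostics

// Hypothetical helper: time `f` `runs` times, drop the `trim` fastest and
// `trim` slowest samples, and average the rest (in milliseconds).
let timeTrimmed runs trim (f: unit -> 'a) =
    let samples =
        Array.init runs (fun _ ->
            let sw = Stopwatch.StartNew()
            f () |> ignore
            sw.Stop()
            sw.Elapsed.TotalMilliseconds)
        |> Array.sort
    samples.[trim .. runs - trim - 1] |> Array.average

// Usage: time building a 100,000-element list one cons at a time.
let ms = timeTrimmed 20 2 (fun () -> List.fold (fun acc x -> x :: acc) [] [1 .. 100000])
printfn "%.3f ms" ms
```

A sketch like this does not isolate the measured code from garbage collection or JIT warm-up, which is part of why actual performance is hard to reason about.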
Shapes: let your imagination run wild!
Graphics: Larry D. Moore Attribution-Share Alike 3.0 Unported license. http://commons.wikimedia.org/wiki/File:Playdoh.jpg
Binary Random Access List
Same Cons, Head, Tail signature
Optimized for Lookup and Update O(log n)
…but not for Remove
Why Not?
Does it with alternate internal structures
Queue (FIFO)
[Diagram: a queue of the elements 1 to 5, labeled Empty Queue ([ ]), Adding Element, Head, and Tail]
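One common way to build a purely functional FIFO queue is from two lists, as in Okasaki's batched queue. The sketch below illustrates that technique and is not the FSharpx implementation; the names `BatchedQueue`, `emptyQueue`, `conj`, `tryHead`, and `tryTail` are chosen for illustration.

```fsharp
// `Front` is in dequeue order; `Back` holds newly added elements, newest first.
type BatchedQueue<'a> = { Front: 'a list; Back: 'a list }

let emptyQueue<'a> : BatchedQueue<'a> = { Front = []; Back = [] }

// Restore the invariant: Front is empty only when the whole queue is empty.
let check (q: BatchedQueue<'a>) =
    match q.Front with
    | [] -> { Front = List.rev q.Back; Back = [] }
    | _ -> q

// Add at the end of the queue.
let conj x (q: BatchedQueue<'a>) = check { q with Back = x :: q.Back }

let tryHead (q: BatchedQueue<'a>) = List.tryHead q.Front

let tryTail (q: BatchedQueue<'a>) =
    match q.Front with
    | [] -> None
    | _ :: rest -> Some (check { Front = rest; Back = q.Back })

let q = emptyQueue |> conj 1 |> conj 2 |> conj 3
printfn "%A %A" (tryHead q) (tryTail q |> Option.bind tryHead)
```

The occasional `List.rev` is O(n), but each element is reversed at most once, so `conj`, head, and tail are O(1) amortized.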
Deque (double-ended queue)
[Diagram: a deque of the elements 1 to 5, labeled Empty Deque ([ ]), Adding Element, Head, Tail, Init, and Last]
Deque and remove
Approximately O(i/2)
(where i is index to element)
Heap
[Diagram: a heap with element 1 at the Head, labeled Empty Heap ([ ]), Insert Element, Merge Heaps, Head, and Tail]
* names in signature altered from Okasaki’s implementation
Graphics: http://www.turbosquid.com/3d-models/heap-gravel-max/668104
Heap and remove
O(1) (if implemented)
…but implementation raises issues
Deleting before inserting
Order of events could nullify deletion before insertion
Equal values?
Canonical Functional Linear Structures
Order: by construction, ascending, descending, random
Grow
Shrink
Peek
Fsharpx.Collections
RandomAccessList = List + iLookup + iUpdate
DList = List + conj + append
Deque = List + conj + last + initial + rev = initial U tail
LazyList = List^Lazy
Heap = List + sorted + append
Queue = List - cons + conj
Vector = List - cons - head - tail + conj + last + initial + iLookup + iUpdate
= RandomAccessList^-1
Summary of time complexity performance
Vector & Binary Random Access List
O(1) cons-conj / head-last / tail-init
O(log₃₂ n) lookup / update
DList
O(1) cons / conj / head / append
O(log n) tail
Deque
O(1) cons / head / tail / conj / last / init
O(1) reverse
O(i/2) lookup / update (generally)
Heap
O(1) insert / head
O(log n) merge / tail
Queue
O(1) conj / head / tail (generally)
O(log n) merge / tail
Measured performance (grow by one)
Structure size: 10² 10³ 10⁴ 10⁵ 10⁶
ms.f#.array 0.8 1.8 100.9 11,771.4 n/a
ms.f#.array — list 0.3 1 69.5 n/a n/a
ms.f#.list 0.4 0.4 0.4 1.0 13.8
ms.f#.list — list 0.7 0.7 0.9 2.3 45.3
Deque — conj 0.3 0.3 0.5 4.7 *
Deque — cons 0.3 0.3 0.5 4.7 *
Dlist — conj 0.7 0.7 1.0 7.7 153.0
Dlist — cons 0.7 0.7 1.0 6.4 118.4
Heap 3.2 3.3 5.0 22.5 254.7
LazyList 0.9 0.9 1.0 2.6 108.3
Queue 1.0 1.1 1.4 7.6 106.6
RandomAccessList 0.8 0.9 3.3 19.6 189.8
Vector 0.8 0.9 3.3 19.7 189.1
Trees
Wide variety of applications
Binary (balanced or unbalanced)
Multiway (a.k.a. RoseTree)
Red Black Tree Balancing
[Diagram: the unbalanced red-black configurations over subtrees a, b, c, d and the balanced tree they are rewritten to]
Source: https://wiki.rice.edu/confluence/download/attachments/2761212/Okasaki-Red-Black.pdf
    type color = Red | Black
    type 'a t = Node of color * 'a * 'a t * 'a t | Leaf
    // Rewrite a black node with a red child and red grandchild
    // into a red node with two black children.
    let balance = function
        | Black, z, Node (Red, y, Node (Red, x, a, b), c), d
        | Black, z, Node (Red, x, a, Node (Red, y, b, c)), d
        | Black, x, a, Node (Red, z, Node (Red, y, b, c), d)
        | Black, x, a, Node (Red, y, b, Node (Red, z, c, d)) ->
            Node (Red, y, Node (Black, x, a, b), Node (Black, z, c, d))
        | x -> Node x
Source: http://fsharpnews.blogspot.com/2010/07/f-vs-mathematica-red-black-trees.html
Talk about reducing complexity!
Extra Credit
Write the Remove operation for a Red Black Tree
Here’s how: http://en.wikipedia.org/wiki/Red-black_tree#Removal
Fsharpx.Collections.Experimental
IntMap (Map-like structure)
BKTree
RoseTree (lazy multiway)
EagerRoseTree
IndexedRoseTree
MS.F#.Collections
Map
Set
To Do:
Benchmark:
RoseTree (lazy)
EagerRoseTree (not yet implemented)
IndexedRoseTree
Multiway as unbalanced binary tree (polymorphic recursion)
Another To Do:
The (not-so-) Naïve Binary Tree:
As seen all over the internet…
…yet often missing: Pre-order, Post-order, In-order fold traversals (better be tail-recursive).
And maybe a zipper navigator while you are at it!
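As one illustration of a tail-recursive fold traversal, the sketch below folds a simple binary tree in order by keeping pending work on an explicit list instead of the call stack. The `Tree` type and `foldInOrder` are hypothetical names, not taken from any of the libraries mentioned.

```fsharp
type Tree<'a> =
    | Leaf
    | Node of Tree<'a> * 'a * Tree<'a>

// In-order fold. Each work item is either a subtree still to visit
// (Choice1Of2) or a value ready to be folded (Choice2Of2).
let foldInOrder (f: 'state -> 'a -> 'state) (init: 'state) (tree: Tree<'a>) =
    let rec loop acc work =
        match work with
        | [] -> acc
        | Choice1Of2 Leaf :: rest -> loop acc rest
        | Choice1Of2 (Node (l, v, r)) :: rest ->
            // left subtree, then the value, then the right subtree
            loop acc (Choice1Of2 l :: Choice2Of2 v :: Choice1Of2 r :: rest)
        | Choice2Of2 v :: rest -> loop (f acc v) rest
    loop init [Choice1Of2 tree]

let sample = Node (Node (Leaf, 1, Leaf), 2, Node (Leaf, 3, Leaf))
let inOrder = foldInOrder (fun acc x -> x :: acc) [] sample |> List.rev
printfn "%A" inOrder
```

Moving `Choice2Of2 v` to the front or the back of the three pushed items would give pre-order or post-order instead.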
Call for Action!
Fsharpx.Collections.Experimental
GitHub fork FSharpx
Implement some interesting structure and tests
Sync back to your fork
Pull request
Out of ideas or just want to practice?
Unimplemented Okasaki structures:
http://github.com/jackfoxy/DS_Benchmark/tree/master/PurelyFunctionalDataStructures
When not to use purely functional
Consider Array if performance is critical
Functional dictionary-like structures (Map) may not perform well enough, especially after scale 10⁴
Consider .NET dictionary-like object
Publishing your functional DS
FSharpx.Collections.readme.md
Include a Try value (returning option) for each value that can throw an Exception
Include other common values if < O(n)
Reason about edge cases (more unit tests better than not enough)
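The "Try value returning option" convention can be sketched as a pair of functions: one that throws on the edge case and one that returns `None`. The `Stack` type here is a hypothetical minimal structure used only to show the pattern.

```fsharp
// Hypothetical minimal structure, for illustration only.
type Stack<'a> = Stack of 'a list

// Throws when the stack is empty.
let head (Stack items) =
    match items with
    | x :: _ -> x
    | [] -> raise (System.Exception("Stack is empty"))

// Same lookup, but the empty case is a value instead of an exception.
let tryHead (Stack items) =
    match items with
    | x :: _ -> Some x
    | [] -> None

let emptyStack : Stack<int> = Stack []
printfn "%A %A" (tryHead (Stack [1; 2])) (tryHead emptyStack)
```

The empty structure is exactly the kind of edge case worth its own unit test.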
Build your own structure
Demo 3
Leverage Heap as internal structure to create RandomStack
Closing Thought
The functional data structures further from the “mainstream” (if such a measure were possible) tend to have less inherent value in their generic form.
Therefore, the ultimate functional data structures collection would combine the characteristics of a library, a snippet collection, a benchmarking tool, superb documentation, test cases, and EXAMPLES!
Resources
FSPowerPack.Core.Community (NuGet)
FSharpx.Core (GitHub & NuGet)
FSharpx.Collections.Experimental (GitHub & NuGet)
DS_Benchmark (GitHub) raw code for structures not yet merged to FSharpx