Tries, conversions and field goals

Tries, Conversions and Field Goals A RUGBY (LEAGUE) FAN’S GUIDE TO TRIE DATA STRUCTURES

description

Presentation on the Trie data structure, showing how it works and what it can be used for, with an implementation of Tries in PHP and occasional references to Rugby League. Example code to go with the slides can be found at https://github.com/MarkBaker/Tries

Transcript of Tries, conversions and field goals

Page 1: Tries, conversions and field goals

Tries, Conversions and Field Goals
A RUGBY (LEAGUE) FAN’S GUIDE TO TRIE DATA STRUCTURES

Page 2: Tries, conversions and field goals

Who am I?

Mark Baker

Design and Development Manager
InnovEd (Innovative Solutions for Education) Ltd

Coordinator and Developer of:
Open Source PHPOffice library
PHPExcel, PHPWord, PHPPowerPoint, PHPProject, PHPVisio

Minor contributor to PHP core

@Mark_Baker

https://github.com/MarkBaker

http://uk.linkedin.com/pub/mark-baker/b/572/171

Page 3: Tries, conversions and field goals

Tries – What is a Trie?

A trie is a key/value data structure in which the key of each node is encoded by the path from the root to that node, rather than stored in the node itself.

Each path between the root and a leaf represents a key.

Each transition between two nodes is labelled with a single character from a key.

Typically (though not always) the keys will be string values.

Page 4: Tries, conversions and field goals

Tries – What is a Trie?

A special marker on each node indicates whether or not it represents the end of a key.

An “end” node may still have child nodes.
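
The two properties above (keys live in the path, and an end node can still have children) can be sketched with plain nested arrays. This is only an illustration: the sketchInsert and sketchIsKey helpers and the array layout are hypothetical and are not taken from the presentation's code.

```php
<?php
// Hypothetical helpers for illustration; not part of the presentation's Trie class.
// Each node is an array: 'end' marks a complete key, 'children' maps next character => node.

// Insert a key one character at a time; only the final node is flagged as an end node.
function sketchInsert(array &$node, string $key): void
{
    for ($i = 0, $length = strlen($key); $i < $length; ++$i) {
        $char = $key[$i];
        if (!isset($node['children'][$char])) {
            $node['children'][$char] = ['end' => false, 'children' => []];
        }
        // Rebind the local reference to the child, so the walk moves one level down.
        $node = &$node['children'][$char];
    }
    $node['end'] = true;
}

// Follow the path for $key; it is a stored key only if the final node is an end node.
function sketchIsKey(array $node, string $key): bool
{
    for ($i = 0, $length = strlen($key); $i < $length; ++$i) {
        $char = $key[$i];
        if (!isset($node['children'][$char])) {
            return false;
        }
        $node = $node['children'][$char];
    }
    return $node['end'];
}

$root = ['end' => false, 'children' => []];
sketchInsert($root, 'Hall');
sketchInsert($root, 'Halliwell');

var_dump(sketchIsKey($root, 'Hall'));      // bool(true): an end node that still has children
var_dump(sketchIsKey($root, 'Halli'));     // bool(false): on the path, but not a key
var_dump(sketchIsKey($root, 'Halliwell')); // bool(true)
```

Note that the characters of each key are never stored in a node; they only appear as array indexes on the way down.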

Page 5: Tries, conversions and field goals

Tries – What is a Trie?

Methods:

◦ Insert(key, value)
◦ Delete(key)
◦ Search(key)

Page 6: Tries, conversions and field goals

Tries – What is a Trie?

Used for:

◦ Dictionary lookups
◦ Predictive text / Autocomplete
◦ Spell checkers
◦ DNA sequencing
◦ Burst Sort

Page 7: Tries, conversions and field goals

Conversions – Tries in PHP

class TrieNode {
    /**
     * Array of child nodes indexed by next character
     *
     * @var   TrieNode[]
     **/
    public $children = array();

    /**
     * Flag indicating if this node is an end node
     *
     * @var   boolean
     **/
    public $valueNode = false;

    /**
     * Data value (empty unless this is an end node)
     *
     * @var   mixed
     **/
    public $value;
}

Page 8: Tries, conversions and field goals

Conversions – Tries in PHP

Page 9: Tries, conversions and field goals

Conversions – Tries in PHP

Page 10: Tries, conversions and field goals

Conversions – Tries in PHP

class Trie {

    /**
     * Adds a new entry to the Trie
     * If the specified node already exists, then its value will be overwritten
     *
     * @param   mixed   $key     Key for this node entry
     * @param   mixed   $value   Data Value for this node entry
     * @return  null
     */
    public function add($key, $value = null) {
        $trieNodeEntry = $this->getTrieNodeByKey($key, true);
        $trieNodeEntry->valueNode = true;
        $trieNodeEntry->value = $value;
    }

}
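
The add() method relies on a getTrieNodeByKey() helper that is not shown on the slide. A minimal sketch of how such a helper might walk the trie follows. Everything here is an assumption for illustration: the SketchTrie class name, the $root property, the false return for a missing path and the get() method are hypothetical, and the real implementation in the example repository may differ.

```php
<?php
// Same shape as the TrieNode class shown on the earlier slide.
class TrieNode
{
    /** @var TrieNode[] Child nodes indexed by next character */
    public $children = array();
    /** @var boolean True if this node marks the end of a key */
    public $valueNode = false;
    /** @var mixed */
    public $value;
}

// Hypothetical stand-in for the Trie class; only lookup and add() are sketched.
class SketchTrie
{
    /** @var TrieNode Root node, representing the empty key */
    private $root;

    public function __construct()
    {
        $this->root = new TrieNode();
    }

    /**
     * Walk from the root, one character of the key per level.
     * Assumption: returns false if the path is missing and $create is false;
     * with $create set to true, missing nodes are created along the way.
     */
    private function getTrieNodeByKey($key, $create = false)
    {
        $trieNode = $this->root;
        $keyLength = strlen($key);
        for ($i = 0; $i < $keyLength; ++$i) {
            $character = $key[$i];
            if (!isset($trieNode->children[$character])) {
                if (!$create) {
                    return false;
                }
                $trieNode->children[$character] = new TrieNode();
            }
            $trieNode = $trieNode->children[$character];
        }
        return $trieNode;
    }

    public function add($key, $value = null)
    {
        $trieNodeEntry = $this->getTrieNodeByKey($key, true);
        $trieNodeEntry->valueNode = true;
        $trieNodeEntry->value = $value;
    }

    /** Exact-match lookup, added only so the sketch can be exercised. */
    public function get($key)
    {
        $trieNode = $this->getTrieNodeByKey($key);
        return ($trieNode && $trieNode->valueNode) ? $trieNode->value : null;
    }
}

$trie = new SketchTrie();
$trie->add('try', 4);
$trie->add('tries', 'plural');
var_dump($trie->get('try'));  // int(4)
var_dump($trie->get('tr'));   // NULL: the path exists, but it is not an end node
var_dump($trie->get('goal')); // NULL: no such path
```

Because add() always asks for the node with $create set to true, adding an existing key reuses the same node and simply overwrites its value.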

Page 11: Tries, conversions and field goals

Conversions – Tries in PHP

Page 12: Tries, conversions and field goals

Conversions – Tries in PHP

class Trie {

    /**
     * Backtrack toward the root of the Trie, deleting as we go,
     * until we reach a node that we shouldn't delete
     *
     * @param   TrieNode   $trieNode   This node entry
     * @param   mixed      $key        The full key for this node entry
     * @return  null
     */
    private function delete_backtrace(TrieNode $trieNode, $key) {
        $previousKey = substr($key, 0, -1);
        $thisChar = substr($key, -1);
        $previousTrieNode = $this->getTrieNodeByKey($previousKey);
        unset($previousTrieNode->children[$thisChar]);

        // Stop once the parent is the root (key length 1): assuming getTrieNodeByKey()
        // returns the root for an empty key, recursing further would never terminate
        // when the last remaining key is deleted
        if ((strlen($key) > 1) && (count($previousTrieNode->children) == 0) && (!$previousTrieNode->valueNode)) {
            $this->delete_backtrace($previousTrieNode, $previousKey);
        }
    }

    /**
     * Delete a node in the Trie
     *
     * @param   mixed   $key   The key for the node that we want to delete
     * @return  boolean        Success or failure, false if the node didn't exist
     */
    public function delete($key) {
        $trieNode = $this->getTrieNodeByKey($key);
        if (!$trieNode) {
            return false;
        }

        if (!empty($trieNode->children)) {
            // Other keys pass through this node, so only clear its value
            $trieNode->valueNode = false;
            $trieNode->value = null;
        } else {
            // Leaf node: remove it and any ancestors that are no longer needed
            $this->delete_backtrace($trieNode, $key);
        }

        return true;
    }

}

Page 13: Tries, conversions and field goals

Conversions – Tries in PHP

function buildTries($fileName) {
    $playerData = json_decode(
        file_get_contents($fileName)
    );

    $trie = new \Trie();
    foreach($playerData as $player) {
        $playerName = $player->surname . ', ' . $player->firstname;
        $trie->add($playerName, $player);
    }
    return $trie;
}

/* Populate the trie  */
$tries = buildTries(__DIR__ . '/RugbyData.json');

/* Do some searches */
// $searchName and $limit are not defined on this slide; they are assumed to be set earlier in the script
$searchResult = $tries->search($searchName);
if (empty($searchResult)) {
    echo 'No matches found', PHP_EOL;
} else {
    $players = array_slice($searchResult, 0, $limit);
    foreach($players as $player) {
        echo $player->surname, ', ', $player->firstname, PHP_EOL;
    }
}

Page 14: Tries, conversions and field goals

Conversions – Tries in PHP

/usr/mark/presentations/tries php trieSearch.php Hall
Load Time: 0.0221 s
Current Memory: 4499.16 k
Peak Memory: 4569.01 k

Hall, Bill
Hall, Harry
Hall, James
Hall, Martin
Halliwell, Billy
Halliwell, C
Halliwell, Frank
Halliwell, Jimmy

Search Time: 0.0045 s
Current Memory: 4500.70 k
Peak Memory: 4569.01 k
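
The output above suggests that search() returns every entry whose key starts with the search term. The slides do not show that method, so the following is only a sketch of one way a prefix search could work: walk down to the node for the prefix, then collect every end node beneath it. The SketchNode class and the sketchAdd, sketchCollect and sketchSearch functions are hypothetical names, and the sorting step is an assumption made to give a predictable order.

```php
<?php
// Hypothetical node type with the same three properties as the TrieNode slide.
class SketchNode
{
    public $children = array();
    public $valueNode = false;
    public $value;
}

// Insert $key, creating nodes as needed, and store $value on the end node.
function sketchAdd(SketchNode $root, $key, $value)
{
    $node = $root;
    for ($i = 0, $length = strlen($key); $i < $length; ++$i) {
        $char = $key[$i];
        if (!isset($node->children[$char])) {
            $node->children[$char] = new SketchNode();
        }
        $node = $node->children[$char];
    }
    $node->valueNode = true;
    $node->value = $value;
}

// Depth-first walk: record $node (if it is an end node) and everything below it.
function sketchCollect(SketchNode $node, $keySoFar, array &$results)
{
    if ($node->valueNode) {
        $results[$keySoFar] = $node->value;
    }
    foreach ($node->children as $char => $child) {
        sketchCollect($child, $keySoFar . $char, $results);
    }
}

// Return all entries whose key begins with $prefix, as full key => value.
function sketchSearch(SketchNode $root, $prefix)
{
    $node = $root;
    for ($i = 0, $length = strlen($prefix); $i < $length; ++$i) {
        $char = $prefix[$i];
        if (!isset($node->children[$char])) {
            return array(); // no key starts with this prefix
        }
        $node = $node->children[$char];
    }
    $results = array();
    sketchCollect($node, $prefix, $results);
    ksort($results, SORT_STRING); // assumption: sort by key for a stable order
    return $results;
}

// A few names taken from the output above, inserted out of order.
$root = new SketchNode();
foreach (array('Halliwell, Billy', 'Baker, Mark', 'Hall, Harry', 'Hall, Bill') as $name) {
    sketchAdd($root, $name, $name);
}

foreach (sketchSearch($root, 'Hall') as $name) {
    echo $name, PHP_EOL;
}
// Prints:
// Hall, Bill
// Hall, Harry
// Halliwell, Billy
```

The cost of finding the starting node depends only on the length of the prefix, not on how many keys the trie holds; the collection step then visits just the subtree under that node.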

Page 15: Tries, conversions and field goals

Field Goals – Compressed Tries
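
A compressed trie is commonly described as a trie in which each chain of single-child nodes that are not end nodes is merged into one edge labelled with a substring. The sketch below illustrates that general idea on a nested-array trie. It is not the presentation's implementation: the sketchInsert, sketchCompress and sketchCountNodes helpers and the array layout are hypothetical.

```php
<?php
// Hypothetical helpers for illustration only.
// Each node is an array: 'end' marks a complete key, 'children' maps edge label => node.

// Build an ordinary (uncompressed) trie: one character per edge.
function sketchInsert(array &$node, string $key): void
{
    for ($i = 0, $length = strlen($key); $i < $length; ++$i) {
        $char = $key[$i];
        if (!isset($node['children'][$char])) {
            $node['children'][$char] = ['end' => false, 'children' => []];
        }
        $node = &$node['children'][$char]; // rebind the local reference one level down
    }
    $node['end'] = true;
}

// Return a copy of the trie in which single-child, non-end chains are merged into one edge.
function sketchCompress(array $node): array
{
    $compressed = ['end' => $node['end'], 'children' => []];
    foreach ($node['children'] as $label => $child) {
        $label = (string) $label;
        // Keep absorbing the next node while it is not a key and has exactly one child.
        while (!$child['end'] && count($child['children']) === 1) {
            $nextLabel = array_key_first($child['children']);
            $child = $child['children'][$nextLabel];
            $label .= $nextLabel;
        }
        $compressed['children'][$label] = sketchCompress($child);
    }
    return $compressed;
}

// Count this node and all nodes below it.
function sketchCountNodes(array $node): int
{
    $count = 1;
    foreach ($node['children'] as $child) {
        $count += sketchCountNodes($child);
    }
    return $count;
}

$root = ['end' => false, 'children' => []];
foreach (['Hall', 'Halliwell', 'Hull'] as $key) {
    sketchInsert($root, $key);
}
$compressed = sketchCompress($root);

// Edges below "H" are now "all" and "ull"; below "all" there is a single edge "iwell".
echo implode(', ', array_keys($compressed['children']['H']['children'])), PHP_EOL; // all, ull
echo sketchCountNodes($root), ' -> ', sketchCountNodes($compressed), PHP_EOL;      // 13 -> 5
```

The keys are unchanged; only the number of nodes needed to represent them drops, at the price of slightly more involved insert and search logic, since an edge may have to be split when a new key diverges in the middle of a label.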

Page 16: Tries, conversions and field goals

Field Goals – Patricia Tries

PATRICIA: Practical Algorithm to Retrieve Information Coded in Alphanumeric

Page 17: Tries, conversions and field goals

Field Goals

Questions?